RAID-5 and database servers

Craig White craigwhite at azapple.com
Fri Mar 12 11:45:06 CST 2010


On Fri, 2010-03-12 at 15:57 +0000, Jefferson Ogata wrote:
> On 2010-03-12 15:39, Craig White wrote:
> > On Fri, 2010-03-12 at 07:06 +0000, Jefferson Ogata wrote:
> >> On 2010-03-12 04:26, Craig White wrote:
> >>> On Fri, 2010-03-12 at 02:23 +0000, Jefferson Ogata wrote:
> >>>> On 2010-03-11 22:23, Matthew Geier wrote:
> >>>>> I've had a disk fail in such a way on a SCSI array that all disks on
> >>>>> that SCSI bus became unavailable simultaneously. When half the disks
> >>>>> dropped off the array at the same time, it gave up and corrupted the RAID
> >>>>> 5 meta data so that even after removing the offending drive, the array
> >>>>> didn't recover.
> >>>> I also should point out (in case it isn't obvious), that that sort of
> >>>> failure would take out the typical RAID 10 as well.
> >>> ----
> >>> ignoring that a 2nd failed disk on RAID 5 is always fatal and only 50%
> >>> fatal on RAID 10, I suppose that would be true.
> >> The poster wrote that all of the disks on a bus failed, not just a
> >> second one. Depending on the RAID structure, this could take out a RAID
> >> 10 100% of the time.
> > ----
> > actually, this is what he wrote...
> > 
> > "When half the disks dropped off the array at the same time, it gave up
> > and corrupted the RAID 5 meta data so that even after removing the
> > offending drive, the array didn't recover."
> > 
> > Half != all
> 
> Read it again: "I've had a disk fail in such a way on a SCSI array that
> all disks on that SCSI bus became unavailable simultaneously."
----
of course I read that, but the very next sentence expounds: when half
the disks dropped out of the array at the same time, it corrupted the
RAID 5 metadata...

A loss of 2 devices in a RAID 5 array is always catastrophic.
----

> > I had a 5 disk RAID 5 array fail the wrong disk and thus had 2 drives go
> > offline and had a catastrophic failure and thus had to re-install and
> > recover from backup once (PERC 3/di & SCSI disks). Not something I wish
> > to do again.
> 
> PERC 5 and PERC 6 are worlds different from the PERC 3/di.
----
agreed
----
> 
> > I don't think I understand your 'odds' model. I interpret the first
> > example as RAID 50 having 5 times more likelihood of loss than RAID 10
> > and I presume that isn't what you were after
> 
> Yes, it is 5 times higher. But it is not 100%; it's actually less than
> 50%. And the probability for RAID 10 is not 50% as you said it was. I
> was just correcting your analysis. I'm still not sure what RAID
> structure you had in mind where a second failure on a
> RAID 10 has a 50% probability of loss.
----
Sorry I wasn't clear, but I thought you would figure it out.

Say you have a 4-disk RAID 10 array. If you lose 2 disks, your chances
are 50% that the RAID 10 array is unrecoverable: the array is lost if
both elements of one mirror pair fail. That's my understanding anyway.
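For what it's worth, the two-failure odds can be checked with a quick
enumeration. This is just a sketch: the `fatal_fraction` helper and the
4-disk layout as two mirrored pairs are my own hypothetical example, not
any particular controller's geometry, and it suggests the RAID 10 figure
is actually lower than 50%:

```python
from itertools import combinations

def fatal_fraction(n_disks, is_fatal):
    """Count how many 2-disk failure combinations destroy the array."""
    pairs = list(combinations(range(n_disks), 2))
    fatal = sum(1 for p in pairs if is_fatal(p))
    return fatal, len(pairs)

# RAID 5: single parity, so losing any second disk is always fatal.
raid5 = fatal_fraction(5, lambda pair: True)

# Hypothetical 4-disk RAID 10 laid out as two mirrored pairs,
# disks (0,1) and (2,3). Fatal only if both halves of one mirror die.
mirrors = [{0, 1}, {2, 3}]
raid10 = fatal_fraction(4, lambda pair: set(pair) in mirrors)

print("RAID 5:  %d of %d two-disk failures fatal" % raid5)
print("RAID 10: %d of %d two-disk failures fatal" % raid10)
```

On a 5-disk RAID 5 that's 10 of 10 combinations fatal; on the 4-disk
RAID 10 it's 2 of 6, i.e. about a 1-in-3 chance rather than 50%.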

I admit I am far from the most knowledgeable person on this topic, and I
sat on the sidelines for both of the discussions, but I felt that the
article from enterprisestorage needed to be linked because there clearly
are sufficient issues with the typical high-density, large SATA drives
and RAID 5. I have yet to see anything that would change my mind: the
only reason to use RAID 5 is to maximize storage per dollar, and that
may very well come with performance and reliability issues that should
not go unmentioned.

Craig





More information about the Linux-PowerEdge mailing list