PERC 4e/DC in 2850 - lost 1 disk, RAID5 array failed

Paul Dekkers Paul.Dekkers at surfnet.nl
Sat Jul 8 05:39:20 CDT 2006


Hi,

Fran Fabrizio wrote:
> In short, the RAID5 did not work as advertised.  My understanding is 
> that it should survive one disk failing and continue to serve data from 
> this degraded state, in fact, this is one of the major reasons I chose 
> RAID5.  Am I misunderstanding something here, or did my PERC 4e/DC 
> completely fail to do its job?
>   

Sadly, I faced the same problem a couple of months ago: one drive failed
in the RAID-5 set on the PERC 4/DC in a 2850 and all kinds of I/O errors
showed up. The file system got seriously corrupted and I had to rebuild
before everything was really usable again. The rebuild was done on the
hot spare, which was configured but apparently did not save my life
during the rebuild.

In my understanding a RAID-5 array keeps on going when 1 disk fails,
and is safe and redundant again as soon as the set is rebuilt.
Apparently this was not the case, and the strange thing was that the
Dell tech told me that this was not strange at all. I still don't
fully understand that story (I forgot most of it because it sounded
like rubbish), and I still don't believe that it is true.
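
To make that expectation concrete, here is a minimal Python sketch of
the XOR parity idea behind RAID-5. It only illustrates the principle
and says nothing about how the PERC firmware actually implements it;
the 4-disk layout, the block contents and the xor_blocks helper are
all made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe across a hypothetical 4-disk set: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing disk 1: rebuild its block from the survivors plus parity.
failed = 1
survivors = [blk for i, blk in enumerate(data) if i != failed]
rebuilt = xor_blocks(survivors + [parity])

print(rebuilt == data[failed])
```

As long as only one block per stripe is missing, the missing block is
the XOR of all the others, which is why a single failed disk should
leave the array degraded but still readable.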

I think that it was due to a bug in the firmware: just a couple of days
-before- the crash on my RAID5 set, there was an update for the PERC
4/DC. According to the download site this was a critical update that
fixed possible inconsistencies during drive failure. There! That's
probably exactly what happened.

I had to upgrade to 351X (off the top of my head) to get this fixed.
Are you on the latest firmware? That could be an explanation.

Paul


