PERC 4e/DC in 2850 - lost 1 disk, RAID5 array failed

eclark eclark at alabanza.com
Fri Jul 7 08:15:47 CDT 2006


If you read through the documentation for either afacli or dellmgr (for your 
card, that means dellmgr), you will find that you CAN rebuild your RAID5 
while the server is up. You do not reboot your box. You do not go into the 
SCSI BIOS. Simply write a script that interfaces with dellmgr to do the 
appropriate automated RAID5 checking, and when you are notified that a check 
has failed, log into the box, open up dellmgr, and rebuild the RAID5 per the 
documentation that comes with dellmgr. If you have your RAID5 set up 
properly, you should just have to watch for the related emails and do 
maintenance on the array as necessary.
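
A rough sketch of the kind of check script described above, in Python. It 
assumes you already have some command on the box that prints the array status 
as text. That command, the keyword list, and the names classify and check are 
illustrative placeholders, not taken from the dellmgr or afacli documentation, 
so adjust them to whatever your tool really prints.

```python
import subprocess

# Assumed trouble keywords; the wording your status tool uses may differ.
BAD_WORDS = ("degraded", "failed", "offline")


def classify(report: str) -> str:
    """Return 'alert' if the status text mentions a trouble keyword, else 'ok'."""
    text = report.lower()
    return "alert" if any(word in text for word in BAD_WORDS) else "ok"


def check(command: list) -> int:
    """Run the status command and return 1 if the array needs attention, else 0.

    `command` is a placeholder for whatever prints the array status on your
    system, given as a list of arguments.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    report = result.stdout + result.stderr
    if result.returncode != 0 or classify(report) == "alert":
        # Print only when something is wrong, so a cron job stays silent
        # while the array is healthy.
        print(report)
        return 1
    return 0


# Example call, with a made-up command name:
#   import sys; sys.exit(check(["/usr/local/sbin/raid-status"]))
```

Run from cron, this relies on cron mailing any output of a job to the crontab 
owner (or to MAILTO), so an email arrives only when the check trips. The 
rebuild itself is still done by hand in dellmgr.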

On Friday 07 July 2006 09:10 am, Fran Fabrizio wrote:
> But isn't the point of RAID5 that it'll keep running with the loss of a
> single disk?  Am I completely confused?  Yes, I know I need to rebuild
> the disk, but I should be able to do that and at the same time the
> system should still be able to serve requests while it is missing one disk.
>
> eclark wrote:
> > Use afacli or dellmgr. Both can do it. There is detailed documentation on
> > how to rebuild a RAID5 with these tools. For afacli, refer to
> >
> > http://linux.dell.com/files/aacraid/aacraid_monitoring_script.txt
> >
> > as this explains how to do automated testing of RAID5 integrity.  Check
> > Dell's website for the docs on afacli and dellmgr.
> >
> > On Friday 07 July 2006 08:50 am, Fran Fabrizio wrote:
> >> To clarify, I'm not saying I lost data, I'm saying I lost -access- to
> >> the data.  I had to rebuild the failed drive in order to regain access.
> > >> All drives in the array had to be online for me to access the data.
> >> Once a drive failed, I started losing the ability to read the data.
> >>
> >> Sander Steffann wrote:
> >>> Hi,
> >>>
> >>>> I eventually had to hard reboot the server, and upon reboot, the PERC
> >>>> complained that one disk had failed and that the array was in a
> >>>> degraded state.  Since it did not want to serve up the data, I'm now
> >>>> trying to rebuild that disk from the BIOS, but I thought all of this
> >>>> could be done online, while still serving data, and not having 12
> >>>> hours of downtime while the disk rebuilds!
> >>>>
> >>>> Am I not understanding this, or did this PERC completely fail?
> >>>
> >>> The PERC failed. A RAID5 array should never lose data when one drive
> >>> fails. (you will lose data when a second drive fails, but your data
> >>> should be ok with one drive failing)
> >>>
> >>> - Sander
> >>>
> >>> _______________________________________________
> >>> Linux-PowerEdge mailing list
> >>> Linux-PowerEdge at dell.com
> >>> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> >>> Please read the FAQ at http://lists.us.dell.com/faq


