Surviving H/W RAID troubles

Shaw, Marco Marco.Shaw at aliant.ca
Sun Jan 12 11:36:00 CST 2003


I've had about 5 drives fail over the last 3-4 years (I'd consider myself lucky with those stats) on systems with H/W RAID.  2 of the 5 times, I managed to choose the wrong option when asked to accept changes, or in similar dialogs at various points in the recovery process: I may have chosen the wrong option on reboot, or maybe when swapping drives into different bays just to see if the BIOS would then see the disks.

The result: the entire RAID volume/array would be _gone_.

Is there a fail-proof way/document to recover from H/W RAID problems?

The last technique seems to be (without hot-swap) more-or-less:
1. Shut the system down.
2. Pull the good drives out, but leave the one we think is bad.
3. Boot the system, and go into the BIOS to see what might be going on.  If a prompt comes up for "accepting changes", say "yes".
4. Find that the drive is bad, and bring the system back down.
5. Replace the bad drive.
6. Turn the system back on, and similar to #3, when asked to "accept the changes", say "yes".

The system will then rebuild the logical volume, assuming here, for example, that we are dealing with a RAID5 array.
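
For reference, the reason a RAID5 rebuild is possible at all is that each stripe carries an XOR parity block, so any one missing block can be recomputed from the others.  Below is a minimal Python sketch of that idea only.  The block contents, the 4-drive layout and the xor_blocks helper are made up for illustration; this is not how any particular controller or the PERC firmware implements it.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe on a 4-drive RAID5 set: three data blocks plus one parity block.
data = [b"disk0blk", b"disk1blk", b"disk2blk"]
parity = xor_blocks(data)

# Simulate losing drive 1, then rebuild its block from the surviving blocks and parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data[1]
```

This also shows why losing a second drive, or letting the controller discard the array configuration, is fatal: with two blocks of a stripe missing there is nothing left to XOR against.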

Any input would be appreciated.

I've been asked, in a recent Windows disk drive failure scenario, whether I had the PERC FAST! Utility loaded, which I didn't.  Does this make things easier on Windows & Linux for system recovery from drive failures?

Marco




More information about the Linux-PowerEdge mailing list