Replacing failed disk with MegaCli

Romain GALLET rgallet at fullsix.co.uk
Mon May 7 18:33:15 CDT 2007


Hi all,

My company recently acquired a PE 2950 (PERC 5/i) with 6 x 750 GB disks,
which I configured in RAID-5 with one hotspare drive. The RAID array is
divided into two logical drives, sda and sdb, of 1 TB and 2 TB
respectively. I installed Slackware 11.0 on it and so far I have had no
issues at all, since the machine is brand new.

I have been googling around to figure out the best way to monitor and
manage the array, and I came to the conclusion that MegaCli seems to be
the best solution for the job in terms of ease of installation and
minimal dependency hell.
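
For the monitoring side, this is roughly what I have in mind: a minimal
sketch that filters the output of "MegaCli -PDList -aALL" and prints
every drive whose state is anything other than Online or Hotspare. It
assumes the listing contains "Enclosure Device ID:", "Slot Number:" and
"Firmware state:" lines for each drive (the exact labels may differ
between MegaCli versions), and unhealthy_drives is just a name I made
up for the helper.

```shell
#!/bin/sh
# Sketch: report drives that are neither Online nor Hotspare.
# Reads "MegaCli -PDList -aALL" style output on stdin.
# Assumption: each drive is described by "Enclosure Device ID:",
# "Slot Number:" and "Firmware state:" lines, in that order.
unhealthy_drives() {
    awk -F': *' '
        /^Enclosure Device ID/ { encl = $2 }   # remember current enclosure
        /^Slot Number/         { slot = $2 }   # remember current slot
        /^Firmware state/ {
            # Print "enclosure:slot state" for anything not healthy
            if ($2 !~ /^(Online|Hotspare)/)
                print encl ":" slot " " $2
        }
    '
}
```

It would be used as "MegaCli -PDList -aALL | unhealthy_drives", for
instance from cron, mailing the output whenever it is not empty.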

However, even after reading the MegaCli manual, I'm still puzzled about
which steps would be needed, and in which order, if a hard drive were to
fail. If I understand the concept of RAID correctly, should a drive fail
and be detected as such by the RAID controller, its state should show up
as "failed" or "offline". From there, this is what I should do to get
the array back to a normal state:

1. Set the drive to "offline" if it is not already marked as such:
   MegaCli -PDOffline -PhysDrv [8:1] -a0
   (The enclosure is 8 on my system; slot 1 is just an example.)
2. Mark the drive as missing:
   MegaCli -PDMarkMissing -PhysDrv [8:1] -a0
3. Add the hotspare to the two arrays (logical drives):
   MegaCli -PDHSP -Set -Dedicated -Array0 -EnclAffinity
   MegaCli -PDHSP -Set -Dedicated -Array1 -EnclAffinity
4. Prepare the physical drive for removal:
   MegaCli -PdPrpRmv -PhysDrv [8:1] -a0
5. Replace the drive (do I need to power off?). This should also start
   an automatic rebuild of the drive:
   MegaCli -PdReplaceMissing -PhysDrv [8:1] -array0
6. Hope it worked! ;-)
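
To avoid typing these by hand under pressure, the steps that come before
the physical swap (1, 2 and 4) could be wrapped in a small dry-run
helper that only prints the commands for a given drive. This is a sketch
under assumptions: the drive is written as [enclosure:slot] and the
adapter as -aN, which is how I read the manual's synopsis, and
pre_removal_cmds is my own name for the helper, not part of MegaCli.
The hotspare and replace-missing steps are left out on purpose, since I
am unsure about them.

```shell
#!/bin/sh
# Dry run only: prints the commands for steps 1, 2 and 4, runs nothing.
# Arguments: enclosure, slot, adapter number (adapter defaults to 0).
# Assumption: drives are addressed as [enclosure:slot], adapter as -aN.
pre_removal_cmds() {
    encl=$1; slot=$2; adapter=${3:-0}
    # Offline the drive, mark it missing, then prepare it for removal
    for op in -PDOffline -PDMarkMissing -PdPrpRmv; do
        printf 'MegaCli %s -PhysDrv [%s:%s] -a%s\n' \
            "$op" "$encl" "$slot" "$adapter"
    done
}
```

Calling "pre_removal_cmds 8 1" prints the three commands for the drive
in enclosure 8, slot 1, so they can be reviewed before being run.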

Now the questions:

1. Is the hotspare drive automatically "loaded" into the array, i.e. is
step 3 necessary?
2. Can all of this (the replacement operation) happen while the machine
is powered on?
3. Once the new drive is in, and supposing the hotspare is now part of
the array, will the new drive become the hotspare? How do I configure it
as a hotspare (I could not find that in the MegaCli manual)?
4. If, say, the hotspare is not added to the array automatically or
manually, but I have replaced the faulty drive, is there any chance the
array detects the new drive on its own and starts rebuilding with no
questions asked?
5. The CLI manual says that some of the operations apply to
(un)configured drives. What does the state of such an (un)configured
drive mean?
6. The machine is not in a critical environment and can be rebooted if
need be. Can I do all these operations from the RAID controller's BIOS
menu?

I have been reading through the past messages on this mailing list and
could not find the answers to these questions. However, should they have
already been answered, could you please be so kind as to post some
pointers.

Many thanks for your answers. Regards,

Romain

-- 
Logic is a systematic method of coming to the wrong conclusion with 
confidence. -- Murphy


