the 'right' way to rebuild a container

Chuck Stuettgen cstuettgen at
Fri Oct 15 21:04:01 CDT 2004

On Fri, 2004-10-15 at 15:27, Glenn L. Wentworth wrote:
> I am sure this has probably been answered before but I don't remember
> seeing it so I'll just ask it again.  Also the answer may end a
> discussion we are having internally about the problem.
> We have just installed some new 2650s. The systems have four drives set
> up in a RAID 5.  The systems are in disparate locations, so they are
> managed by different people.
> Two of the machines (one in each location) lost a drive.  At one site
> the admin got the new drive, popped the failed drive out, put the new
> drive in, and walked away, letting the system run and rebuild at the
> same time.  That system seems OK, continues to run, and best of all was
> 'up' the whole time.

He was lucky. If the PERC controller still thought the failed drive was
online, it might not have recognized the replacement drive.

> The other group called Dell and followed the directions of a Dell tech.
> The process was: take the system down, replace the drive, come up into
> the ctrl-a raid manager and rebuild the container.  Then bring the
> system back up.  

The Dell tech was wrong.  The only time you should replace a drive with
the system powered off is when the failed drive is hanging the SCSI bus.

See this Dell Knowledge Base Article for the proper way to replace a
failed drive.
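As an aside, the same "stay online while the array resyncs" behavior can be
watched directly on Linux software (md) RAID, which exposes rebuild progress
in /proc/mdstat. This is a minimal sketch, not the PERC firmware's own
mechanism; the sample mdstat text and its numbers are hypothetical, included
only so the parser is self-contained:

```python
# Sketch: check rebuild progress of a Linux md (software) RAID array
# while the system stays online. The array keeps serving I/O in
# degraded mode while the replacement drive resyncs.
import re

# Hypothetical /proc/mdstat snapshot for illustration only.
SAMPLE_MDSTAT = """\
Personalities : [raid5]
md0 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0]
      104857600 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  7.5% (7864320/104857600) finish=123.4min speed=13107K/sec
"""

def rebuild_status(mdstat_text):
    """Return (percent_done, minutes_left) for an in-progress rebuild,
    or None if no recovery line is present."""
    m = re.search(r"recovery\s*=\s*([\d.]+)%.*?finish=([\d.]+)min",
                  mdstat_text)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

if __name__ == "__main__":
    # On a live system you would read open("/proc/mdstat").read() instead.
    print(rebuild_status(SAMPLE_MDSTAT))  # (7.5, 123.4)
```

A clean array has no "recovery" line, so the function returns None when
the rebuild has finished.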

> Both methods seem to work.  Except of course the second system was
> off-line for some 4 hours while the container was rebuilt.
> If there is not a downside to hot swapping a failed drive while the
> system is running why does Dell have the support techs tell customers to
> rebuild the raid array with the machine off-line?  And other than being
> off-line for 4 hours are there other pros and cons to the two ways of
> fixing a raid array with a failed drive?
> glw

More information about the Linux-PowerEdge mailing list