the 'right' way to rebuild a container
Chuck Stuettgen
cstuettgen at myrealbox.com
Fri Oct 15 21:04:01 CDT 2004
On Fri, 2004-10-15 at 15:27, Glenn L. Wentworth wrote:
> I am sure this has probably been answered before but I don't remember
> seeing it so I'll just ask it again. Also the answer may end a
> discussion we are having internally about the problem.
>
> We have just installed some new 2650s. The systems have 4 drives set up
> in a RAID-5. The systems are in disparate locations so they are managed
> by different people.
>
> Two of the machines (1 in each location) lost a drive. At one site the
> admin got the new drive, popped the failed drive out, put the new drive
> in and walked away, letting the system run and rebuild at the same time.
> That system seems OK, continues to run and best of all was 'up' the
> whole time.
He was lucky. If the PERC controller still thought the drive was online,
it might not have recognized the new drive.
> The other group called Dell and followed the directions of a Dell tech.
> The process was: take the system down, replace the drive, come up into
> the ctrl-a raid manager and rebuild the container. Then bring the
> system back up.
The Dell tech was wrong. The only time you should replace a drive with
the system powered off is if the drive is hanging the SCSI bus.
See this Dell Knowledge Base article for the proper way to replace a
failed drive:
http://support.dell.com/support/topics/global.aspx/support/kb/en/document?DN=1070984
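For anyone wondering why a RAID-5 array can keep serving data while it
rebuilds: each stripe stores a parity block that is the XOR of its data
blocks, so the contents of one missing drive can be recomputed from the
survivors. Below is a minimal sketch in Python, assuming plain XOR parity
over a single stripe. The function names and sample bytes are made up for
illustration; this is not how the PERC firmware is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recompute the failed drive's block from the other blocks in the stripe."""
    return xor_blocks(surviving_blocks)


# One stripe on a 4-drive RAID-5: three data blocks plus one parity block.
# (Sample values are arbitrary.)
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(data)

# Pretend the drive holding data[1] has failed and been swapped out.
recovered = rebuild_missing([data[0], data[2], parity])
assert recovered == data[1]
```

The same calculation serves reads of the missing block while the array is
degraded, which is why the system can stay up during the rebuild.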
> Both methods seem to work. Except of course the second system was
> off-line for some 4 hours while the container was rebuilt.
>
> If there is not a downside to hot swapping a failed drive while the
> system is running why does Dell have the support techs tell customers to
> rebuild the raid array with the machine off-line? And other than being
> off-line for 4 hours are there other pros and cons to the two ways of
> fixing a raid array with a failed drive?
>
>
> glw