Dell 2650 Perc 3/Di RAID failure, Server wouldn't reboot during rebuild filesystem dead

Ben Russo Ben at muppethouse.com
Tue Apr 27 08:40:01 CDT 2004


Ben Russo wrote:
>     A Story of my problem with DELL PERC 3/Di...
>     It hung the OS when one disk of a RAID5 volume was
>         falsely accused of being "FAILED"
>     And then delayed rebooting for about 18 hours while
>         it "REBUILT"  (it wouldn't reboot during this time)

As a follow-up on this:

I have received a series of direct e-mails from a Dell engineer who is 
trying to replicate this situation in his lab.  He says he may have some 
clues.  They involve several things happening at the same time: the 
battery being refreshed (hence the write cache being disabled), the box 
being under very heavy load (or maybe the I/O queue?), a disk subsystem 
failure occurring at that same moment, and then possibly the system 
being reset before the timeouts on the bus had a chance to occur.

Anyway if he can reproduce a similar situation and then make some 
progress that would be great.  It definitely shows a commitment to 
quality by Dell's employees.  That's nice.  I have been BCC'ing my boss on 
all the e-mails between me and this Dell engineer so that my boss can 
see that the money we pay for Dell equipment is actually buying something.

I think the linux-dell site and mailing list are excellent the way they 
are, and I suspect that if Dell were to make them official, management 
would ruin this wonderful resource with red tape and bureaucracy.  
Still, my boss and I have to wonder why, even though we paid for a 
3-year support contract for the server, we didn't get this type of 
response when we called Dell support, but do get it when we communicate 
on a community-supported mailing list that is frequented by Dell 
engineers.

It leads me to believe that Dell's engineering staff and maybe other 
employees are great, but that the business management side is a little 
backward.  So, *why* do we pay Dell for support when their own company 
doesn't deliver this type of response, but their internal engineers are 
probably doing it (uncompensated) on their own time?

Also, a note to the Dell engineers who are reading this newsgroup:  I 
am absolutely in awe of the mechanical engineers who designed your Dell 
PowerEdge systems and rack kits.  They are amazing "toolless" designs 
that should be given design awards for their elegance and functionality.  
Please don't let this change.  Give those guys responsible a big raise.

-Ben.



