Dell 2650 Perc 3/Di RAID failure, Server wouldn't reboot during rebuild filesystem dead
Ben Russo
Ben at muppethouse.com
Tue Apr 27 08:40:01 CDT 2004
Ben Russo wrote:
> A Story of my problem with DELL PERC 3/Di...
> It hung the OS when one disk of a RAID5 volume was
> falsely accused of being "FAILED"
> And then delayed rebooting for about 18 hours while
> it "REBUILT" (it wouldn't reboot during this time)
As a follow-up on this:
I have received a series of direct e-mails from a Dell engineer who is
trying to replicate this situation in his lab. He says that he may have
some clues that involve a variety of things happening at the same time,
such as the battery being refreshed (hence the write cache being disabled)
while the box is under very heavy load (or maybe a deep I/O queue?), a disk
subsystem failure happening on the box at the same moment, and then the
system possibly being reset before the bus timeouts that follow this
confluence of events have had a chance to expire.
Anyway, if he can reproduce a similar situation and then make some
progress, that would be great. It definitely shows a commitment to
quality by Dell's employees. That's nice. I have been BCC'ing my boss on
all the e-mails between me and this Dell engineer so that my boss can
see that the money we pay for Dell equipment is actually buying something.
I think the linux-dell site and mailing list are excellent the
way they are, and I suspect that if Dell were to make them official,
management would ruin this wonderful resource with red tape and
bureaucracy. Still, my boss and I have to wonder why, even though we paid
for a 3-year support contract for the server, we didn't get this type of
response when we called Dell support, but instead get it
when we communicate on a community-supported mailing list that
is frequented by Dell engineers.
It leads me to believe that Dell's engineering staff and maybe other
employees are great, but that the business management side is a little
backward. So, *why* do we pay Dell for support when their own company
doesn't deliver this type of response, but their internal engineers are
probably doing it (uncompensated) on their own time?
Also, a note to the Dell engineers who are reading this newsgroup: I
am absolutely in awe of the mechanical engineers who designed your Dell
PowerEdge systems and rack kits. They are amazing "toolless" designs
that should be given design awards for their elegance and functionality.
Please don't let this change. Give those guys responsible a big raise.
-Ben.