SCSI timeouts on PERC 3/DC
Jason Dixon
jason at dixongroup.net
Sun Oct 17 07:33:00 CDT 2004
On Oct 11, 2004, at 8:55 AM, Michael Weber wrote:
> Actually, what it might mean is you have a hard drive that is dying and
> the PERC card is not telling you about it.
>
> I had these symptoms and it cost my company 3 days of down-time because
> the RAID card, ever so helpfully, hid the fact that one hard drive was
> going away and coming right back on-line every few hours. Since the OS
> can only see the logical drives, it has no way of knowing which physical
> drive is having problems. It was only found after running the extensive
> diags on each drive in the pair for over 45 minutes. The quick diags
> found nothing. Of course, without the magic "error code" from the
> diags, Dell won't send you a new drive. This is one of the reasons I
> now have 6 bright shiny new IBM servers in my lab.
>
> I also have swatch set to alert me on every scsi timeout that is not
> tape drive related.
>
> If it were me, I would take this server down and run diags on the
> drives until you find which physical drive is failing. Expect an hour
> or more per drive of down time.
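The swatch rule mentioned in the quoted message (alert on every SCSI timeout that is not tape drive related) could be approximated with a small log filter. What follows is only a sketch in Python, not swatch configuration. The two regular expressions, the function names, and the sample log lines are assumptions made for illustration; real kernel messages differ by driver and kernel version, so the patterns would need adjusting against an actual syslog.

```python
import re

# Assumed patterns (hypothetical): match any line that mentions SCSI followed
# by some spelling of "timeout", and treat st0/nst0-style names or the word
# "tape" as tape-drive related.
SCSI_TIMEOUT = re.compile(r"scsi.*time(?:d)?[ -]?out", re.IGNORECASE)
TAPE_DEVICE = re.compile(r"\b(?:n?st\d+|tape)\b", re.IGNORECASE)


def is_disk_scsi_timeout(line):
    """Return True for a SCSI timeout line that does not mention a tape device."""
    return bool(SCSI_TIMEOUT.search(line)) and not TAPE_DEVICE.search(line)


def filter_alerts(lines):
    """Keep only the log lines that should raise an alert."""
    return [line.rstrip("\n") for line in lines if is_disk_scsi_timeout(line)]


# Hypothetical log lines, invented for this example only.
sample = [
    "Oct 11 08:00:01 colo kernel: scsi0: command timeout on channel 0\n",
    "Oct 11 08:05:00 colo kernel: st0: scsi timeout while rewinding\n",
    "Oct 11 08:06:00 colo sshd: session opened\n",
]
alerts = filter_alerts(sample)
```

With the sample lines above, only the first one is kept. The tape-related timeout and the unrelated sshd line are dropped.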
What is the recommended diag test for the physical drives? I don't see
anything in omdiag beyond the controller test. And that isn't
narrowing down the source of my failure:
[root at colo root]# omdiag storage raidctrl device=1 time=30 quicktest=false
........................................................................
........................................................................
........................................................................
........................................................................
........................................................................
........................................................................
........................................................................
........................................................................
...............
Device Name : Dell PERC 3/DC RAID Controller
Description : Dell PERC 3/DC RAID Controller Device
Location : PCI Bus 2, Device 0, Function 0
Additional Info : No additional information available
Test Name : LSI RAID Controller Hardware Test
Result : Failed
RunTime : 30
Thanks,
--
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net