SCSI timeouts on PERC 3/DC
Michael.James at csiro.au
Mon Oct 11 20:17:01 CDT 2004
On Sun, 10 Oct 2004 12:10 pm, Jason Dixon wrote:
> My Dell 2550 became unresponsive to console or socket input a short
> time ago. Accessing the DRAC-II via web console, I saw the following
> scsi : aborting command due to timeout : pid 5939736, scsi2, channel
> id 0, lun 0 Write (10) 00 01 21 9a e1 00 00 0a 00
> ( ... more similar ... )
> SCSI host 2 reset (pid 5939739) timed out again -
> probably an unrecoverable SCSI bus or device hang.
> ( ... more aborted commands ... )
> At this point, I had no choice but to power cycle it.
I had similar messages in the logs of a PE2650
with a PERC 3/DC and a 7-disk RAID 5 array, running SuSE Pro 8.1.
Luckily the hardware was due for upgrading anyway
so I immediately copied everything across and put
the replacement machine and array into production.
I was glad I did because in the course of debugging it
under instruction from the Dell tech we lost the lot.
By way of an apology, Dell replaced the card, SCSI cable
and 2 disks, so I don't know just where the problem was.
That machine is back in production as spinning backup
and all arrays now have hot spares in them,
but I would love to know how to monitor
both the logical volume (degraded, disk swapped out, etc)
and the individual disk SMART info
without dropping down to BIOS.
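
Until there is a proper answer, one stopgap is to watch the kernel log
for the timeout messages quoted above. What follows is only a minimal
sketch in Python, assuming the messages land in a log file in the same
format as in Jason's post. The function name count_scsi_timeouts and the
sample lines are my own illustration, and this does not read the RAID
state or SMART data from the PERC itself.

```python
import re

# Patterns copied from the kernel messages quoted earlier in this thread.
# Assumption: other kernel versions may word these messages differently.
ABORT_RE = re.compile(
    r"scsi : aborting command due to timeout : pid (\d+), scsi(\d+)"
)
RESET_RE = re.compile(r"SCSI host (\d+) reset \(pid (\d+)\) timed out")


def count_scsi_timeouts(lines):
    """Count aborted commands and timed-out resets per SCSI host number.

    Returns a dict like {host: {"aborts": n, "resets": m}}.
    """
    stats = {}
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            host = int(match.group(2))  # the N in "scsiN"
            stats.setdefault(host, {"aborts": 0, "resets": 0})["aborts"] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            host = int(match.group(1))  # the N in "SCSI host N"
            stats.setdefault(host, {"aborts": 0, "resets": 0})["resets"] += 1
    return stats


if __name__ == "__main__":
    # Hypothetical sample input, taken from the lines quoted in this thread.
    sample = [
        "scsi : aborting command due to timeout : pid 5939736, scsi2, channel",
        "SCSI host 2 reset (pid 5939739) timed out again -",
    ]
    print(count_scsi_timeouts(sample))
```

Run from cron against the system log, a non-empty result could trigger
a mail to the admin. It only catches the symptom after the fact, which
is why I would still like something that talks to the controller.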
Michael James michael.james at csiro.au
System Administrator voice: 02 6246 5040
CSIRO Bioinformatics Facility fax: 02 6246 5166