SCSI timeouts on PERC 3/DC

Michael James Michael.James at csiro.au
Mon Oct 11 20:17:01 CDT 2004


On Sun, 10 Oct 2004 12:10 pm, Jason Dixon wrote:
> My Dell 2550 became unresponsive to console or socket input a short 
> time ago.  Accessing the DRAC-II via web console, I saw the following 
> errors:
> 
> scsi : aborting command due to timeout : pid 5939736, scsi2, channel 0,
> id 0, lun 0 Write (10) 00 01 21 9a e1 00 00 0a 00
> ( ... more similar ... )
> SCSI host 2 reset (pid 5939739) timed out again -
> probably an unrecoverable SCSI bus or device hang.
> ( ... more aborted commands ... )
> 
> At this point, I had no choice but to power cycle it.

I had similar messages in the logs of a PE2650
 with a PERC 3/DC and a 7-disk RAID 5 array, running SuSE Pro 8.1.
Luckily the hardware was due for upgrading anyway
 so I immediately copied everything across and put
 the replacement machine and array into production.
I was glad I did because in the course of debugging it
 under instruction from the Dell tech we lost the lot.
By way of an apology, Dell replaced the card, SCSI cable
 and 2 disks, so I don't know just where the problem was.

That machine is back in production as spinning backup
 and all arrays now have hot spares in them,
 but I would love to know how to monitor
 both the logical volume (degraded, disk swapped out, etc)
 and the individual disk SMART info
 without dropping down to BIOS.
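
A partial stopgap, as a rough sketch only: watching the kernel log
 for the timeout messages quoted above. This does not report
 logical volume state or SMART data. The message format is assumed
 from that single quoted sample, and the function name parse_timeout
 is made up for illustration. It also assumes the bytes printed after
 the opcode name are CDB bytes 1 to 9, which is what the sample suggests.

```python
import re

# Pattern for the kernel timeout message quoted in this thread.
# The format is an assumption based on that one sample line.
TIMEOUT_RE = re.compile(
    r"aborting command due to timeout\s*:\s*pid\s+(?P<pid>\d+),\s*"
    r"scsi(?P<host>\d+),\s*channel\s+(?P<channel>\d+),\s*"
    r"id\s+(?P<id>\d+),\s*lun\s+(?P<lun>\d+)\s+"
    r"(?P<op>.*?)\s+(?P<cdb>(?:[0-9a-f]{2}\s*)+)$"
)

# The quoted sample, rejoined onto one line.
SAMPLE = ("scsi : aborting command due to timeout : pid 5939736, scsi2, "
          "channel 0, id 0, lun 0 Write (10) 00 01 21 9a e1 00 00 0a 00")


def parse_timeout(line):
    """Return a dict describing one timeout message, or None if no match."""
    m = TIMEOUT_RE.search(line.strip())
    if m is None:
        return None
    info = {k: int(m.group(k)) for k in ("pid", "host", "channel", "id", "lun")}
    info["op"] = m.group("op")
    cdb = bytes.fromhex(m.group("cdb"))
    # For 10-byte READ/WRITE commands, assuming the opcode byte is not
    # printed: bytes 1-4 here are the big-endian LBA, bytes 6-7 the length.
    if info["op"].endswith("(10)") and len(cdb) == 9:
        info["lba"] = int.from_bytes(cdb[1:5], "big")
        info["blocks"] = int.from_bytes(cdb[6:8], "big")
    return info


if __name__ == "__main__":
    print(parse_timeout(SAMPLE))
```

Feeding each new syslog line through parse_timeout and mailing
 on any non-None result would at least flag the hang early.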

michaelj

-- 
Michael James                         michael.james at csiro.au
System Administrator                    voice:  02 6246 5040
CSIRO Bioinformatics Facility             fax:  02 6246 5166



