Server Down again : scsi : aborting command due to timeout

Carl_Niskern at Dell.com
Fri Dec 28 16:55:01 CST 2001


I would also recommend that you be onsite with your server when you call in
for troubleshooting assistance.
This saves time on both ends of the phone and reduces frustration levels all
around.

Not to mention you may find a loose cable, or something else simple, and fix
it right then ;)

> -----Original Message-----
> From: Philippe Gramoullé [mailto:philippe.gramoulle at mmania.com]
> Sent: Thursday, December 27, 2001 10:43 PM
> To: Michael E Brown
> Cc: linux-poweredge at exchange.dell.com
> Subject: Re: Server Down again : scsi : aborting command due to timeout
> 
> 
> 
> Hi Michael,
> 
> On Thu, 27 Dec 2001 20:35:09 -0600 (CST)
> Michael E Brown <michael_e_brown at dell.com> wrote:
> 
>   | You should call into Dell technical support for hardware assistance on
>   | this.
> 
> OK, this was planned; I'll do it tomorrow after a little bit of sleep :o)
> 
>   | I'm afraid that a lot of the Dell folks on this list that might
>   | normally be able to help you are probably on vacation from Christmas
>   | until New Year's.
> 
> Lucky ones :o)
> 
>   | From the symptoms that you have described, it sounds
>   | like either a bad drive or cable.
> 
> I suspect a bad SCSI cable rather than a bad drive (reason below).
> 
>   | Have you checked that there are no fault lights on your disk shelf?
> 
> The people in the colocation center (2000 km away from where I am) seem
> to be on vacation as well :-(
> 
>   | Does the PERC BIOS say that all of the drives are in order?
> 
> How should I check that?
> 
>   | How about SMART failures? I believe that the PERC BIOS has a facility
>   | to check the number of SMART errors on each SCSI drive.
> 
> I've checked each and every disk drive and found not a single error.
> 
> For each disk in the PowerVault, I have:
> 
> Predictive Failures : No Predictive Failures
> Device Errors : Media Errors 0 Other Errors 0
> 
> Everything I could check reported no errors (disks are OK, RAID arrays
> are OK, all disks are Online, filesystems have been reiserfsck'ed and
> are OK).
> 
> That's why I think it is a bad cable, or the SCSI cable was badly
> plugged into the PERC3/QC (as those cables are pretty long and heavy).
> 
> This could explain why the SCSI timeout appeared after 35 days of
> perfect work under high load.
> 
> Thanks,
> 
> Philippe
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq or search 
> the list archives at http://lists.us.dell.com/htdig/
> 


