Sporadic drive problems on PE1950

Randy Martin wolf at CLEMSON.EDU
Tue May 29 07:43:29 CDT 2007


I keep getting the following errors sporadically on different compute nodes,
and I end up having to power cycle the affected node to recover.  Any ideas
on how to fix this?  I have applied the latest BIOS update and the patches
FRMW_LX_R149666.BIN and FRMW_LX_R149730.BIN, but the problem still occurs
occasionally.

 

Thanks,
Randy

May 22 20:03:24 compute-2-14.local syslogd: /var/log/kern: Read-only file system
May 22 20:03:24 compute-2-14.local kernel: mptscsih: ioc0: attempting task abort! (sc=00000102011271c0)
May 22 20:03:24 compute-2-14.local kernel: scsi0 : destination target 0, lun 0
May 22 20:03:24 compute-2-14.local kernel:         command = Write (10) 00 00 91 19 2d 00 00 08 00
May 22 20:03:24 compute-2-14.local kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
May 22 20:03:24 compute-2-14.local kernel: mptscsih: ioc0: removing sata device, channel 0, id 0,  phy 0
May 22 20:03:24 compute-2-14.local kernel: mptscsih: ioc0: task abort: SUCCESS (sc=00000102011271c0)
May 22 20:03:24 compute-2-14.local kernel: mptscsih: ioc0: attempting bus reset! (sc=00000102011271c0)
May 22 20:03:24 compute-2-14.local kernel: scsi0 : destination target 0, lun 0
May 22 20:03:24 compute-2-14.local kernel:         command = Write (10) 00 00 91 19 2d 00 00 08 00
May 22 20:03:24 compute-2-14.local kernel: mptscsih: ioc0: bus reset: SUCCESS (sc=00000102011271c0)
May 22 20:03:24 compute-2-14.local kernel: mptscsih: ioc0: Attempting host reset! (sc=00000102011271c0)
May 22 20:03:24 compute-2-14.local kernel: mptbase: Initiating ioc0 recovery
May 22 20:03:24 compute-2-14.local kernel: scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
May 22 20:03:24 compute-2-14.local kernel: sd 0:0:0:0: Illegal state transition cancel->offline
May 22 20:03:24 compute-2-14.local kernel: Badness in scsi_device_set_state at drivers/scsi/scsi_lib.c:1700
May 22 20:03:24 compute-2-14.local kernel:
May 22 20:03:24 compute-2-14.local kernel: Call Trace:<ffffffffa000802e>{:scsi_mod:scsi_device_set_state+241}
May 22 20:03:24 compute-2-14.local kernel: <ffffffffa00063d5>{:scsi_mod:scsi_error_handler+2567}
May 22 20:03:24 compute-2-14.local kernel: <ffffffff80110e17>{child_rip+8} <ffffffffa00059ce>{:scsi_mod:scsi_error_handler+0}
May 22 20:03:24 compute-2-14.local kernel: <ffffffff80110e0f>{child_rip+0}
May 22 20:03:24 compute-2-14.local kernel: scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
May 22 20:03:24 compute-2-14.local kernel: sd 0:0:0:0: Illegal state transition cancel->offline
May 22 20:03:24 compute-2-14.local kernel: Badness in scsi_device_set_state at drivers/scsi/scsi_lib.c:1700
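
Since the errors show up on different nodes, it may help to count the recovery messages per host in a collected syslog. The following is only a rough sketch in Python, not a diagnostic tool: the SIGNATURES table and the summarize() helper are made-up names for illustration, the choice of which messages to count is an assumption based on the excerpt above, and it expects one log record per line (not the wrapped form an email client produces).

```python
import re
from collections import Counter, defaultdict

# Substrings copied from the log excerpt; which ones are worth
# counting is an assumption, adjust as needed.
SIGNATURES = {
    "task_abort": "attempting task abort",
    "bus_reset": "attempting bus reset",
    "host_reset": "attempting host reset",
    "offlined": "device offlined",
    "read_only": "read-only file system",
}

# Matches records such as:
#   May 22 20:03:24 compute-2-14.local kernel: <message>
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+([^:]+):\s*(.*)$")


def summarize(lines):
    """Return {host: Counter(event name -> occurrences)}."""
    counts = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip anything that is not a syslog record
        _timestamp, host, _tag, message = match.groups()
        lowered = message.lower()  # the driver mixes "attempting"/"Attempting"
        for name, needle in SIGNATURES.items():
            if needle in lowered:
                counts[host][name] += 1
    return counts
```

To use it, pass any iterable of log lines, for example an open file handle on whichever file the kernel messages are written to, and print the counters for each host.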
