Dell 6650 megaraid hang
Mikey Sklar
sklarm at electric-clothing.com
Thu Oct 7 17:16:00 CDT 2004
We've updated a few hundred of our Dell 6650s to the latest megaraid
firmware (along with BIOS, ESM, and RAC levels). Today we saw the first
repeat hang on a server running AS 2.1 with kernel 2.4.9-e.34enterprise.
While diagnosing the hang we gathered sysrq-t data, and I see that the
scsi_unjam_host call is present in the trace. This is one of the signatures
we have associated with megaraid hangs.
scsi_eh_0 D 00000001 5188 109 1 110 23 (L-TLB)
Call Trace: [<f884c6f0>] mega_internal_command [megaraid2] 0x140 (0xf5c4de98)
[<c011cea8>] printk [kernel] 0xd8 (0xf5c4dec4)
[<f8849431>] megaraid_reset [megaraid2] 0x61 (0xf5c4dedc)
[<c0114318>] smp_apic_timer_interrupt [kernel] 0xb8 (0xf5c4df0c)
[<f884dce0>] .LC63 [megaraid2] 0x0 (0xf5c4df10)
[<f88059ae>] scsi_try_bus_device_reset [scsi_mod] 0x3e (0xf5c4df5c)
[<f8806236>] scsi_unjam_host [scsi_mod] 0x426 (0xf5c4df70)
[<f8850940>] driver_template [megaraid2] 0x0 (0xf5c4df9c)
[<f88069b8>] scsi_error_handler [scsi_mod] 0x198 (0xf5c4dfa0)
[<c0107396>] ret_from_fork [kernel] 0x6 (0xf5c4dfbc)
[<c0105856>] arch_kernel_thread [kernel] 0x26 (0xf5c4dff0)
[<f8806820>] scsi_error_handler [scsi_mod] 0x0 (0xf5c4dff8)
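For anyone who wants to check their own sysrq-t dumps for the same signature, here is a minimal Python sketch. It assumes the 2.4-style layout shown in the trace above (a task header line followed by "[<addr>] symbol [module] offset" frames). The names parse_tasks and has_unjam_signature are made up for illustration and are not part of any existing tool.

```python
import re

# Frame lines look like: [<f8806236>] scsi_unjam_host [scsi_mod] 0x426 (0xf5c4df70)
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)\s+\[(\S+)\]")
# Task header lines look like: scsi_eh_0 D 00000001 5188 109 1 110 23 (L-TLB)
TASK_RE = re.compile(r"^(\S+)\s+([A-Z])\s+[0-9A-Fa-f]{8}\s")


def parse_tasks(text):
    """Split sysrq-t text into tasks, each with a name, state, and frame list."""
    tasks = []
    current = None
    for line in text.splitlines():
        header = TASK_RE.match(line.strip())
        if header:
            current = {"name": header.group(1), "state": header.group(2), "frames": []}
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Store (symbol, module) for each stack frame of the current task.
            current["frames"].append((frame.group(1), frame.group(2)))
    return tasks


def has_unjam_signature(task, driver="megaraid2"):
    """True if a D-state task has scsi_unjam_host plus a frame from the driver."""
    symbols = {sym for sym, _ in task["frames"]}
    modules = {mod for _, mod in task["frames"]}
    return task["state"] == "D" and "scsi_unjam_host" in symbols and driver in modules


if __name__ == "__main__":
    import sys

    # Usage: python check_unjam.py < sysrq-t.txt  (the script name is hypothetical)
    for task in parse_tasks(sys.stdin.read()):
        if has_unjam_signature(task):
            print(task["name"])
```

This only flags the pattern. It says nothing about root cause, and the header regex is a guess based on the single task line quoted here, so it may need adjusting for other kernels.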
Is there a known resolution to this hang?
Controller, disk and firmware revisions below.
$ bsm -A show -T container,controller,disk
--------------------------------------------------------------------------
Logical drive information
--------------------------------------------------------------------------
Logical drive number 0
Status of logical drive : optimal
Current operation : none
RAID level : 1
Stripe size : 64
Row size : 2
Logical drive number 1
Status of logical drive : optimal
Current operation : none
RAID level : 1
Stripe size : 64
Row size : 2
--------------------------------------------------------------------------
Controller information
--------------------------------------------------------------------------
Controller type : PERC 3/DC
Firmware version : 197O
Logical devices : 2
Channels : 2
Maximum logical devices : 40
Concurrent commands supported : 63
--------------------------------------------------------------------------
Physical drive information
--------------------------------------------------------------------------
Channel #0
Target on SCSI ID 0
SCSI ID : 0
State : online
Device ID : SEAGATE ST336753LC
Target on SCSI ID 1
SCSI ID : 1
State : online
Device ID : SEAGATE ST336753LC
$ bsm -A check -T firmware
Detected system manufacturer dell
Detected system model poweredge 6650
Checking BIOS version...
Supported BIOS version(s): A15 A16
Production BIOS version: A16
Detected BIOS version: A16
BIOS version check [ PASS ]
Checking ESM version...
Supported ESM version(s): 1.64 1.78
Production ESM version: 1.78
Detected ESM version: 1.78
ESM version check [ PASS ]
Checking RAC version...
Supported RAC version(s): 3.12 3.14
Production RAC version: 3.14
Detected RAC version: 3.14
RAC version check [ PASS ]
Checking RAID version...
Supported RAID version(s): 197O
Production RAID version: 197O
Detected RAID version: 197O
RAID version check [ PASS ]