EXT3 errors on PE 1600SC

Karl Zander kzander at commpartners.com
Mon Jan 8 09:27:41 CST 2007


  
We have a PE 1600SC with a MegaRAID controller.

01:02.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID (rev 01)
01:04.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1030 PCI-X Fusion-MPT Dual Ultra320 SCSI (rev 07)


We are getting the error

EXT3-fs error (device sda6) in start_transaction: Journal has aborted

and the server crashes. A reboot brings it back up, but it 
goes down again within 24 hours or so.

Must I unmount the filesystem before running 
e2fsck -c /dev/sda6?
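
On the unmount question: the e2fsck man page warns that running it on a
mounted filesystem is generally not safe, so it may help to confirm the
mount state first. The following is only a minimal Python sketch;
is_mounted is a hypothetical helper, and it assumes the device appears
under the same name (here /dev/sda6) in /proc/mounts-style text, which
is not guaranteed (a root filesystem may be listed as /dev/root, for
example).

```python
def is_mounted(device, mounts_text):
    """Return True if `device` is the source field of any line in
    /proc/mounts-style text (fields: device, mount point, type, ...)."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def read_mounts(path="/proc/mounts"):
    """Read the kernel's mount table (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On the server itself, is_mounted("/dev/sda6", read_mounts()) returning
True would suggest unmounting the filesystem (or booting from rescue
media) before running e2fsck -c.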

The list archives show others have also had this problem. 
One suggestion was to use linttylog to read the RAID 
controller logs.  Here is that log.  I am not sure what 
it's showing me.


TTY History for HA(0) -Bus 0x01 Device 0x02
B @T0
(C) LSI Logic 2002 @T0
Megaraid Series 0 firmware version 3.28 @T0
Build date: Jun 30 2003 at 10:14:11 @T0
Board type: 1000/1960/1028/0520 @T0
DRAM_ALT sig invalid from previous boot @T0
FLUSH_ON_SYSTEM_RESET=0 @T0
WAIT FOR BIOS.... @T0
bus 1 dev 5 function 0 @T0
reg 0 value ffffff01 @T0
reg 1 value fbff0004 @T0
reg 2 value 0 @T0
reg 3 value fbfe0004 @T0
reg 4 value 0 @T0
reg 5 value 0 @T0
SCSI chip is on the secondary bus @T0
Found MPT LVD 30 at fbff0004 @T0
  isp 0 membaseaddr 8bff0000 iobaseaddr 9001ff00  @T0
BIOS UP! @T0
Can_flush=0 DRAM SIZE=64 MB @T0
* RST * @T0
Enabling data cache @T0
pciDebug = d0008a9c @T0
calling init_scsi @T0
DISK_CACHE_ADDR=d0d2bc00  @T0
MEM_END_ADDR=d3fffff0  @T0
Total LSI MPT Chips found 1  @T0
LSI_InitMPT : start_index 0 totalLSIMPTChips 1 @T0
         Verifying Image Signatures...VERIFIED @T0
         Verifying image check sum... VERIFIED @T0
BaseAddr 9001ff00 chip 0 @T0
Checking Diagnostic Register write access...    Enabled. @T0
reset adapter bit cleared @T0
Complete. @T0
Checking Diagnostic Register write access...    Enabled. @T0
The FW version being loaded is MPTFW-01.03.06.00-IT @T0
NextImageHeaderOffset 9784 @T0
ExtImage Size 818 @T0
Diag, Register disabled DIAGNOSTIC_REG 131 @T0
FW download complete... Expecting LSI FW to start excute and come to ready state
  @T0
  For this sys doorbell reg bit 28 should be set  @T0
  MISM CHN_STATE_MPT_GET_FW_FEAT chip 0  @T2
  Check IOC FACTS chip 0  @T2
  MISM CHN_STATE_MPT_OPERATIONAL chip 0  @T2
  MISM: Reply frame size 60 start addr d05389ec  @T2
ff reply free frames posted @T2
  MISM CHN_STATE_MPT_INIT_BUS_RST chip 0  @T2
MPT_Poll: chip 0 CHN_STATE_MPT_INIT_BUS_RST  @T2
cmdBufferAddr = fc538770, ioIdx = 0 @T5
cmdBufferAddr = fc538808, ioIdx = 1 @T5
CommandBufferPost Post: Request = d00201c0 @T5
DISM: Queued! @T5
  MPT_ProcessIo Reply Fr 2 EVENT_NOTIFICATION @T5
MPI_EVENT_EVENT_CHANGE @T5
DISM: CR8 SAFTE at chan 0 id 6 @T6
DISM_ProcessPprState: DomainVal done on all disks @T8
DISM: Complete!!! @T28
bbuDebugFlags = d000d77c @T28
battery init: battery backup circuit is not mounted @T28
TBBU: No TBBU h/w @T28
Veirfyin config struct at Addr e0001400  @T28
NVRAM checksum OK - reading configuration @T28
DISK_CACHE_ADDR=d0d2bc00  @T28
MEM_END_ADDR=d3fffff0  @T28
Memory End d3fffff0 @T28
Total Number of Cache Lines 810 @T28
L 5   SS 128   Size cb31000   N 810   Status 2   DT  251   BT 512 @T28
can_flush = 0 @T28
No Reconst:Checking drive info @T28
  @T28
REF drive found at ch 0 tgt 0  @T28
Attempting to perform drive roaming @T28
NOT Flushing Cache @T28
Battery Bad: Changing to WRTHRU @T28
BIOS CALL FOR DRV ROAMING : 55 @1/8 12:57:26
drive roaming not done @1/8 12:57:26
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 12:58:19
REC: MedErr on LD[1] BadLba=8659c0 @1/8 12:58:19
Retrying cmdId 2e @1/8 12:58:19
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 12:58:20
REC: MedErr on LD[1] BadLba=8659c0 @1/8 12:58:20
DIO with no cache command. returning FAILURE with CRB set to -3 @1/8 12:58:20
  @1/8 12:58:20
Retrying cmdId 2e @1/8 12:58:20
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 12:58:21
REC: MedErr on LD[1] BadLba=8659c0 @1/8 12:58:21
<0,1> scsiErr=f1 rwCmdInx=2e cmdType=2 @1/8 12:58:21
  Reassign d00a0840 94 40 @1/8 12:58:21
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 12:58:24
REC: MedErr on LD[1] BadLba=8669d6 @1/8 12:58:24
DIO with no cache command. returning FAILURE with CRB set to -3 @1/8 12:58:24
  @1/8 12:58:24
Retrying cmdId 29 @1/8 12:58:24
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 12:58:32
REC: MedErr on LD[1] BadLba=c216f6 @1/8 12:58:32
<0,1> scsiErr=f1 rwCmdInx=10 cmdType=2 @1/8 12:58:32
  Reassign d00988c0 170 76 @1/8 12:58:32
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 12:58:36
REC: MedErr on LD[1] BadLba=c217d8 @1/8 12:58:36
<0,1> scsiErr=f1 rwCmdInx=1d cmdType=2 @1/8 12:58:36
  Reassign d00a2940 179 58 @1/8 12:58:36
REC:log MedErr on pid[1] FcRty=0 ScsiRty=0 @1/8 13:12:18
REC: MedErr on LD[1] BadLba=86668f @1/8 13:12:18
<0,1> scsiErr=f1 rwCmdInx=34 cmdType=2 @1/8 13:12:18
  Reassign d0091f00 110 f @1/8 13:12:18
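
One way to summarize the media errors in the log above is to count the
BadLba values reported for each logical drive. This is a minimal Python
sketch under two assumptions: that the BadLba values are hexadecimal
(they contain digits such as c and d), and that every report follows
the "MedErr on LD[n] BadLba=..." pattern seen above. bad_lbas is a
hypothetical helper name, not part of linttylog.

```python
import re
from collections import Counter

# Matches lines such as: REC: MedErr on LD[1] BadLba=8659c0 @1/8 12:58:19
MEDERR = re.compile(r"MedErr on LD\[(\d+)\] BadLba=([0-9a-fA-F]+)")


def bad_lbas(log_text):
    """Count media-error reports per (logical drive, LBA).

    The LBA is parsed as hexadecimal, which is an assumption about
    the log format.
    """
    counts = Counter()
    for match in MEDERR.finditer(log_text):
        drive = int(match.group(1))
        lba = int(match.group(2), 16)
        counts[(drive, lba)] += 1
    return counts
```

Calling bad_lbas on the saved log text and printing counts.most_common()
would list which addresses are reported repeatedly and on which
logical drive.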

--Karl



More information about the Linux-PowerEdge mailing list