Critical hardware error with Perc4e/DC controller on PowerEdge 850

Harald_Jensas at Dell.com
Fri Apr 21 06:39:30 CDT 2006


> -----Original Message-----
> From: linux-poweredge-bounces at dell.com 
> [mailto:linux-poweredge-bounces at dell.com] On Behalf Of Michael Stucki
> Sent: 21 April 2006 13:17
> To: linux-poweredge-Lists
> Subject: Re: Critical hardware error with Perc4e/DC 
> controller on PowerEdge 850
> 
> Thanks Harald,
> 
> > I would suggest having a look at the RAID controller log 
> using lintty, 
> > or a DOS bootable floppy to read it.
> 
> Just did so. I think it looks good but I admit I cannot argue 
> if there's anything wrong with it. Below is the tty.log 
> output. Maybe there is something obvious for you?
> 
> Regards, michael
> 
> === cut tty.log ===
> 
> TTY History for HA(0) -Bus 0x02 Device 0x0e
> 
> T0: LSI Logic MegaRAID firmware loaded
> T0: Firmware version 521X build on Jan 27 2006 at 12:08:29
> T0: Board is type 1000/0408/1028/0002
> 
> T0: Authenticating RAID key: Done!
> T0: EepromInit: Family=33, SN=eaacbd000000
> T0: Waiting for Expansion ROM to load...done
> T0: ATU located at bus=2 dev=e fun=0
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=0
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=1
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=2
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=3
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=4
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=5
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=6
> T0: Found Vendor=8086, Device=0330, SubVendor=0000, SubDevice=0000
> T0:       at bus=0, device=0, function=7
> T0: Found Vendor=1000, Device=0030, SubVendor=1000, SubDevice=1000
> T0:       at bus=0, device=8, function=0
> T0: Found Vendor=1000, Device=0030, SubVendor=1000, SubDevice=1000
> T0:       at bus=0, device=8, function=1
> T0: DRAM SIZE=128 MB
> 
> T0: Environment data:
> T0: VALIDATION=None
> T0: VERSIONS=BTBL_D.2.1.9,BIOS_H430,CTLM_U827,APP_521X
> T0: Flashable=1000_0408_1028_0002
> 
> T0: MFC data:
> T0:     Vendor/DeviceID=1000/0408
> T0:     SubVendor/SubDevice=1028/0002
> T0:     OEM=Dell, clusterDisable=0, flexRaidDisable=0
> T0:     rebuildRate=30, stripeSize=128, flushTime=4
> T0:     cachedIo=1, writeBack=1, readAhead=2
> T0:     channelBase='0', smartMode=6, alarmDisable=0
> T0:     fastInitDisable=0, coercion=128M, disablePredictiveFail=0
> T0:     disableWebBios=1, disableCtrlM=0, writeThroughWhenBatteryBad=1
> T0:     zcrConfig=Undefined, keepSafteFailedStatus=0, autoHotSpareRestore=0
> T0:     variableChkConRate=0, enableNvramDiskMgmtChange=0
> T0:     dirtyLedShowsDriveActivity=0, disableConsChkRestoration=0
> T0:     biosContinueOnError=0, biosAutoConfig=Prompt
> T0:     disableRandomDriveDeletion=0
> 
> T0: DmaInit(): COUNT_DMA_CHANNELS_USED_FOR_DISK_CACHE_DATA = 2
> T0: DISK_CACHE_ADDR=a095e800
> T0: MEM_END_ADDR=a7fefff0
> T0: Found MPT LVD 1030(0/8/1) at fe9e0004/0, mapped to 869e0000
> T0: Found MPT LVD 1030(0/8/0) at fe9c0004/0, mapped to 869c0000
> T0: Total LSI MPT Chips found 2
> T0: LSI_InitMPT : start_index 0 totalLSIMPTChips 2
> T0: Set to Get f/w features chip 0
> T0: 	Verifying Image Signatures...VERIFIED
> T0: 	Verifying image check sum... VERIFIED
> T0: The FW version being loaded is MPTFW-01.03.35.00-IT
> T0: NextImageHeaderOffset=9c70, ExtImageSize=818
> T0: FW download complete... Expecting LSI FW to start excute and come to ready state
> T4: MISM CHN_STATE_MPT_GET_FW_FEAT chip 1
> T4: PRESENT SCSI_ID = 0
> T4: Changing scsiId to  7
> T4: Check IOC FACTS chip 1
> T4: MISM CHN_STATE_MPT_OPERATIONAL chip 1
> T4: Channel 1 using HW termination
> T4: MISM: Reply frame size 60 start addr a0282c80
> T4: fe reply free frames posted
> T4: MISM CHN_STATE_MPT_INIT_BUS_RST chip 1
> T4: MPT_Poll: chip 0 CHN_STATE_MPT_WAIT
> T4: MISM CHN_STATE_MPT_GET_FW_FEAT chip 0
> T5: PRESENT SCSI_ID = 0
> T5: Changing scsiId to  7
> T5: Check IOC FACTS chip 0
> T5: MISM CHN_STATE_MPT_OPERATIONAL chip 0
> T5: Channel 0 using HW termination
> T5: MISM: Reply frame size 60 start addr a0288c40
> T5: fe reply free frames posted
> T5: MISM CHN_STATE_MPT_INIT_BUS_RST chip 0
> T5: MPT_Poll: chip 1 CHN_STATE_MPT_INIT_BUS_RST
> T7: MPT_Poll: chip 0 CHN_STATE_MPT_INIT_BUS_RST
> T9: DISM: Queued!
> T9: MPT_ProcessIo Reply Fr 2 EVENT_NOTIFICATION
> T9: MPI_EVENT_EVENT_CHANGE
> T9: MPT_SetIocPageParameters: After Write CoalescingDepth=1, Timeout=0, Flags=1
> T9: MPT_ProcessIo Reply Fr 2 EVENT_NOTIFICATION
> T9: MPI_EVENT_EVENT_CHANGE
> T9: MPT_SetIocPageParameters: After Write CoalescingDepth=1, Timeout=0, Flags=1
> T13: DISM_ProcessPprState: DomainVal done on all disks
> T13: DISM: Complete!!!
> 
> T13: Physical device info:
> T13: ID  NVRState  Vendor    Product           Rev    6   7  56
> T13: --  --------  --------  ----------------  ----  --  --  --
> T13: 00  Online    SEAGATE   ST3300007LW       D703  01  3e  0f
> T13: 01  Online    SEAGATE   ST3300007LW       D703  01  3e  0f
> 
> T13: IbbuInit: Current fast charge counter=0 cycles
> T13: IbbuInit: battery voltage good
> T13: IbbuInit: battery temperature good
> T13: TBBU: TBBU h/w . Cache not dirty
> T13: Verifying config struct at Addr 9f401400
> T13: NVRAM checksum OK - reading configuration
> T13: DISK_CACHE_ADDR=a095e800
> T13: MEM_END_ADDR=a7fefff0
> T13: Memory End a7fefff0
> T13: Total memory available for disk cache: 76913f0
> T13: Total Number of Cache Lines 1891
> T13: SS 128: mrs=2  lc=1891 ldc=1  ps=1 cm=ff ba=0 LDs: 0
> T13: LD  0: L=1  SS=128  Size=22ec0000  NL=1891  Status=2  DT=245  BT=512
> T13:        span 0: sBlk=00000000, nBlk=22ec0000, dev=00-01
> T13: can_flush = 0
> T13: No Reconst:Checking drive info
> T13: MIGRATE: 40LD or 8ld new drive  ch 0 tgt 0
> T13: REF drive found at ch 0 tgt 0
> T13: Attempting to perform drive roaming
> T13: POWER-ON CHECK-CON[ld=0]
> T13: Actual cons started for ld 0 from stripe 256ed5
> T13: Nvram Event Data ChkSum Error
> T13: NOT Flushing Cache
> T13: RMW: NVRAM structure valid - checking for active RMWs
> 04/21  9:42:40: Time established at T20
> 04/21  9:42:40: Rejecting MISC opcode: unknown sub-opcode (0x41)
> 04/21  9:42:40: BIOS CALL FOR DRV ROAMING : 0
> 04/21  9:42:40: drive roaming not done
> 04/21  9:43:51: Rejecting MISC opcode: unknown sub-opcode (0x26)
> 04/21  9:46:20: Rejecting MISC opcode: unknown sub-opcode (0x26)
> 04/21  9:46:20: Rejecting Unknown FC DCMD 1f
> 04/21  9:49:54: Rejecting MISC opcode: unknown sub-opcode (0x26)
> 04/21  9:49:54: Rejecting Unknown FC DCMD 1f
> 
> === cut tty.log ===

Hi,

I believe the TTY log is cleared after a hard reset.

Two suggestions:

1. You might be able to mount a network share, USB key or floppy drive, and then dump the controller log to it before doing a hard reset.

2. Turn the battery on for TTY logging.

With winttylog this is the following switch (I could not find the docs for the Linux tool right now, but I guess you will figure it out):
/TTY_HIST_BBU_ON - Turn battery on for TTY logging
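
Once the log has been dumped to a file, it can help to filter it for lines worth a second look. The following is only a minimal Python sketch: it assumes the dump is plain text in the same format as the log quoted above, and the keyword list ("error", "rejecting") is my own guess taken from that log, not an official list of firmware messages. The function name `suspect_lines` is made up for illustration.

```python
import re

# Assumed keywords, picked from the quoted tty.log; not an official list.
# The word boundaries keep setting names such as biosContinueOnError from matching.
SUSPECT = re.compile(r"\b(error|rejecting)\b", re.IGNORECASE)


def suspect_lines(text):
    """Return (line_number, line) pairs for log lines that match SUSPECT."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Drop leading "> " quote markers in case the log was pasted from an email.
        stripped = line.lstrip("> ").rstrip()
        if SUSPECT.search(stripped):
            hits.append((number, stripped))
    return hits
```

Calling `suspect_lines(open("tty.log").read())` returns the matching lines with their line numbers. A match only means the line contains one of the keywords; it does not by itself say that the controller is faulty.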


Switches associated with this utility

Usage : winttylog [/D] [/Ax] [/F filename] [/TTY_HIST_CLR] [/TTY_HIST_BBU_OFF]
        [/TTY_HIST_BBU_ON] [/TTY_HIST_BBU_TEMP_OFF]

/D - Displays the TTY history and then stores it in the log file.
/Ax - Selects the adapter on which the operation is performed. x can be from 0 to 12.
/I - Lets the user supply the number of pages (256 bytes each) and the starting offset to read.
/F - Sets the name of the log file.
/TTY_HIST_CLR - Clear TTY History buffer.
/TTY_HIST_BBU_OFF - Turn battery off for TTY logging
/TTY_HIST_BBU_ON - Turn battery on for TTY logging
/TTY_HIST_BBU_TEMP_OFF - Turn battery off temporarily, turn back on at next restart.
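
To make the switches above a little more concrete, here is a small Python sketch that only builds a winttylog argument list and does not run anything. It rests on several assumptions of mine: that the executable is called `winttylog` and is on the PATH, that /Ax is written as /A0, /A1 and so on, and that several switches may be combined in one call. None of that is confirmed by the usage text, and the helper name `winttylog_args` is made up.

```python
# Battery-related switches as listed in the usage text above.
BBU_SWITCHES = {
    "on": "/TTY_HIST_BBU_ON",
    "off": "/TTY_HIST_BBU_OFF",
    "temp_off": "/TTY_HIST_BBU_TEMP_OFF",
}


def winttylog_args(adapter=None, log_file=None, display=True, bbu=None):
    """Build an argument list for winttylog (hypothetical helper)."""
    args = ["winttylog"]  # assumed executable name
    if display:
        args.append("/D")
    if adapter is not None:
        # The usage text says x can be from 0 to 12.
        if not 0 <= adapter <= 12:
            raise ValueError("adapter must be between 0 and 12")
        args.append(f"/A{adapter}")  # assumed spelling of /Ax
    if log_file is not None:
        args += ["/F", log_file]
    if bbu is not None:
        args.append(BBU_SWITCHES[bbu])
    return args
```

For example, `winttylog_args(adapter=0, log_file="tty.log")` gives `['winttylog', '/D', '/A0', '/F', 'tty.log']`, which could then be passed to `subprocess.run` on a machine where the tool is installed.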


NOTE!
I recommend that you turn the battery off for TTY logging again when you are done troubleshooting this.



//
Harald Jensås


