PE2950 harddisk problem

Tino Schwarze linux-poweredge.lists at tisc.de
Thu Jan 4 03:37:14 CST 2007


Hi there,

We've got a PE2950 with a Fusion-MPT SAS controller. It has 2x73 GB
(sda, sdb) and 2x300 GB (sdc, sdd) SAS hard disks.

Yesterday, the kernel threw the following messages:
Jan  3 18:43:50 herkules kernel: sd 0:2:2:0: SCSI error: return code = 0x8000002
Jan  3 18:43:50 herkules kernel: sdc: Current: sense key: Medium Error
Jan  3 18:43:50 herkules kernel:     Additional sense: Unrecovered read error - recommend rewrite the data
Jan  3 18:43:50 herkules kernel: Info fld=0x4b204d7
Jan  3 18:43:50 herkules kernel: end_request: I/O error, dev sdc, sector 78775511
Jan  3 18:43:50 herkules kernel: EXT3-fs error (device dm-2): ext3_get_inode_loc: unable to read inode block - inode=9852990, block=19693763
Jan  3 18:43:50 herkules kernel: Aborting journal on device dm-2.
Jan  3 18:43:50 herkules kernel: ext3_abort called.
Jan  3 18:43:50 herkules kernel: EXT3-fs error (device dm-2): ext3_journal_start_sb: Detected aborted journal
Jan  3 18:43:50 herkules kernel: Remounting filesystem read-only

(dm-2 is an LVM volume on the 300 GB disks)
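As a quick cross-check of the first block of messages, here is a minimal
Python sketch that compares the hex "Info fld" value with the decimal sector
from the end_request line. The 512-byte sector size and the helper name
byte_offset are assumptions made for illustration; they do not come from the
log.

```python
# Values copied verbatim from the kernel messages above.
info_field = int("0x4b204d7", 16)   # "Info fld" from the sense data
failed_sector = 78775511            # sector from the end_request line

# Assumption: the disk uses 512-byte logical sectors.
SECTOR_SIZE = 512


def byte_offset(sector, sector_size=SECTOR_SIZE):
    """Return the byte offset of a sector from the start of the device."""
    return sector * sector_size


if __name__ == "__main__":
    # Do the sense data and the block layer report the same sector on sdc?
    print("same sector:", info_field == failed_sector)
    print("byte offset on sdc:", byte_offset(failed_sector))
```

If the two values agree, the sense data and the block layer are pointing at
the same spot on sdc, which fits a single unreadable sector rather than a
random transport glitch.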

Later, during an fsck run, the following messages appeared:
Jan  3 19:24:07 herkules kernel: mptscsih: ioc0: attempting task abort! (sc=ffff81000f9f1b80)
Jan  3 19:24:07 herkules kernel: sd 0:2:2:0:
Jan  3 19:24:07 herkules kernel:         command: Read (10): 28 00 04 b2 02 0f 00 00 80 00
Jan  3 19:24:07 herkules kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan  3 19:24:07 herkules kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff81000f9f1b80)
Jan  3 19:24:07 herkules kernel: mptscsih: ioc0: attempting task abort! (sc=ffff81010d5d2940)
Jan  3 19:24:07 herkules kernel: sd 0:2:2:0:
Jan  3 19:24:07 herkules kernel:         command: Write (10): 2a 00 04 b2 01 cf 00 00 08 00
Jan  3 19:24:07 herkules kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan  3 19:24:08 herkules kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff81010d5d2940)

And a lot more of these.
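To see which sectors the aborted commands were addressing, the command bytes
in these messages can be decoded by hand. The sketch below assumes the
standard 10-byte SCSI CDB layout (opcode, flags, 4-byte big-endian LBA, group
number, 2-byte big-endian transfer length, control byte); the function name
decode_cdb10 is made up for this example.

```python
import struct

# Opcodes for the two 10-byte commands that appear in the log.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_cdb10(hex_bytes):
    """Decode a 10-byte SCSI CDB given as space-separated hex bytes."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    if len(raw) != 10:
        raise ValueError("expected exactly 10 CDB bytes")
    # Big-endian: opcode, flags, LBA (32 bit), group, length (16 bit), control
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", raw)
    return {
        "command": OPCODES.get(opcode, "opcode 0x%02x" % opcode),
        "lba": lba,
        "blocks": blocks,
        "last_lba": lba + blocks - 1 if blocks else lba,
    }


if __name__ == "__main__":
    # CDBs copied from the mptscsih messages above.
    for cdb in ("28 00 04 b2 02 0f 00 00 80 00",
                "2a 00 04 b2 01 cf 00 00 08 00"):
        print(decode_cdb10(cdb))
```

Decoded this way, both aborted commands address blocks within a few hundred
sectors of the failed sector 78775511 reported earlier.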

To me, this looks like a hard disk error (the machine is only about 2
months old...). Can we trust SAS disks after all? They're expensive enough,
and this is a RAID0 that we use for database development.

How can I diagnose the controller? I've got OMSA installed, but the SAS
controller doesn't seem to be supported. At least, I couldn't find any
CLI for it.

Thanks,

Tino.

-- 
www.quantenfeuerwerk.de
www.spiritualdesign-chemnitz.de
www.lebensraum11.de


