PE2950 harddisk problem
Tino Schwarze
linux-poweredge.lists at tisc.de
Thu Jan 4 03:37:14 CST 2007
Hi there,
we've got a PE2950 with a Fusion-MPT SAS controller. It has 2x73 GB
(sda, sdb) and 2x300 GB (sdc, sdd) SAS hard disks.
Yesterday, the kernel threw the following messages:
Jan 3 18:43:50 herkules kernel: sd 0:2:2:0: SCSI error: return code = 0x8000002
Jan 3 18:43:50 herkules kernel: sdc: Current: sense key: Medium Error
Jan 3 18:43:50 herkules kernel: Additional sense: Unrecovered read error - recommend rewrite the data
Jan 3 18:43:50 herkules kernel: Info fld=0x4b204d7
Jan 3 18:43:50 herkules kernel: end_request: I/O error, dev sdc, sector 78775511
Jan 3 18:43:50 herkules kernel: EXT3-fs error (device dm-2): ext3_get_inode_loc: unable to read inode block - inode=9852990, block=19693763
Jan 3 18:43:50 herkules kernel: Aborting journal on device dm-2.
Jan 3 18:43:50 herkules kernel: ext3_abort called.
Jan 3 18:43:50 herkules kernel: EXT3-fs error (device dm-2): ext3_journal_start_sb: Detected aborted journal
Jan 3 18:43:50 herkules kernel: Remounting filesystem read-only
(dm-2 is an LVM volume on the 300 GB disks)
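As a cross-check on the messages above, the "Info fld" value in the sense data can be converted from hex and compared with the sector in the end_request line. The following is a minimal sketch; both numbers are copied from the log, and the 512-byte sector size used for the byte offset is an assumption, not something the log states.

```python
# Values copied verbatim from the kernel messages above.
info_fld = 0x4B204D7        # "Info fld=0x4b204d7"
reported_sector = 78775511  # "end_request: I/O error, dev sdc, sector 78775511"

# Assumption: the disk uses 512-byte logical sectors.
SECTOR_SIZE = 512

# The Info field, read as an integer, is the failing logical block address.
sector = int(info_fld)
matches = sector == reported_sector

# Byte offset of the failing sector on sdc, under the sector-size assumption.
byte_offset = sector * SECTOR_SIZE

print(f"Info fld as decimal: {sector}")
print(f"Matches end_request sector: {matches}")
print(f"Byte offset on sdc (assuming {SECTOR_SIZE}-byte sectors): {byte_offset}")
```

If the two values agree, the sense data and the block layer are reporting the same failing sector on sdc, which points at one specific location on the medium rather than at a general transport problem.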
Later, during the fsck run, the following messages appeared:
Jan 3 19:24:07 herkules kernel: mptscsih: ioc0: attempting task abort! (sc=ffff81000f9f1b80)
Jan 3 19:24:07 herkules kernel: sd 0:2:2:0:
Jan 3 19:24:07 herkules kernel: command: Read (10): 28 00 04 b2 02 0f 00 00 80 00
Jan 3 19:24:07 herkules kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 3 19:24:07 herkules kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff81000f9f1b80)
Jan 3 19:24:07 herkules kernel: mptscsih: ioc0: attempting task abort! (sc=ffff81010d5d2940)
Jan 3 19:24:07 herkules kernel: sd 0:2:2:0:
Jan 3 19:24:07 herkules kernel: command: Write (10): 2a 00 04 b2 01 cf 00 00 08 00
Jan 3 19:24:07 herkules kernel: mptbase: ioc0: LogInfo(0x31140000): Originator={PL}, Code={IO Executed}, SubCode(0x0000)
Jan 3 19:24:08 herkules kernel: mptscsih: ioc0: task abort: SUCCESS (sc=ffff81010d5d2940)
And a lot more of these.
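The "command:" lines above are raw 10-byte SCSI command descriptor blocks (CDBs). A minimal sketch for decoding them, assuming the standard READ(10)/WRITE(10) layout (opcode in byte 0, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8); the function name `decode_rw10` is made up for illustration:

```python
def decode_rw10(cdb_hex: str) -> dict:
    """Decode a READ(10) or WRITE(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)  # whitespace between byte pairs is ignored
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    names = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if cdb[0] not in names:
        raise ValueError(f"unsupported opcode 0x{cdb[0]:02x}")
    return {
        "op": names[cdb[0]],
        "lba": int.from_bytes(cdb[2:6], "big"),     # starting logical block
        "blocks": int.from_bytes(cdb[7:9], "big"),  # number of blocks
    }


# CDBs copied from the task-abort messages above.
read_cmd = decode_rw10("28 00 04 b2 02 0f 00 00 80 00")
write_cmd = decode_rw10("2a 00 04 b2 01 cf 00 00 08 00")

for cmd in (read_cmd, write_cmd):
    last = cmd["lba"] + cmd["blocks"] - 1
    print(f'{cmd["op"]}: LBA {cmd["lba"]}..{last} ({cmd["blocks"]} blocks)')
```

Both commands decode to block addresses a few hundred sectors below the sector 78775511 reported in the earlier medium error, so the aborted I/O appears to be hitting the same region of the disk.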
To me, this looks like a hard disk error (the machine is only about 2
months old...) - can we trust SAS disks after all? They're expensive
enough, and this is a RAID0 that we use for database development.
How can I diagnose the controller? I've got OMSA installed, but the SAS
controller doesn't seem to be supported. At least, I couldn't find any
CLI for it.
Thanks,
Tino.
--
www.quantenfeuerwerk.de
www.spiritualdesign-chemnitz.de
www.lebensraum11.de