Weird Filesystem Problem on PE2650/RH7.2

Philip Rowlands phr at doc.ic.ac.uk
Tue Mar 4 06:39:00 CST 2003


On Mon, 3 Mar 2003, Joe Stevens wrote:

>I had a strange problem occur a few days ago on one of our PE2650s that
>acts as a DB server.

>Feb 24 06:03:49 brickdb kernel: aacraid: Host adapter reset request. SCSI hang ?
>Feb 24 06:03:49 brickdb kernel: scsi: device set offline - command error recover failed: host 0 channel 0 id 1 lun 0
>Feb 24 06:03:49 brickdb kernel: SCSI disk error : host 0 channel 0 id 1 lun 0 return code = 6000000
>Feb 24 06:03:49 brickdb kernel:  I/O error: dev 08:11, sector 15876992
>Feb 24 06:03:49 brickdb kernel:  I/O error: dev 08:11, sector 15876992

Here, there's a problem writing (I assume) to the disk: the adapter reset
fails to recover the command, so the kernel sets the device offline. The
sectors mentioned are probably bad.
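
If it helps to pin down which disk "dev 08:11" is: as far as I know, 2.4
kernels print the device as hex major:minor, SCSI disks use major 8, and
each disk gets 16 minors. A rough sketch of the decoding follows. The
function name is my own and only major 8 is handled.

```python
# Sketch only: assumes the kernel prints block devices as hex "major:minor"
# and that major 8 is the first SCSI disk major with 16 minors per disk.
SCSI_DISK_MAJOR = 8
MINORS_PER_DISK = 16  # minor 0 is the whole disk, 1-15 are partitions


def decode_scsi_dev(dev: str) -> str:
    """Turn a string like '08:11' into a name like 'sdb1' (hypothetical helper)."""
    major_hex, minor_hex = dev.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major != SCSI_DISK_MAJOR:
        raise ValueError(f"not a device on SCSI disk major 8: {dev}")
    disk_index, partition = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk_index)
    return name if partition == 0 else f"{name}{partition}"


print(decode_scsi_dev("08:11"))  # sdb1
```

So the errors are on the first partition of the second SCSI disk, which
fits with "id 1" in the messages above.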

>Feb 24 06:03:49 brickdb kernel:  I/O error: dev 08:11, sector 103192
>Feb 24 06:03:54 brickdb kernel: vs-13050: reiserfs_update_sd: i/o failure occurred trying to update [6820 6855 0x0 SD] stat data I/O error: dev 08:11, sector 35216
>Feb 24 06:03:54 brickdb kernel: journal-601, buffer write failed

Here, reiserfs is complaining as a result of the above errors.
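
To check that every error points at the same device, something like the
following would collect the failing sectors per device from the syslog
lines. It is only a sketch: the regex is my guess at the message format
based on the lines you quoted, and the function name is made up.

```python
import re
from collections import defaultdict

# Assumed message format, taken from the quoted log lines.
IO_ERROR = re.compile(r"I/O error: dev ([0-9a-f]{2}:[0-9a-f]{2}), sector (\d+)")


def failing_sectors(lines):
    """Map each device to the sorted, de-duplicated sectors that failed."""
    sectors = defaultdict(set)
    for line in lines:
        # findall, because a line can carry more than the I/O error text
        for dev, sector in IO_ERROR.findall(line):
            sectors[dev].add(int(sector))
    return {dev: sorted(found) for dev, found in sectors.items()}


LOG = [
    "Feb 24 06:03:49 brickdb kernel:  I/O error: dev 08:11, sector 15876992",
    "Feb 24 06:03:49 brickdb kernel:  I/O error: dev 08:11, sector 15876992",
    "Feb 24 06:03:49 brickdb kernel:  I/O error: dev 08:11, sector 103192",
    "Feb 24 06:03:54 brickdb kernel: vs-13050: reiserfs_update_sd: i/o failure "
    "occurred trying to update [6820 6855 0x0 SD] stat data "
    "I/O error: dev 08:11, sector 35216",
]

print(failing_sectors(LOG))  # {'08:11': [35216, 103192, 15876992]}
```

Everything in your excerpt lands on dev 08:11.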

>Feb 24 06:03:56 brickdb kernel: ------------[ cut here ]------------
>Feb 24 06:03:56 brickdb kernel: kernel BUG at prints.c:334!
>Feb 24 06:03:56 brickdb kernel: invalid operand: 0000

This is probably a real bug, but as you can see from the trace below,
the kernel is already in the middle of reiserfs_panic, so it looks like a
consequence of the I/O errors rather than the cause.

>Feb 24 06:03:56 brickdb kernel: EIP is at reiserfs_panic [reiserfs] 0x29 (2.4.18-18.7.xsmp)

These all suggest a bad disk. Pull it, test it, replace it.


Cheers,

Phil



