Software RAID problem on 2650

Rory Campbell-Lange rory at campbell-lange.net
Tue Aug 12 11:23:01 CDT 2003


I am using 3 disks in a RAID5 array. md kicked one of the disks (sdd) out
of the array after reporting an error. However, the Dell low-level SCSI
disk verification tool found no error.

I am running raidhotadd /dev/md0 /dev/sdd to put the disk back, and it
seems to work. Is this advisable?

Thanks for any advice.
Rory
 
---------------------------------------------------------------
kern.log error report

.. kernel: SCSI disk error : host 0 channel 0 id 3 lun 0 return code = 8000002
.. kernel: Info fld=0xd760068, Current sd08:30: sense key Hardware Error
.. kernel: Additional sense indicates Mechanical positioning error
.. kernel:  I/O error: dev 08:30, sector 225837160
.. kernel: raid5: Disk failure on sdd, disabling device. Operation continuing on 2 devices
.. kernel: md: updating md0 RAID superblock on device
.. kernel: md: (skipping faulty sdd )
.. kernel: md: sdc [events: 00000019]<6>(write) sdc's sb offset: 143374656
.. kernel: md: recovery thread got woken up ...
.. kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
.. kernel: md: recovery thread finished ...
.. kernel: md: sdb [events: 00000019]<6>(write) sdb's sb offset: 143374656
.. kernel: SCSI disk error : host 0 channel 0 id 3 lun 0 return code = 8000002
.. kernel: Info fld=0xdaa0068, Current sd08:30: sense key Hardware Error
.. kernel: Additional sense indicates Mechanical positioning error
.. kernel:  I/O error: dev 08:30, sector 229245032
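
For anyone reading the report above: the "dev 08:30" pair is a hex major:minor
number, and the "Info fld" value is the failing sector in hex. The sketch below
decodes both. It assumes the classic Linux SCSI disk numbering (block major 8,
16 minors per disk); the function names are made up for illustration and are
not part of any md or SCSI tool.

```python
import re

# Assumption: classic Linux SCSI disk numbering, block major 8,
# 16 minors per disk (minor 0 = whole disk, 1-15 = partitions).
SD_MAJOR = 8
MINORS_PER_DISK = 16

def decode_dev(dev):
    """Turn a hex 'major:minor' pair such as '08:30' into a device name."""
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != SD_MAJOR:
        raise ValueError("not a SCSI disk on major 8: %s" % dev)
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + chr(ord("a") + disk)
    return name if part == 0 else name + str(part)

def decode_errors(log_text):
    """Pair each 'Info fld' value with the 'I/O error' line that follows it."""
    info = [int(h, 16) for h in re.findall(r"Info fld=0x([0-9a-fA-F]+)", log_text)]
    io = re.findall(r"I/O error: dev ([0-9a-fA-F]{2}:[0-9a-fA-F]{2}), sector (\d+)", log_text)
    return [(decode_dev(dev), int(sector), fld) for (dev, sector), fld in zip(io, info)]

LOG = """\
.. kernel: Info fld=0xd760068, Current sd08:30: sense key Hardware Error
.. kernel:  I/O error: dev 08:30, sector 225837160
.. kernel: Info fld=0xdaa0068, Current sd08:30: sense key Hardware Error
.. kernel:  I/O error: dev 08:30, sector 229245032
"""

for name, sector, fld in decode_errors(LOG):
    # Prints the device, the decimal sector, and whether the hex field agrees.
    print(name, sector, fld == sector)
```

Both errors decode to the whole disk sdd, and in both cases the hex "Info fld"
value equals the decimal sector on the "I/O error" line.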

---------------------------------------------------------------
fdisk p on /dev/sdc (one of the two working disks)

Disk /dev/sdc: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System

---------------------------------------------------------------
fdisk p on /dev/sdd (failed disk)

Disk /dev/sdd: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdd1            86     11480  91521251+  c7  Syrinx
Partition 1 does not end on cylinder boundary:
     phys=(542, 1, 6) should be (542, 254, 63)
/dev/sdd2   ?         1         1         0    0  Empty
Partition 2 does not end on cylinder boundary:
     phys=(0, 0, 0) should be (0, 254, 63)
/dev/sdd3        133761    251742 947683555+  c7  Syrinx
Partition 3 does not end on cylinder boundary:
     phys=(670, 1, 14) should be (670, 254, 63)
/dev/sdd4   ?         1         1         0    0  Empty
Partition 4 does not end on cylinder boundary:
     phys=(0, 0, 0) should be (0, 254, 63)
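
A quick sanity check on the fdisk figures above, as a sketch only. It
multiplies out the reported geometry and flags any listed partition whose
cylinders lie beyond the end of the disk. One possible reading (an assumption,
not something confirmed here) is that sdd is used by md as a whole disk and
fdisk is interpreting unrelated data as a partition table.

```python
# Geometry as reported by fdisk for /dev/sdd.
HEADS, SECTORS_PER_TRACK, CYLINDERS = 255, 63, 17849
SECTOR_BYTES = 512

sectors_per_cylinder = HEADS * SECTORS_PER_TRACK  # matches fdisk's "16065 * 512 bytes"
total_sectors = sectors_per_cylinder * CYLINDERS
total_bytes = total_sectors * SECTOR_BYTES

# (name, start cylinder, end cylinder) as listed for /dev/sdd.
partitions = [
    ("sdd1", 86, 11480),
    ("sdd2", 1, 1),
    ("sdd3", 133761, 251742),
    ("sdd4", 1, 1),
]

def out_of_range(parts, cylinders):
    """Return names of partitions that start or end past the last cylinder."""
    return [name for name, start, end in parts if start > cylinders or end > cylinders]

# Sectors from the kern.log I/O errors; both should lie on the disk.
failing_sectors = [225837160, 229245032]

print("total sectors:", total_sectors)
print("total bytes:", total_bytes)
print("partitions past end of disk:", out_of_range(partitions, CYLINDERS))
print("failing sectors on disk:", all(s < total_sectors for s in failing_sectors))
```

The sdd3 entry starts and ends far beyond cylinder 17849, so the table cannot
describe this disk. Both failing sectors fall within the disk's range.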

---------------------------------------------------------------
Reboot output from md:

.. kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
.. kernel:  [events: 0000001a]
.. kernel:  [events: 0000001a]
.. kernel:  [events: 00000018]
.. kernel: md: autorun ...
.. kernel: md: considering sdd ...
.. kernel: md:  adding sdd ...
.. kernel: md:  adding sdc ...
.. kernel: md:  adding sdb ...
.. kernel: md: created md0
.. kernel: md: bind<sdb,1>
.. kernel: md: bind<sdc,2>
.. kernel: md: bind<sdd,3>
.. kernel: md: running: <sdd><sdc><sdb>
.. kernel: md: sdd's event counter: 00000018
.. kernel: md: sdc's event counter: 0000001a
.. kernel: md: sdb's event counter: 0000001a
.. kernel: md: superblock update time inconsistency -- using the most recent one
.. kernel: md: freshest: sdc
.. kernel: md: kicking non-fresh sdd from array!
.. kernel: md: unbind<sdd,2>
.. kernel: md: export_rdev(sdd)
.. kernel: md0: removing former faulty sdd!
.. kernel: raid5: measuring checksumming speed
.. kernel:    8regs     :  2322.000 MB/sec
.. kernel:    32regs    :  1721.200 MB/sec
.. kernel:    pIII_sse  :  2603.600 MB/sec
.. kernel:    pII_mmx   :  2342.800 MB/sec
.. kernel:    p5_mmx    :  2318.400 MB/sec
.. kernel: raid5: using function: pIII_sse (2603.600 MB/sec)
.. kernel: md: raid5 personality registered as nr 4
.. kernel: md0: max total readahead window set to 496k
.. kernel: md0: 2 data-disks, max readahead per data-disk: 248k
.. kernel: raid5: device sdc operational as raid disk 1
.. kernel: raid5: device sdb operational as raid disk 0
.. kernel: raid5: md0, not all disks are operational -- trying to recover array
.. kernel: raid5: allocated 3287kB for md0
.. kernel: raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
.. kernel: RAID5 conf printout:
.. kernel:  --- rd:3 wd:2 fd:1
.. kernel:  disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdb
.. kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdc
.. kernel:  disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
.. kernel: RAID5 conf printout:
.. kernel:  --- rd:3 wd:2 fd:1
.. kernel:  disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdb
.. kernel:  disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdc
.. kernel:  disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
.. kernel: md: updating md0 RAID superblock on device
.. kernel: md: sdc [events: 0000001b]<6>(write) sdc's sb offset: 143374656
.. kernel: md: recovery thread got woken up ...
.. kernel: md0: no spare disk to reconstruct array! -- continuing in degraded mode
.. kernel: md: recovery thread finished ...
.. kernel: md: sdb [events: 0000001b]<6>(write) sdb's sb offset: 143374656
.. kernel: md: ... autorun DONE.
.. kernel: raid5: switching cache buffer size, 4096 --> 1024
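
The reboot output shows md comparing the superblock event counters and kicking
the member that is behind ("kicking non-fresh sdd"). The sketch below
reproduces only that comparison from the log text. It is a simplified
illustration, not the kernel's actual logic, and the function name is
hypothetical.

```python
import re

def stale_members(log_text):
    """Return (devices behind the freshest event counter, freshest counter)."""
    counters = {
        dev: int(value, 16)
        for dev, value in re.findall(r"md: (\w+)'s event counter: ([0-9a-fA-F]+)", log_text)
    }
    freshest = max(counters.values())
    stale = sorted(dev for dev, count in counters.items() if count < freshest)
    return stale, freshest

LOG = """\
.. kernel: md: sdd's event counter: 00000018
.. kernel: md: sdc's event counter: 0000001a
.. kernel: md: sdb's event counter: 0000001a
"""

stale, freshest = stale_members(LOG)
print("freshest counter:", hex(freshest))
print("stale members:", stale)
```

sdd stopped at 0x18 while sdb and sdc moved on to 0x1a, which is why md starts
the array with 2 of 3 devices after the reboot.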


-- 
Rory Campbell-Lange 
<rory at campbell-lange.net>
<www.campbell-lange.net>




More information about the Linux-PowerEdge mailing list