2650 Disk problems [Was: RAID 5 Errors, Disk Problems : What to do?]

Rory Campbell-Lange rory at campbell-lange.net
Wed Aug 27 05:37:01 CDT 2003


Thought I would rephrase and hope for an answer!

I have three 146GB disks in a software RAID 5 array. One disk has
been kicked out due to a "Hardware Error ... Mechanical positioning
error". I can re-add the disk after a reboot, but it drops out again
after a few weeks.
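
For reference, the device and sector in the kernel messages quoted
further down can be decoded with a few lines of Python. This is only a
rough sketch: it assumes the usual Linux sd numbering (major 8, 16
minors per disk) and 512-byte sectors, and the function names are my
own, not part of any tool.

```python
import string

SD_MAJOR = 8          # first SCSI disk major on Linux
MINORS_PER_DISK = 16  # minor 0 is the whole disk, 1-15 are partitions
SECTOR_BYTES = 512    # assumed sector size


def decode_dev(dev):
    """Turn a hex 'major:minor' pair such as '08:30' into a name like 'sdd'."""
    major, minor = (int(part, 16) for part in dev.split(":"))
    if major != SD_MAJOR:
        raise ValueError("not on SCSI disk major 8: %s" % dev)
    disk, part = divmod(minor, MINORS_PER_DISK)
    name = "sd" + string.ascii_lowercase[disk]
    return name + (str(part) if part else "")


def sector_offset_gib(sector):
    """Byte offset of a sector from the start of the disk, in GiB."""
    return sector * SECTOR_BYTES / 2**30


print(decode_dev("08:30"))                      # device named in the I/O error
print(int("0xc8e0040", 16))                     # "Info fld" as a decimal sector
print(round(sector_offset_gib(210632768), 1))   # how far into the disk
```

So "dev 08:30" is the whole of sdd, the "Info fld" value is the same
sector as in the I/O error line, and the failure is roughly 100 GiB
into the disk.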

Dell Support won't replace the disk because it passes the verify scan
in the Dell SCSI utility, which you can choose to run at boot.

Is there anything else I can put to Dell Support?
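
One thing I could do is count the errors in the kernel log so that I
have figures to show them. A minimal sketch, assuming the log lines
look like the ones quoted further down; the function name and the
sample text are only for illustration.

```python
import re
from collections import Counter

# Matches lines like:
#   SCSI disk error : host 0 channel 0 id 3 lun 0 return code = 8000002
ERROR_RE = re.compile(
    r"SCSI disk error : host (\d+) channel (\d+) id (\d+) lun (\d+)"
)
# Matches the sense key at the end of the following line.
SENSE_RE = re.compile(r"sense key (.+)$")


def tally_scsi_errors(log_text):
    """Count errors per (host, channel, id, lun) and per sense key."""
    targets = Counter()
    sense_keys = Counter()
    for line in log_text.splitlines():
        m = ERROR_RE.search(line)
        if m:
            targets[tuple(int(g) for g in m.groups())] += 1
        m = SENSE_RE.search(line)
        if m:
            sense_keys[m.group(1).strip()] += 1
    return targets, sense_keys


SAMPLE = """\
SCSI disk error : host 0 channel 0 id 3 lun 0 return code = 8000002
Info fld=0xc8e0040, Current sd08:30: sense key Hardware Error
Additional sense indicates Mechanical positioning error
 I/O error: dev 08:30, sector 210632768
"""

targets, sense_keys = tally_scsi_errors(SAMPLE)
print(targets)
print(sense_keys)
```

On the real machine the text would come from the kernel log file
instead of SAMPLE (the path depends on the syslog setup, so I have
left it out).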

Thanks for any help.
Rory

On 26/08/03, Rory Campbell-Lange (rory at campbell-lange.net) wrote:
> I am running a Dell PowerEdge 2650 Xeon 2.0GHz/512k x2 with 4 disks. It
> is running Linux 2.4.19-ac4 on Debian. Three are 146GB ULTRA 3 disks,
> SOFT-RAIDed at RAID 5 and use EXT3 (the fourth disk is a system disk).
> 
> Twice in the last two weeks md0 has kicked out sdd. The RAID system
> coped fine. 
> 
>     SCSI disk error : host 0 channel 0 id 3 lun 0 return code = 8000002
>     Info fld=0xc8e0040, Current sd08:30: sense key Hardware Error
>     Additional sense indicates Mechanical positioning error
>      I/O error: dev 08:30, sector 210632768
>     raid5: Disk failure on sdd, disabling device. Operation continuing on 2 devices
> 
> I was unable to hot-add the disk back in either time...
> 
> I rebooted the server and did a low level verification of the disk using
> the DELL SCSI utility on startup. No errors were found. After reboot I
> was able to re-add the failed disk.
> 
>     md: trying to hot-add sdd to md0 ... 
>     md: bind<sdd,3>
>     RAID5 conf printout:
>      --- rd:3 wd:2 fd:1
>      disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdb
>      disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdc
>      disk 2, s:0, o:0, n:2 rd:2 us:1 dev:[dev 00:00]
>     ...
>       --- rd:3 wd:3 fd:0
>       disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdb
>       disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sdc
>       disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdd
>      md: updating md0 RAID superblock on device
>      md: sdd [events: 00000037]<6>(write) sdd's sb offset: 143374656
>      md: sdc [events: 00000037]<6>(write) sdc's sb offset: 143374656
>      md: sdb [events: 00000037]<6>(write) sdb's sb offset: 143374656
>      md: recovery thread finished ...
> 
> What should I do? DELL technical support won't replace the disk unless
> it fails the SCSI verification. Is this possibly a SOFT-RAID problem?
> Should I add another disk to the array and keep the present sdd as an
> emergency disk?
> 
> Thoughts and advice much appreciated.
> Rory
> 
> -- 
> Rory Campbell-Lange 
> <rory at campbell-lange.net>
> <www.campbell-lange.net>

-- 
Rory Campbell-Lange 
<rory at campbell-lange.net>
<www.campbell-lange.net>



