2850 PERC 4e/Di drive errors

Matthew Lenz matthew at nocturnal.org
Tue May 11 11:55:25 CDT 2010


Jefferson Ogata wrote:
> On 2010-05-11 14:54, Matthew Lenz wrote:
>   
>> $ megactl -H
>> a0       PERC 4e/Di               chan:2 ldrv:1  batt:good
>> a0c0t0     279GiB  a0d0  online   errs: media:6  other:1
>>      write errors: corr:  0    delay:  0    rewrit:  0    tot/corr:  0    tot/uncorr:  0
>>       read errors: corr:  4Mi  delay: 58    reread:  0    tot/corr:  0    tot/uncorr:  0
>>     verify errors: corr:  0    delay:  0    revrfy:  6    tot/corr:  0    tot/uncorr:  6
>>     temperature: current:28C threshold:0C
>>
>> This is the only system with this showing up (of several x850 RAID
>> setups).  These systems are still under warranty; should I request a
>> replacement of this drive?  I really don't know how long this drive has
>> been erroring.
>>     
>
> Try running a long self-test on the drive (megactl -T long a0c0t0). If
> it fails that, it will be worth replacing.
>
> I'm assuming that the drive is part of a redundant RAID. Be aware that
> if the disk is failing, a long self-test may turn up enough problems for
> the PERC to knock the disk offline. If you don't have redundancy, back
> the system up first...
>   
Yeah, there is a hot spare in all of these RAID-5 systems.  I'll try
running it during off-hours to see.  My primary concern is that the
warranties on all of these machines expire in the next 4-5 weeks, and
we aren't planning on renewing since only the PE2850s are renewable
(Dell won't renew the PE1850 and PC5324).  We were just planning on
buying a couple of each used as spares.

If this drive is on its way out I'd like to get it replaced.  From what
I've seen, those 300GB 10K U320 drives aren't cheap, even refurbished.

I'll give the long self-test a try.
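
Since there are several of these boxes to check before the warranties
run out, here is a rough sketch of how the "megactl -H" output could be
scanned for drives worth a closer look.  It assumes the output always
looks like the sample quoted above (with the mail-wrapped lines
rejoined); I have not checked it against other megactl versions.  The
function names and the "any media or uncorrected error" rule are my own
choices for illustration, not anything megactl defines.

```python
import re

# Sample copied from the output quoted above, with mail-wrapped lines rejoined.
SAMPLE = """\
a0       PERC 4e/Di               chan:2 ldrv:1  batt:good
a0c0t0     279GiB  a0d0  online   errs: media:6  other:1
     write errors: corr:  0    delay:  0    rewrit:  0    tot/corr:  0    tot/uncorr:  0
      read errors: corr:  4Mi  delay: 58    reread:  0    tot/corr:  0    tot/uncorr:  0
    verify errors: corr:  0    delay:  0    revrfy:  6    tot/corr:  0    tot/uncorr:  6
    temperature: current:28C threshold:0C
"""

# Assumed line shapes, inferred only from the sample above.
DRIVE_RE = re.compile(r"^(a\d+c\d+t\d+)\s.*\berrs:\s*media:\s*(\d+)\s+other:\s*(\d+)")
ERROR_RE = re.compile(r"^\s*(write|read|verify) errors:.*\btot/uncorr:\s*(\d+)")


def parse_health(text):
    """Return one dict per physical drive found in the health output."""
    drives = []
    current = None
    for line in text.splitlines():
        m = DRIVE_RE.match(line)
        if m:
            current = {
                "drive": m.group(1),
                "media": int(m.group(2)),
                "other": int(m.group(3)),
                "uncorrected": {},
            }
            drives.append(current)
            continue
        m = ERROR_RE.match(line)
        if m and current is not None:
            # Record the tot/uncorr counter for write, read, or verify.
            current["uncorrected"][m.group(1)] = int(m.group(2))
    return drives


def suspect_drives(drives):
    """Hypothetical rule: flag any drive with media or uncorrected errors."""
    return [
        d["drive"]
        for d in drives
        if d["media"] > 0 or any(d["uncorrected"].values())
    ]


if __name__ == "__main__":
    for name in suspect_drives(parse_health(SAMPLE)):
        print(name, "has media or uncorrected errors; consider a long self-test")
```

In practice the text would come from running "megactl -H" on each
machine and feeding the captured output to parse_health() in place of
SAMPLE.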


