Failed disk leads to Failed Redundancy

Clay England clay.england+dellpe at gmail.com
Mon Apr 7 20:51:02 CDT 2008


Greetings,

I had a disk fail over the weekend on a PERC 3/Di RAID 5 array.  I
replaced the disk with a spare and it started to rebuild; the last time
I checked it was at about 32%.  When I checked again some time later,
the state had changed to Failed Redundancy.

omreport storage vdisk controller=0
Virtual Disk 0 on Controller PERC 3/Di (Slot Embedded)

Controller PERC 3/Di (Slot Embedded)
ID           : 0
Status       : Non-Critical
Name         :
State        : Failed Redundancy
Progress     : Not Applicable
Layout       : RAID-5
Size         : 273.43 GB (293595512832 bytes)
Device Name  : /dev/sda
Read Policy  : Read Cache Enabled
Write Policy : Write Cache Enabled Protected
Cache Policy : Not Applicable
Stripe Size  : 32 KB
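For anyone who wants to watch for this state from a script rather than by
eye, here is a minimal sketch that parses the "Key : Value" layout shown
above.  The function names (parse_report, read_vdisk_report,
is_failed_redundancy) are hypothetical, and the sketch assumes a single
virtual disk on the controller and Python 3.7 or newer, which a RHEL 3
host would not ship by default; the parsing part works just as well on
output saved from the box and copied elsewhere.

```python
import subprocess

# Sample text copied from the omreport output quoted in this post.
SAMPLE_REPORT = """\
Controller PERC 3/Di (Slot Embedded)
ID           : 0
Status       : Non-Critical
Name         :
State        : Failed Redundancy
Progress     : Not Applicable
Layout       : RAID-5
"""


def parse_report(text):
    """Turn 'Key : Value' lines into a dict; lines without a colon are skipped."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as "0:4" stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def read_vdisk_report(controller=0):
    """Hypothetical helper: run the same omreport command used in this post.

    Assumes omreport is on PATH; it is not called anywhere in this sketch.
    """
    result = subprocess.run(
        ["omreport", "storage", "vdisk", "controller=%d" % controller],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def is_failed_redundancy(fields):
    """True when the State field matches the state reported in this post."""
    return fields.get("State") == "Failed Redundancy"


fields = parse_report(SAMPLE_REPORT)
print(fields["State"], "/", fields["Progress"])
```

On a live system, parse_report(read_vdisk_report(0)) would replace the
sample text, and the State and Progress fields could be logged at
intervals to see when a rebuild stops.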

So I got another spare.  The system saw the slot as offline when I
removed the disk, but when I inserted the second spare it never started
to rebuild; the virtual disk just stayed at Failed Redundancy.
I tried several things to get it out of this state:

omconfig storage adisk action=remove controller=0 adisk=0:4
Operation disabled. Read, action=remove
Try, again later..Refer to the documentation for more information.

omconfig storage vdisk action=checkconsistency controller=0 vdisk=0
Failure!

omconfig storage adisk action=offline controller=0 adisk=0:4
Operation not supported. Read, action=offline
Refer to the documentation for more information.

omconfig storage adisk action=rebuild controller=0 adisk=0:4
Operation disabled. Read, action=rebuild
Try, again later..Refer to the documentation for more information.

The only action I could get to work was blink, and the disk did blink.

I found in the docs that checkconsistency can clear the Failed
Redundancy state, but as shown above I can't get it to run.  The slot
where the disk failed looks OK with the spare in it:

ID                        : 0:4
Status                    : Ok
Name                      : Array Disk 0:4
State                     : Online
Progress                  : Not Applicable
Device Name               : Not Applicable
Capacity                  : 68.36 GB (73398812672 bytes)
Used RAID Disk Space      : 68.36 GB (73398812672 bytes)
Available RAID Disk Space : 0.00 GB (0 bytes)
Hot Spare                 : No
Product ID                : ST373307LC
Revision                  : DS09
Vendor ID                 : SEAGATE
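As a sanity check on the sizes reported above, the short calculation
below converts the byte counts to the GB figures omreport prints and
compares the virtual disk size with one array disk.  It assumes that
omreport's "GB" means 2^30 bytes and that all array members are the same
size as disk 0:4; the post does not say how many disks are in the array.

```python
GIB = 2 ** 30  # assumption: omreport's "GB" is 2**30 bytes

vdisk_bytes = 293595512832  # Size of Virtual Disk 0, from the report above
adisk_bytes = 73398812672   # Capacity of Array Disk 0:4, from the report above

vdisk_gib = vdisk_bytes / GIB
adisk_gib = adisk_bytes / GIB
data_disks = vdisk_bytes / adisk_bytes

print("virtual disk: %.2f GB" % vdisk_gib)
print("array disk:   %.2f GB" % adisk_gib)
print("virtual disk / array disk = %.4f" % data_disks)
```

The figures match the report (273.43 GB and 68.36 GB), and the ratio
comes out at about 4.  That is what a five-disk RAID-5 of equal members
would give, since one disk's worth of space goes to parity.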

I am just looking for ideas on how to proceed.  I hope that the two
spares I tried are not both faulty.  The array seems stuck in the failed
state, and I am not sure how to recover it, or even how to get it to
try to rebuild again.
The machine is running Red Hat Enterprise Linux ES release 3 (Taroon Update 9).

Thanks.
Clay


