problem restoring RAID5 container

Chris Pakkala cpakkala at salu.com
Mon Oct 13 11:37:01 CDT 2003


I inherited a RedHat 7.3 system that has 2 RAID 5 containers with 4
partitions each.  One of my drives (0:03:0) went bad, as was evident from the
"!" in the output of "container list".  I read both the afacli user's guide
and the reference guide, and neither one gave clear instructions on what to do
when a drive goes bad.  So I replaced the drive and tried a "container
restore RAID5" on both labels.  The prompt returned right away and nothing
had changed.  So I rebooted to enter the RAID configuration utility at
the prom level, where I saw that the machine had automatically started
restoring container "1" with the new drive.  I let the restore complete,
thinking that it would continue on and restore container "0" as well, but it
never did.  I rebooted again, hoping that it would initiate another restore
on container "0", and it never did.  I pressed ctrl-r to manually start the
restore, but was frightened off by the warning that I might lose data and
canceled the request.  Now the machine is fully booted and I see this:

Num          Total  Oth Chunk          Scsi   Partition                                      Creation        System
Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size   State   RO Lk Task    Done%  Ent Date   Time     Files
----- ------ ------ --- ------ ------- ------ ------------- ------- -- -- ------- ------ --- ------ -------- ------
 0    RAID-5 16.0GB       32KB Open    0:00:0 64.0KB:5.33GB UnProt                        0  012802 14:12:13
 /dev/sda             root             0:01:0 64.0KB:5.33GB UnProt                        1  012802 14:12:13
                                       0:02:0 64.0KB:5.33GB UnProt                        2  012802 14:12:13
                                         --- Missing ---

 1    RAID-5 85.6GB       32KB Open    0:00:0 5.33GB:28.5GB                               0  012802 14:12:35
 /dev/sdb             data             0:01:0 5.33GB:28.5GB                               1  012802 14:12:35
                                       0:02:0 5.33GB:28.5GB                               2  012802 14:12:35
                                       0:03:0 64.0KB:28.5GB                               3  012802 14:12:35

As you can see, things are messed up now because container "1" started the
new disk (0:03:0) at the beginning of the drive (offset 64.0KB), where the
partition for container "0" should start.  So my question is: how do I remove
the (0:03:0) disk from both containers and add it back (with the correct
offset:size values) without losing data?  If there is better documentation
available (on what to actually do, not just a list of commands and
switches), please let me know.  Any help would be greatly appreciated!
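
To make the mismatch concrete, here is a rough Python sketch of the check I
am doing by eye.  It is not an afacli feature: the helper names (to_bytes,
odd_offsets, overlaps) are made up for illustration, the Offset:Size values
are copied by hand from the "container list" output, and I am assuming that
KB and GB are binary units.

```python
from collections import Counter

# Assumption: the listing uses binary units (1 KB = 1024 bytes).
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3}


def to_bytes(text):
    """Convert a size string such as '64.0KB' or '5.33GB' to bytes."""
    number, unit = text[:-2], text[-2:]
    return round(float(number) * UNITS[unit])


# (container label, SCSI B:ID:L) -> "Offset:Size", copied from the listing.
container_1 = {
    (1, "0:00:0"): "5.33GB:28.5GB",
    (1, "0:01:0"): "5.33GB:28.5GB",
    (1, "0:02:0"): "5.33GB:28.5GB",
    (1, "0:03:0"): "64.0KB:28.5GB",
}


def odd_offsets(parts):
    """Return the partitions whose offset differs from the most common one."""
    offsets = {key: to_bytes(value.split(":")[0]) for key, value in parts.items()}
    expected, _count = Counter(offsets.values()).most_common(1)[0]
    return {key: off for key, off in offsets.items() if off != expected}


def overlaps(region_a, region_b):
    """True if two (offset, size) byte regions on the same disk intersect."""
    start_a, size_a = region_a
    start_b, size_b = region_b
    return start_a < start_b + size_b and start_b < start_a + size_a


# Where container "0" would need its partition on the new disk, going by
# the layout of the three surviving disks (64.0KB:5.33GB).
wanted_for_0 = (to_bytes("64.0KB"), to_bytes("5.33GB"))
# Where container "1" actually put its partition on the new disk.
actual_for_1 = (to_bytes("64.0KB"), to_bytes("28.5GB"))

print(odd_offsets(container_1))
print(overlaps(wanted_for_0, actual_for_1))
```

The first print singles out the (0:03:0) partition of container "1", and
the second shows that it sits on top of the space container "0" would need
on that disk.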

Thank you,
Chris




More information about the Linux-PowerEdge mailing list