Manually reconstructing a RAID5 array from a PERC3/Di (Adaptec)

J. Epperson Dell at epperson.homelinux.net
Wed May 12 08:59:26 CDT 2010


Clap-clap-clap.  Not that I'd attempt anything like this short of a
national security issue or forensics for a particularly heinous crime....

On Wed, May 12, 2010 08:56, Tim Small wrote:
> On 06/05/10 08:47, Support @ Technologist.si wrote:
>> Hi Tim,
>> You gave yourself a hell of a job..
>> Below are some links.. the last 2 links are Linux ways to go..
>>
>> http://forum.synology.com/enu/viewtopic.php?f=9&t=10346
>> http://www.diskinternals.com/raid-recovery/
>> http://www.chiark.greenend.org.uk/~peterb/linux/raidextract/
>> http://www.intelligentedu.com/how_to_recover_from_a_broken_raid5.html
>>
>
> Ta to those who sent along some tips...
>
> In the end, I did manage to persuade the controller to put the array
> back together (succeeded on the second attempt, after restoring the
> drive metadata from the backups I'd taken).  Part of the reason that I
> didn't try this originally is that I didn't have access to any spare
> SCSI/SCA drives, or the original RAID controller either!
>
> Once I had access to the original block device, I created a COW snapshot
> in order to run fsck.ext3 on the filesystem without actually triggering
> any writes to the array (I think a write caused by replaying the journal
> killed the array the first time around).
>
> Here are some handy instructions on using dmsetup to do this:
>
> http://www.thelinuxsociety.org.uk/content/device-mapper-copy-on-write-filesystems
>
> ... which would also be handy in the case of any other file-system
> corruption, and is a lot faster than copying around image files!
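>
> For reference, a rough sketch of that trick (the names here are only
> placeholders - /dev/sdX1 stands for whatever block device holds the
> filesystem, and /dev/loop9 is just a spare loop device):
>
> # sparse 2GB file to receive the copied-on-write blocks
> dd if=/dev/zero of=/tmp/cow.img bs=1M count=0 seek=2048
> losetup /dev/loop9 /tmp/cow.img
>
> # dm snapshot target: <origin> <COW device> <persistent: N> <chunk size in sectors>
> dmsetup create fsck-snap --table \
>     "0 $(blockdev --getsz /dev/sdX1) snapshot /dev/sdX1 /dev/loop9 N 8"
>
> # any writes (e.g. the ext3 journal replay) now land in the COW file,
> # not on the array itself
> fsck.ext3 /dev/mapper/fsck-snap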
>
> Before that I tried the following method using Linux software RAID to
> reconstruct the array (which nearly worked):
>
> . Take images of the 5 drives
> . Work out how big the metadata is (assuming it's at the beginning of
> the drives):
>
> for i in {0..1024} ; do dd if=/mnt/tmp/raid_0 bs=512 skip=$i count=64 2>/dev/null | file - ; done
>
> ... etc. for all 5 drive images.
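>
> The same scan over all five images in one go, printing only the sectors
> where file(1) recognises something (raid_0..raid_4 being the image files):
>
> for img in /mnt/tmp/raid_{0..4} ; do
>     for i in {0..1024} ; do
>         sig=$(dd if=$img bs=512 skip=$i count=64 2>/dev/null | file -)
>         case "$sig" in
>             *"boot sector"*|*"filesystem"*) echo "$img @ sector $i: $sig" ;;
>         esac
>     done
> done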
>
> . Create read-only loop-back devices from the drives using:
>
> losetup -r -o 65536 /dev/loop0 /mnt/tmp/raid_0
>
> ... having found a valid MBR 64k into one of the drives - so assuming
> that the Adaptec aacraid controller metadata occupies the first 64k of
> each disk.  The loop device skips over that first 64k using the offset
> argument above.
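>
> A quick way to sanity-check the offset is to ask file(1) and sfdisk
> whether they can see a partition table on the loop device:
>
> file -s /dev/loop0      # should report "x86 boot sector" (or similar)
> sfdisk -l /dev/loop0    # should print the partition table found 64k in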
>
> . Create a set of 5 empty files (to hold the Linux md metadata) using
> dd, and set these up as loopX as well.
> . Create a set of linear append arrays (built without metadata) using:
>
> ./mdadm --build /dev/md0 --force -l linear -n 2 /dev/loop0 /dev/loop10
>
> etc. - the idea being that a to-be-created-later md RAID5 device will
> put its (version 0.9) metadata into the (read/write) files which make up
> the end of these linear append arrays.  It would be handy if you could
> create software RAID5s without metadata, but you can't - and they
> wouldn't be much practical use except for this sort of data-recovery
> purpose, I suppose....
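>
> Putting those two steps together looks roughly like this (the md_meta_N
> file names are only illustrative, and you may need more than the default
> 8 loop devices, e.g. by raising the loop module's max_loop parameter):
>
> for i in 0 1 2 3 4 ; do
>     # ~1MB writable tail file, to receive the v0.9 superblock later on
>     dd if=/dev/zero of=/mnt/tmp/md_meta_$i bs=1M count=1
>     losetup /dev/loop1$i /mnt/tmp/md_meta_$i
>     # read-only view of the image, skipping the 64k of controller metadata
>     losetup -r -o 65536 /dev/loop$i /mnt/tmp/raid_$i
>     # linear append: read-only image data followed by the writable tail
>     ./mdadm --build /dev/md$i --force -l linear -n 2 /dev/loop$i /dev/loop1$i
> done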
>
> . Create a set of degraded md RAID5s using commands like:
>
> ./mdadm --create /dev/md5 -e 0.9 --assume-clean -l 5 -n 5 \
>     /dev/md0 /dev/md1 /dev/md2 /dev/md3 missing
>
> ... for all possible permutations of 4 out of the 5 drives plus one
> missing (the script actually tried the all-5-drives-running layouts as
> well, but I disregarded those to be on the safe side).
>
> Using the permutations.pl script from
> http://www.perlmonks.org/?node_id=29374 :
>
> perl permutations.pl /dev/md0 /dev/md1 /dev/md2 /dev/md3 /dev/md4 missing \
>     | xargs -n 6 ./attempt.sh 2>&1 | tee output2.txt
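>
> (Assuming permutations.pl prints full permutations of its six arguments,
> one per line, each 6-token line becomes a call such as
>
> ./attempt.sh /dev/md0 /dev/md2 missing /dev/md1 /dev/md3 /dev/md4
>
> ... and since attempt.sh only uses its first five arguments, every
> ordered choice of 5 of the 6 tokens gets tried exactly once.)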
>
> Where attempt.sh looks like this:
>
> #!/bin/bash
>
> # Try one drive ordering ($1..$5, one of them "missing") against each
> # RAID5 parity layout, and see whether the result looks like the old array.
> lev=5
> for layout in ls la rs ra
> do
>   for c in 64    # chunk size in KB - 64 seemed to be the BIOS default
>   do
>     echo
>     echo "level: $lev  alg: $layout  chunk: $c  order: $1 $2 $3 $4 $5"
>     # "echo y" answers mdadm's are-you-sure prompt
>     echo y | ./mdadm-3.1.2/mdadm --create /dev/md5 -e 0.9 --chunk=${c} \
>         -l $lev -n 5 --layout=${layout} --assume-clean \
>         $1 $2 $3 $4 $5 > /dev/null 2>&1
>     # A plausible assembly shows the swap partition (Id=82) in the
>     # partition table; if so, try a read-only fsck of the data partition.
>     sfdisk -d /dev/md5 2>&1 | grep 'Id=82' && sleep 4 && fsck.ext3 -v -n /dev/md5p1
>     mdadm -S /dev/md5
>   done
> done
>
>
> ... so this assembles a v0.9 metadata md array (which puts its metadata
> at the end), and then looks for a Linux swap partition in the partition
> table, and tries a read-only fsck of the data partition.
>
> A chunk size of 64 seemed to be the default for the BIOS, but I did
> originally try others.  Anyway, this came up with two layouts which
> looked kind-of-OK (which is what I was expecting, as I assume that first
> one drive failed, and then a second); both used the left-asymmetric
> parity layout.
>
> ... but e2fsck came up with loads of errors, and although the directory
> structure ended up largely intact, the contents of most files were wrong
> - so there must be something else which is a bit different about the way
> that these aacraids lay out their data - maybe something discontinuous
> about the array, or something?  After I'd completed the job I didn't
> have time to compare the linux-software-raid reconstructed image with
> the aacraid-hw-raid reconstructed version, but that would be easy enough
> to do using some test data....
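>
> If anyone does try that comparison, something as crude as cmp over the
> two images should show where they diverge (the file names here are
> hypothetical):
>
> # the offset of the first difference hints at where the aacraid layout
> # differs from the md reconstruction
> cmp aacraid_rebuild.img md_rebuild.img
> # or list the first few differing byte offsets
> cmp -l aacraid_rebuild.img md_rebuild.img | head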
>
> I've posted this detail here in case someone is faced with having to
> attempt a similar job but can't get the controller to put the data back
> together - or perhaps for someone trying this with drives from a
> different HW RAID controller - in which case this method might Just
> Work (tm).
>
> Similarly if anyone else can see anything obvious which I did wrong,
> please shout!
>
> Cheers,
>
> Tim.
>
> --
> South East Open Source Solutions Limited
> Registered in England and Wales with company number 06134732.
> Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
> VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309
>