PV220S in a bad state. Recovery advice needed

Steve_Boley@Dell.com Steve_Boley at Dell.com
Tue May 18 11:52:01 CDT 2004


The hot spare at id15 jumped in for id4.  I'd retag all the
arrays again, fail id4 without setting any hot spares, bring
it up, and see if the data is accessible.

It might be that id15 failed and caused the array to drop when
it tried to fail over to it.  When you have a hot spare, there
is no way to see whether the spare drive itself has had issues;
it could be bad, and you would not know until a failure takes
place and the controller tries to fail over to it.

The way to retag is to clear the configuration, recreate the
arrays exactly as they were originally created, and then go to
objects/physical drive and force id4 offline.

Boot and see if the data is accessible.  If it is, I recommend
trying to rebuild id4 first; if that fails, move id14 to id4's
place and rebuild with the new drive.

If id4 won't rebuild, call and have id4 and id15 replaced.
Steve

-----Original Message-----
From: linux-poweredge-admin at dell.com [mailto:linux-poweredge-admin at dell.com] On Behalf Of Philippe Gramoullé
Sent: Tuesday, May 18, 2004 10:56 AM
To: Poweredge Mailing List
Subject: PV220S in a bad state. Recovery advice needed


 Hi,

We have a 2650+PV220s that broke minutes ago. Here's the problem:

The PV220S is in a RAID 5 setup: 2 RAID sets of 6 disks each, spanned into a logical volume of about 670 GB.

Before the SCSI errors happened, the setup was as follows:

 0°  ONLINE A00-00
 1°  ONLINE A00-01
 2°  ONLINE A00-02
 3°  ONLINE A00-03
 4°  ONLINE A00-04
 5°  ONLINE A00-05
 6°  PROC
 7°
 8°  ONLINE A01-00
 9°  ONLINE A01-01
 10° ONLINE A01-02
 11° ONLINE A01-03
 12° ONLINE A01-04
 13° ONLINE A01-05
 14° SPARE
 15° SPARE

Now the layout looks like this:

 0°  FAIL   A00-00
 1°  ONLINE A00-01
 2°  ONLINE A00-02
 3°  ONLINE A00-03
 4°  READY
 5°  ONLINE A00-05
 6°  PROC
 7°
 8°  ONLINE A01-00
 9°  ONLINE A01-00
 10° ONLINE A01-00
 11° ONLINE A01-00
 12° ONLINE A01-00
 13° ONLINE A01-00
 14° READY
 15° FAIL   A00-04
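
To make the difference between the two layouts easier to see, here is a minimal Python sketch that compares the per-ID states and lists the array members that are no longer online. It is only an illustration: the helper names (parse_layout, changed_states, failed_members) are made up for this example and are not part of any Dell or PERC tool, and the two tables are typed in by hand from the listings above with the degree signs dropped. Only the state column is compared, because the member labels for ids 9 to 13 differ between the two listings as posted.

```python
# Hypothetical helpers for comparing the two layouts posted above.
# Nothing here talks to the controller; the tables are copied by hand.

BEFORE = """\
0 ONLINE A00-00
1 ONLINE A00-01
2 ONLINE A00-02
3 ONLINE A00-03
4 ONLINE A00-04
5 ONLINE A00-05
6 PROC
7
8 ONLINE A01-00
9 ONLINE A01-01
10 ONLINE A01-02
11 ONLINE A01-03
12 ONLINE A01-04
13 ONLINE A01-05
14 SPARE
15 SPARE
"""

AFTER = """\
0 FAIL A00-00
1 ONLINE A00-01
2 ONLINE A00-02
3 ONLINE A00-03
4 READY
5 ONLINE A00-05
6 PROC
7
8 ONLINE A01-00
9 ONLINE A01-00
10 ONLINE A01-00
11 ONLINE A01-00
12 ONLINE A01-00
13 ONLINE A01-00
14 READY
15 FAIL A00-04
"""


def parse_layout(text):
    """Return {scsi_id: (state, member)}; an empty slot gets state 'EMPTY'."""
    layout = {}
    for line in text.strip().splitlines():
        fields = line.split()
        scsi_id = int(fields[0])
        state = fields[1] if len(fields) > 1 else "EMPTY"
        member = fields[2] if len(fields) > 2 else None
        layout[scsi_id] = (state, member)
    return layout


def changed_states(before, after):
    """Return {scsi_id: (old_state, new_state)} for ids whose state changed."""
    return {
        scsi_id: (before[scsi_id][0], after[scsi_id][0])
        for scsi_id in sorted(before)
        if before[scsi_id][0] != after[scsi_id][0]
    }


def failed_members(layout):
    """Group array members that are not ONLINE by array name, e.g. 'A00'."""
    failed = {}
    for state, member in layout.values():
        if member is not None and state != "ONLINE":
            array = member.split("-")[0]
            failed.setdefault(array, []).append(member)
    return failed


if __name__ == "__main__":
    before, after = parse_layout(BEFORE), parse_layout(AFTER)
    for scsi_id, (old, new) in changed_states(before, after).items():
        print(f"id{scsi_id}: {old} -> {new}")
    for array, members in failed_members(after).items():
        print(f"{array}: {len(members)} member(s) not online: {members}")
```

Run as a script, this reports state changes on ids 0, 4, 14 and 15, and two members of A00 (A00-00 and A00-04) that are not online. A RAID 5 set tolerates the loss of only one member, so two failed members in the same set are enough to explain the volume being offline.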

What I think is that the spare drives were somehow broken, so that when the rebuild started after A00-01 and/or A00-04 broke, the volume went offline.

I'm used to plugging/unplugging the shelf and redoing the config manually, so it isn't a problem if that is the way to go.

I'd rather avoid using the "Force Disk Online" option, as it has always corrupted the filesystem before and required an fsck.

Any suggestions are welcome.

Thanks,

Philippe

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq or search the list archives at http://lists.us.dell.com/htdig/