Disk fault with LSI MPT-SAS & MegaRAID

Patrick_Fischer at Dell.com
Tue Sep 29 08:50:18 CDT 2009


Hi,

If you are seeing SCSI events, the best thing to do is check the controller
log. You can get the log with our OpenManage tools or with MegaCli from LSI.
In the log, search for "fail" to find the failed disk, then look at the
lines just before it to see why it failed, e.g. bad LBA, timeout (non
critical), sense key...
If it is a timeout or the disk is in state "removed", you can reseat it. If
the root cause is a bad LBA or a sense key you do not recognise, mail the
log to your local Dell support for analysis.
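
The search described above can be sketched in Python. This is only an
illustration: the log format differs between controllers and firmware
versions, so the keyword lists, the advice strings and the function name
triage() are assumptions, not something taken from a real controller log.

```python
# Assumed keywords; adjust them to what your controller log really prints.
RESEAT_HINTS = ("timeout", "removed")
SUPPORT_HINTS = ("bad lba", "sense key")


def triage(log_lines, context=5):
    """Return (line_no, advice, context_lines) for every line containing 'fail'."""
    results = []
    for i, line in enumerate(log_lines):
        if "fail" not in line.lower():
            continue
        # The reason for the failure is usually logged a few lines earlier.
        before = log_lines[max(0, i - context):i]
        text = " ".join(before + [line]).lower()
        if any(hint in text for hint in SUPPORT_HINTS):
            advice = "send log to support"
        elif any(hint in text for hint in RESEAT_HINTS):
            advice = "try reseating the disk"
        else:
            advice = "unknown cause, read the context"
        results.append((i + 1, advice, before))
    return results
```

Note that the context window is not aware of which disk a line belongs to,
so with a large window an event from a neighbouring disk can influence the
advice. Treat the result as a pointer to where to read, not as a verdict.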

With OpenManage it is:

omreport storage controller => to get the controller ID
omconfig storage controller action=exportlog controller=id => to
gather the log (id is the controller ID from the first command)
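
If you want to script the two commands above, a minimal Python sketch could
look like the following. It assumes that the omreport output lists each
controller on a line of the form "ID : <number>"; check this against the
output on your own system. The function names are made up for this example.

```python
def controller_ids(omreport_output):
    """Return the values of all 'ID : <value>' lines in the report text."""
    ids = []
    for line in omreport_output.splitlines():
        key, sep, value = line.partition(":")
        # Only accept lines whose key is exactly "ID" (assumed output layout).
        if sep and key.strip() == "ID":
            ids.append(value.strip())
    return ids


def exportlog_command(controller_id):
    """Build the argument list for the export command quoted above."""
    return ["omconfig", "storage", "controller",
            "action=exportlog", f"controller={controller_id}"]
```

On a machine with OpenManage installed, the text for controller_ids() would
come from running "omreport storage controller", and each list returned by
exportlog_command() could be passed to subprocess.run().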

After about ten disks the log gets easy to read :-)

Another way is to use our diagnostic tools:
Offline => 32-bit diagnostics (for every Dell server)
Online => online diagnostics for PE servers (only with a supported OS)
If you need the links, please tell me.

Both tools test the disks with several algorithms and report green for OK
and red for "needs to be replaced" (in most cases).

If you always keep the RAID controller driver, its firmware and the disk
revision up to date, you can be nearly sure that a disk failure is a
hardware defect.

-----Original Message-----
From: linux-poweredge-bounces at lists.us.dell.com
[mailto:linux-poweredge-bounces at lists.us.dell.com] On Behalf Of
Henry-Nicolas Tourneur
Sent: 29 September 2009 14:52
To: linux-poweredge-Lists
Subject: Disk fault with LSI MPT-SAS & MegaRAID

Hi,

I have a question about 2 RAID controllers: the MegaRAID in my
PowerEdge 2900 (Symbios Logic MegaRAID SAS 1078) and the Fusion
MPT-SAS in my R710 (Symbios Logic SAS1068E PCI-Express).


The question is quite simple: what rule is used to decide that a drive
is faulty? My test case is simple too: I "hot remove" the disk, but
this is not realistic. In real life, I would like to know what
conditions are required to declare a drive faulty (e.g. more than x % of
bad sectors... I don't know).

Does anybody have an idea on this?

Regards,

-- 
Henry-Nicolas Tourneur

System Administrator

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at lists.us.dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq


