Disk fault with LSI MPT-SAS & MegaRAID

Henry-Nicolas Tourneur hntourneur at mactelecom.com
Tue Sep 29 09:05:46 CDT 2009


Hi,

First, thank you for your reply.
With the R710, we are using another driver, the mptsas one.

You told me that MegaCli will report a failure in case of a bad LBA, a
timeout or a sense key problem.
Is this also true for the SAS1068E?

I'm not using OpenManage, and I don't think Debian Lenny is one of
your officially supported OSes.

The only thing I want is to be sure that either the MegaCli or the
mpt-status tool will report a failure
in realistic cases (as you said: bad LBA, timeout, sense key ...) and
not only in "theoretical" cases such as hard drive removal.


Thank you for your help,

Regards,

On Tuesday 29 September 2009 at 14:50 +0100, Patrick_Fischer at Dell.com
wrote:

> Hi,
> 
> If you know some SCSI events, the best approach is to check the
> controller log. You can get the log with our OpenManage or with MegaCli
> from LSI. There you can search for "fail", find the failed disk and
> check why it failed (in the log, some lines before). E.g. bad LBA,
> timeout (non critical), sense key...
> If it is a timeout or the disk is in state removed, you can reseat it.
> If the root cause is a bad LBA or a sense key you do not recognise, mail
> the log to your local Dell support for analysis.
> 
> For open manage it is:
> 
> omreport storage controller => to get the controller id
> omconfig storage controller action=exportlog controller="id" => to
> gather the log
> 
> After 10 disks it is easy to read :-)
> 
> Another way is our diagnostic tool:
> Offline => 32 Bit diag (for every dell server)
> Online => online diagnostics for PE servers (only with supported OS)
> If you need the links please tell me.
> 
> Both tools test the disks with some algorithms and say green for ok and
> red for needs to be replaced (in most cases).
> 
> If you keep the RAID controller driver, its firmware and the disk
> revision always up to date, you can be nearly sure that a disk
> failure is a hardware defect.
> 
> -----Original Message-----
> From: linux-poweredge-bounces at lists.us.dell.com
> [mailto:linux-poweredge-bounces at lists.us.dell.com] On Behalf Of
> Henry-Nicolas Tourneur
> Sent: 29 September 2009 14:52
> To: linux-poweredge-Lists
> Subject: Disk fault with LSI MPT-SAS & MegaRAID
> 
> Hi,
> 
> I have a question about two RAID controllers: first, the MegaRAID I
> have in a PowerEdge 2900 (Symbios Logic MegaRAID SAS 1078) and second,
> the Fusion MPT-SAS I have in an R710 (Symbios Logic SAS1068E
> PCI-Express).
> 
> 
> The question is quite simple: what rule is used to decide that a drive
> is faulty? My test case is quite simple: I "hot remove" the disk, but
> this is not realistic. In real life, I would like to know what
> conditions are required to declare a drive faulty (e.g. more than x % of
> bad sectors ... I don't know).
> 
> Does anybody have an idea on this ?
> 
> Regards,
> 
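
The log check described in the quoted reply (search for "fail", then read
the few lines before it) could be sketched roughly as below. This is only
an illustration in Python: the function name find_failure_context and the
sample log lines are hypothetical, and a real exported controller log will
be formatted differently.

```python
from collections import deque


def find_failure_context(lines, keyword="fail", context=3):
    """Return (line_number, preceding_lines, matching_line) for each hit.

    The match is a case-insensitive substring search; `context` is the
    number of earlier lines kept for each hit.
    """
    recent = deque(maxlen=context)  # sliding window of the last lines seen
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if keyword.lower() in text.lower():
            hits.append((number, list(recent), text))
        recent.append(text)
    return hits


# Hypothetical sample only; not the output of any real tool.
SAMPLE_LOG = """\
event 1: PD 04 command timeout
event 2: PD 04 sense key 3
event 3: PD 04 state changed to failed
event 4: rebuild started on PD 05
"""

if __name__ == "__main__":
    for number, before, line in find_failure_context(SAMPLE_LOG.splitlines()):
        print(f"line {number}: {line}")
        for earlier in before:
            print(f"    earlier: {earlier}")
```

To use it on a real log, one would pass the lines of the exported file
instead of SAMPLE_LOG, and probably raise `context` so that the timeout or
sense key entries leading up to the failure stay visible.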

-- 
Henry-Nicolas Tourneur