Re: Lots of bad disks with PE1955's running RHEL4?

Jie_Tang at Dell.com Jie_Tang at Dell.com
Wed Aug 8 03:27:36 CDT 2007


Hi, Hruby

Please be aware that Oracle has a patch for the Oracle 10g Grid Control agent; this patch also fixes the SAS hard disk offline issue on the PE1955.




Subject: Grid Control Agent Corrupting RAID Disk Mirroring
Doc ID: Note:421908.1
Type: PROBLEM
Last Revision Date: 01-APR-2007
Status: PUBLISHED
In this Document
  Symptoms
  Cause
  Solution
  References
________________________________________
Applies to: 
Enterprise Manager Grid Control - Version: 10.2.0.3
Linux x86-64
Linux x86
Symptoms
When the agent is started, it corrupts RAID disk mirroring. The syslog shows messages of the following format:
Mar 26 14:05:33 tggsoa01 kernel: mptbase: ioc0: RAID STATUS CHANGE for PhysDisk 1 
Mar 26 14:05:33 tggsoa01 kernel: mptbase: ioc0: PhysDisk is now failed 
Mar 26 14:05:33 tggsoa01 kernel: mptbase: ioc0: RAID STATUS CHANGE for PhysDisk 1 
Mar 26 14:05:33 tggsoa01 kernel: mptbase: ioc0: PhysDisk is now failed, out of sync 
Mar 26 14:05:33 tggsoa01 kernel: mptbase: ioc0: RAID STATUS CHANGE for VolumeID 0 
Mar 26 14:05:33 tggsoa01 kernel: mptbase: ioc0: volume is now degraded, enabled
A number of errors appear in the file $ORACLE_HOME/sysman/sysman/log/emagent_perl.trc;
the disks then report a problem and the RAID controller splits the mirror.
This problem does not occur on all platforms. This particular instance surfaced on 32-bit Linux installed on an IBM Blade HS21.
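To check whether a host shows the same symptom, the syslog can be scanned for the mptbase messages quoted above. The following is only a minimal Python sketch, not part of the Oracle note: the regular expression is derived solely from the six sample lines, and the function names find_raid_events and has_degraded_volume are hypothetical.

```python
import re

# Pattern derived only from the sample syslog lines in the note (an assumption:
# other mptbase versions may word these messages differently).
MPT_EVENT = re.compile(
    r"kernel: mptbase: (ioc\d+): "
    r"(RAID STATUS CHANGE for .+|PhysDisk is now .+|volume is now .+)"
)


def find_raid_events(lines):
    """Return (controller, message) pairs for mptbase RAID status lines."""
    events = []
    for line in lines:
        match = MPT_EVENT.search(line)
        if match:
            events.append((match.group(1), match.group(2).strip()))
    return events


def has_degraded_volume(lines):
    """True if any line reports a RAID volume entering the degraded state."""
    return any(
        message.startswith("volume is now degraded")
        for _, message in find_raid_events(lines)
    )
```

On RHEL4 the kernel messages normally land in /var/log/messages (an assumption about the local syslog configuration), so passing the lines of that file to has_degraded_volume gives a quick yes/no answer per host.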
Cause
There is a known issue for Storage Array Metric Collections.
Solution
Applying the one-off backport fixed the problem.
Apply Patch 5713547 to AGENT_HOME
References
Bug 5713547 - STORAGE_REPORT_METRICS.PL IS CORRUPTING RAID DISK MIRRORING



Subject: Re: Lots of bad disks with PE1955's running RHEL4?

On 8/1/07, Nathan Hruby <nhruby at gmail.com> wrote:
> Howdy,
>
> Over the past few weeks we've had an alarming number of disks kicked
> out of the SAS RAID controllers on our PE1955's running RHEL4-u4 (I
> think a dozen replacements in the past 6 weeks, and this morning 7
> machines are seemingly unhappy).  Sadly, the Win2k03 and VMware 1955's
> (and everything else) are very happy though so I don't think it's an
> environmental problem with our datacenter but clearly something with
> our deployment, workload, hardware, or some entertaining combination
> of the 3.
>
> Has anyone else seen these kinds of issues with their 1955's?
>
> Also, when I arrived here I was surprised to find that all of these
> systems were using the mpt drivers (including mptsas) instead of the
> megaraid_sas driver.  That sounds like an immediate red flag to me,
> but it's been a long while since I used Dell hardware, so what do I
> know?  That might be right considering lspci doesn't see anything
> PERCish.
>
> Thanks for any insight one might have to give :)

Reposting because we're confounded by this issue.  We have one machine
that is now running the latest and greatest of everything (RHEL,
Firmware, BIOS, SAS Backplane firmware, BMC, Drive Firmware, etc..)
and are still having this issue of disks getting kicked out of the
arrays.

Dell support also seems vexed by this issue.

Anyone have a thought?

-n
-- 
-------------------------------------------
nathan hruby <nhruby at gmail.com>
metaphysically wrinkle-free
-------------------------------------------

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq


