OMSA continually reports power supply issues

Wayne_Weilnau at Dell.com
Tue Sep 6 23:41:42 CDT 2011


Chuck,
The messages for event ID 1151 have a status of "unknown".  My guess (without getting somebody to look at the code) is that either the OMSA agent is unable to retrieve readings from the iDRAC/BMC, or the iDRAC/BMC is unable to retrieve readings from the power supplies.  The recovery messages arrive within a few minutes of the failure messages, while the failures themselves can be hours apart, which leads me to further suspect a firmware bug, most likely in the power supplies.  A few questions:

1.  Are you seeing any other monitoring errors?
2.  If you look at the hardware log (SEL) via OMSA or iDRAC, do you see any of these power supply events?
3.  If you swap power supplies with your good system, does the problem follow the power supply?
4.  Do your working supplies have the same version of firmware?
5.  If it is possible to connect the problem system to 110V, do you still see issues?
6.  What is the FRU data for the power supplies (manufacturer and model) on the failing system?  What about the good system?  (We may have multiple suppliers and the issue could be specific to the supplier or firmware version.)
7.  What version of iDRAC FW and OMSA software are you using?
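
The timing pattern described above (recoveries within minutes, failures hours apart) can be checked against the syslog excerpt quoted below. What follows is only a minimal sketch, not an OMSA tool: the regular expression is an assumption based on the exact line format in this thread, the year 2011 is assumed because syslog omits it, and the function names parse_events and outage_durations are made up for illustration. It pairs each 1151 "value unknown" event with the next 1152 "returned to normal" event for the same sensor and reports how long the sensor stayed unknown.

```python
import re
from datetime import datetime

# Pattern for the OMSA syslog lines quoted in this thread (an assumption:
# other OMSA versions or syslog daemons may format these lines differently).
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"Server Administrator:.*?EventID:\s*(?P<event>\d+)"
    r".*?Sensor location:\s*(?P<sensor>.+?)\s*#012"
)


def parse_events(lines, year=2011):
    """Return (timestamp, event_id, sensor) tuples for lines that match."""
    events = []
    for line in lines:
        # Drop any "> " quoting prefix left over from the email.
        match = LINE_RE.search(line.lstrip("> "))
        if match is None:
            continue
        stamp = " ".join(match.group("ts").split())  # collapse "Sep  6" padding
        # %b expects English month abbreviations (C/English locale assumed).
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        events.append((when, int(match.group("event")), match.group("sensor")))
    return events


def outage_durations(events):
    """Pair each 1151 (unknown) with the next 1152 (normal) per sensor."""
    open_since = {}
    outages = []
    for when, event_id, sensor in sorted(events):
        if event_id == 1151:
            open_since[sensor] = when
        elif event_id == 1152 and sensor in open_since:
            start = open_since.pop(sensor)
            outages.append((sensor, start, when - start))
    return outages


# Two lines copied from the log excerpt in this thread.
SAMPLE = [
    "> Sep  6 13:22:29 hostname Server Administrator: Instrumentation Service "
    "EventID: 1151  Voltage sensor value unknown #012Sensor location: PS 1 "
    "Voltage #012Chassis location: Main System Chassis #012Previous state "
    "was: OK (Normal) #012Voltage sensor value (in Volts): 0.000",
    "> Sep  6 13:26:57 hostname Server Administrator: Instrumentation Service "
    "EventID: 1152  Voltage sensor returned to a normal value #012Sensor "
    "location: PS 1 Voltage #012Chassis location: Main System Chassis "
    "#012Previous state was: Unknown #012Voltage sensor value (in Volts): "
    "118.000",
]

if __name__ == "__main__":
    for sensor, start, length in outage_durations(parse_events(SAMPLE)):
        print(f"{sensor}: unknown from {start:%H:%M:%S} for {length}")
```

To run it against a real log, pass the lines of the syslog file to parse_events instead of SAMPLE. Consistently short unknown intervals on both supplies would fit the idea of a polling or firmware glitch rather than a real loss of input power, though the script itself cannot tell the two apart.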

I have not seen this issue reported elsewhere, but the technical support staff is more likely to have seen this type of issue than I am.  In general, I would recommend making sure you are at the latest iDRAC and power supply firmware versions.  Technical support may be able to give you more timely and accurate advice than I can, though I am not sure how receptive they will be to your request, since you are running a distro that is not officially supported.

Wayne Weilnau
Systems Management Technologist
Dell | OpenManage Software Development 


-----Original Message-----
From: linux-poweredge-bounces-Lists On Behalf Of Chuck Anderson
Sent: Tuesday, September 06, 2011 5:47 PM
To: linux-poweredge-Lists
Subject: Re: OMSA continually reports power supply issues

BTW, this is Scientific Linux 6.1 (RHEL 6.1 clone) with
srvadmin-all-6.5.0-1.1.1.el6.x86_64, running on a Dell PowerEdge R710.

And I have another pretty much identical R710 with the same setup
where this is NOT happening.  A notable difference is that one is
running on 208V instead of 120V.

On Tue, Sep 06, 2011 at 06:41:37PM -0400, Chuck Anderson wrote:
> OMSA is telling me both of my power supplies keep changing from 118
> Volts input to 0 Volts input.  I've checked and rechecked the power
> cords, reseated the power supplies, etc. but the logs still keep
> coming in.  The iDRAC reports no issues with the power supplies.  Has
> anyone else seen this?  Is this a software/firmware issue or some
> real hardware issue?
> 
> According to iDRAC, the power supplies have firmware 08.05.00:
> 
> Individual Power Supply Elements
>   Status  Location  Type  Input Wattage  Max Wattage  Online Status  FW Version
>           PS 1      AC    1080           870          Present        08.05.00
>           PS 2      AC    1080           870          Present        08.05.00
> 
> Sep  6 11:27:51 hostname Server Administrator: Instrumentation Service EventID: 1152  Voltage sensor returned to a normal value #012Sensor location: PS 1 Voltage #012Chassis location: Main System Chassis #012Previous state was: Unknown #012Voltage sensor value (in Volts): 118.000
> Sep  6 13:22:29 hostname Server Administrator: Instrumentation Service EventID: 1151  Voltage sensor value unknown #012Sensor location: PS 1 Voltage #012Chassis location: Main System Chassis #012Previous state was: OK (Normal) #012Voltage sensor value (in Volts): 0.000
> Sep  6 13:26:57 hostname Server Administrator: Instrumentation Service EventID: 1152  Voltage sensor returned to a normal value #012Sensor location: PS 1 Voltage #012Chassis location: Main System Chassis #012Previous state was: Unknown #012Voltage sensor value (in Volts): 118.000
> Sep  6 13:57:04 hostname Server Administrator: Instrumentation Service EventID: 1151  Voltage sensor value unknown #012Sensor location: PS 2 Voltage #012Chassis location: Main System Chassis #012Previous state was: OK (Normal) #012Voltage sensor value (in Volts): 0.000
> Sep  6 14:00:57 hostname Server Administrator: Instrumentation Service EventID: 1152  Voltage sensor returned to a normal value #012Sensor location: PS 2 Voltage #012Chassis location: Main System Chassis #012Previous state was: Unknown #012Voltage sensor value (in Volts): 118.000
> Sep  6 15:31:36 hostname Server Administrator: Instrumentation Service EventID: 1151  Voltage sensor value unknown #012Sensor location: PS 2 Voltage #012Chassis location: Main System Chassis #012Previous state was: OK (Normal) #012Voltage sensor value (in Volts): 0.000
> Sep  6 15:34:41 hostname Server Administrator: Instrumentation Service EventID: 1152  Voltage sensor returned to a normal value #012Sensor location: PS 2 Voltage #012Chassis location: Main System Chassis #012Previous state was: Unknown #012Voltage sensor value (in Volts): 118.000
> Sep  6 16:03:55 hostname Server Administrator: Instrumentation Service EventID: 1151  Voltage sensor value unknown #012Sensor location: PS 2 Voltage #012Chassis location: Main System Chassis #012Previous state was: OK (Normal) #012Voltage sensor value (in Volts): 0.000
> Sep  6 16:04:20 hostname Server Administrator: Instrumentation Service EventID: 1152  Voltage sensor returned to a normal value #012Sensor location: PS 2 Voltage #012Chassis location: Main System Chassis #012Previous state was: Unknown #012Voltage sensor value (in Volts): 118.000
> Sep  6 18:02:10 hostname Server Administrator: Instrumentation Service EventID: 1151  Voltage sensor value unknown #012Sensor location: PS 1 Voltage #012Chassis location: Main System Chassis #012Previous state was: OK (Normal) #012Voltage sensor value (in Volts): 0.000
> Sep  6 18:04:40 hostname Server Administrator: Instrumentation Service EventID: 1152  Voltage sensor returned to a normal value #012Sensor location: PS 1 Voltage #012Chassis location: Main System Chassis #012Previous state was: Unknown #012Voltage sensor value (in Volts): 118.000
> 
> Thanks,
> Chuck

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
https://lists.us.dell.com/mailman/listinfo/linux-poweredge


