alarms [was: Re: esm module and ntp problem]

Harold van Oostrom pedge at lanceerplaats.nl
Tue Jan 14 11:05:01 CST 2003


Brian, Michael,

On Tue, Jan 14, 2003 at 04:23:39PM +0100, Michael Redinger wrote:
> 
> Bad news:
> dcstor32d seems to be responsible for much of the core functionality of 
> OMSA.
> If you disable it, you don't get any information about the system 
> (components, service tag etc.) and, even worse, you don't get any alarms 
> any more!

That depends. By accident I found out that once you have e-mail
warnings configured, the RAC will send alerts all on its own
*even when the system is down*. No OMSA required (except maybe for
the initial configuration).

What happened was this:
I set up email alerts with OMSA 1.0 (DOMN31_A00.tar).
Then a little later I installed a newer OMSA release, 1.2.0 (DOMN32A00.tar).
This release has a bug that causes erroneous warnings to be sent
(about system voltage in my case), so I was getting an email every
two or three minutes. Much to my surprise, the card kept sending alerts
after the server was powered off.

I installed the fix (the latest OMSA release already has that fix)
and the messages stopped. I powered off the server and, by way of a test,
removed the power cord, and again it began sending alerts ..

This is what the alert looks like:

Date: Fri, 20 Dec 2002 03:58:18 +0100
Subject: Alert from ERA/O: 192.168.1.123
To: alert at localhost

Message: WARNING
Event: Power Supply 2 power supply sensor Power lost
Date: 20-dec-2002
Time: 05:02:40
Severity: Warning
System ID:
Model: PowerEdge 1650
BIOS version: A08
Asset tag: 
Service tag: 459PT1G
OS Type: 64-bit Unknown
Hostname: orange
OS Name: Linux
ESM Version: 1.48  Dell Computer Corp.

The warning about system voltage had this `Event' line:
Event: +2.5 voltage sensor detected a warning (2.580 V)
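
For anyone who wants to filter or log these mails, here is a minimal
Python sketch that splits an alert body into fields. It assumes the
body is always plain "Key: value" lines as in the sample above; the
function name parse_alert is made up for illustration, and the sample
is trimmed to a few of the fields shown.

```python
def parse_alert(body: str) -> dict:
    """Split the 'Key: value' lines of an alert body into a dict."""
    fields = {}
    for line in body.splitlines():
        # Split on the first colon only, so "Time: 05:02:40" stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


# Sample body copied from the alert quoted above (shortened).
sample_alert = """\
Message: WARNING
Event: Power Supply 2 power supply sensor Power lost
Date: 20-dec-2002
Time: 05:02:40
Severity: Warning
System ID:
Model: PowerEdge 1650
Service tag: 459PT1G
Hostname: orange
"""

alert = parse_alert(sample_alert)
print(alert["Hostname"], alert["Severity"], alert["Event"])
```

Empty fields such as "System ID:" come back as empty strings, so a
filter can tell "not set" apart from "line missing".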

I have not verified that it does the same when the temperature
crosses a threshold but I have no reason to believe it wouldn't.

dellomsa works quite unreliably for me anyway.

I mean, can you explain this:
Two identically configured systems.
-------------------------------------------------------
host1> ~# omreport chassis info index=0
Chassis Information

Index                                   : 0
Chassis Name                            : Main System Chassis
Host Name                               : host1
Baseboard Management Controller Version : 1.48
Primary Backplane Version               : 0.28
Sensor Data Records Version             : SDR Version 0.26
Chassis Model                           : PowerEdge 1650             
Chassis Lock                            : Present
Service Tag                             : 635PT34
Chassis Asset Tag                       : FT83
-------------------------------------------------------
host2> ~# omreport chassis info index=0 
Error! Hardware or feature not present.
-------------------------------------------------------
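
A difference like this is easy to catch automatically across a set of
machines. Below is a rough Python sketch that works on captured
omreport output. It assumes the "Label : value" layout and the
"Error!" prefix seen above, which may differ in other OMSA releases.
The name parse_omreport is hypothetical, and the sample is a shortened
copy of the host1 output.

```python
def parse_omreport(text: str):
    """Return a dict of 'Label : value' pairs, or None on an 'Error!' line."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("Error!"):
            return None
        # Labels are padded with spaces before the " : " separator.
        label, sep, value = line.partition(" : ")
        if sep:
            fields[label.strip()] = value.strip()
    return fields


# Output copied from the two hosts above (host1 shortened).
host1_output = """\
Chassis Information

Index                                   : 0
Chassis Name                            : Main System Chassis
Host Name                               : host1
Chassis Model                           : PowerEdge 1650
Service Tag                             : 635PT34
"""
host2_output = "Error! Hardware or feature not present.\n"

for name, output in (("host1", host1_output), ("host2", host2_output)):
    info = parse_omreport(output)
    if info is None:
        print(name, "omreport returned an error")
    else:
        print(name, info["Chassis Model"], info["Service Tag"])
```

Returning None on an error rather than an empty dict keeps "omreport
failed" distinct from "omreport ran but printed no fields".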

Cheers,
Harold.

> So, this is definitely no solution, not even a workaround ...
> 
> Just wanted you to know about the status.
> 
> I'm back again at the support hotline. My next try is to flash BIOS, ERA 
> and the backplane to see if anything changes (don't think so, but the 
> support guy wants me to try this ...).
> 
> Will be back with more news.
> 
> Michael
> 
> On Tue, 14 Jan 2003, Brian Smith wrote:
> 
> > Do you know what the function of "dcstor32d" is?   I assume "dcsnmp32d" is
> > for the snmp support.  And I assume "dcevt32d" is for the event logging in
> > the /var/log/messages file when the status of something changes.  Does
> > anyone know what each daemon does?
> 
> 
> > Some news:
> > 
> > This is based on information provided by the Dell Support Hotline (thanks)
> > and it seems as if I can verify this:
> > 
> > The problem is _not_ the esm module (directly, at least) but dcstor32d.
> > 
> > When I kill dcstor32d (and restart ntpd), things seem to work fine (for
> > some hours now ...).
> > 
> > (However, if I disable dcstor32d, I basically lose all of OMSA's
> > functionality ...)
[ snip ]



