omreport segfault and OMSA with nagios many semaphore

shawn at systemtemplar.org shawn at systemtemplar.org
Fri Sep 23 00:50:23 CDT 2011


 > Hmm...
 >
 >  So I guess this is only happening to me?
 >
 >  Strange I have tried a firmware update in different version of OMSA
 >  and CentOS but all failed.

(Posting this again; I posted yesterday and 24 hours later it hasn't shown
up. My apologies to the list if this double-posts.)

Hello, I am having the exact same issue on a pair of R710s running
Scientific Linux 6.

I'm using the Nagios plugin check_openmanage, which is nothing more than
a Perl script that scrapes 'omreport' and puts the results into a
Nagios-readable format. With the number of checks I'm doing, this amounts
to running omreport 20 times every 5 minutes.

Each server has the exact same packages (managed with puppet) and I was
not having the problem until an update from last week (details
incoming). Since then I'm running out of semaphores to the point where I
had to set up the following cron.hourly script:

#!/bin/bash
## NOTE: I don't use semaphores for anything so I can happily clear them all. ymmv.
# List only semaphore arrays (-s), pick the key 0x0... / perms 600 rows, remove each by id.
for i in $(ipcs -s | grep "0x0" | grep 600 | awk '{print $2}'); do
    ipcrm -s "$i"
done > /dev/null 2>&1
/opt/dell/srvadmin/sbin/srvadmin-services.sh restart > /dev/null 2>&1
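
For anyone who wants to watch the leak rather than just clear it, here is a rough sketch that compares the number of semaphore arrays in use against the kernel limit. The helper names count_sem_arrays and semmni_from are made up for illustration; the only things it relies on are `ipcs -s` and the fourth field of /proc/sys/kernel/sem (SEMMNI, the maximum number of arrays).

```shell
#!/bin/bash
# Count semaphore arrays in `ipcs -s` output read from stdin.
# Data rows are the ones whose first field is a hex key such as 0x00000000.
count_sem_arrays() {
    awk '$1 ~ /^0x/ { n++ } END { print n + 0 }'
}

# Pull SEMMNI (max number of arrays) out of the four-field kernel.sem line.
semmni_from() {
    awk '{ print $4 }'
}

# Only report when both data sources are actually available on this host.
if [ -r /proc/sys/kernel/sem ] && command -v ipcs >/dev/null 2>&1; then
    used=$(ipcs -s | count_sem_arrays)
    limit=$(semmni_from < /proc/sys/kernel/sem)
    echo "semaphore arrays in use: $used of $limit"
fi
```

Run from cron or as a Nagios check, this would at least show how fast the count climbs between restarts.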

I was doing a cron.daily script, but I started getting too many false
alerts so switched to hourly. I'm ready to turn off OMSA and switch to
ipmitool.

/var/log/messages has a boatload of these:
Sep 21 17:01:01 ndb02 kernel: dsm_sa_datamgrd[25547]: segfault at 8 ip 00007fe5af91ceaa sp 00007fe5a6bdbde8 error 4 in libdsm_sm_queue.so[7fe5af917000+a000]
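
In case it helps anyone with debug symbols, the segfault line gives enough to work out where inside libdsm_sm_queue.so the crash happens: subtract the mapping base (the number before the "+" in the brackets) from the instruction pointer. A minimal sketch, assuming the log entry has been joined onto one line; fault_offset is just a name I picked.

```shell
#!/bin/bash
# Print the offset of the faulting instruction inside the mapped library,
# given one kernel segfault line (joined onto a single line).
fault_offset() {
    local line=$1
    # Capture the hex value after "ip" and the base address inside [base+size].
    local re='ip ([0-9a-f]+) .*\[([0-9a-f]+)\+[0-9a-f]+\]'
    if [[ $line =~ $re ]]; then
        printf '0x%x\n' $(( 0x${BASH_REMATCH[1]} - 0x${BASH_REMATCH[2]} ))
    else
        return 1
    fi
}

fault_offset 'Sep 21 17:01:01 ndb02 kernel: dsm_sa_datamgrd[25547]: segfault at 8 ip 00007fe5af91ceaa sp 00007fe5a6bdbde8 error 4 in libdsm_sm_queue.so[7fe5af917000+a000]'
# prints 0x5eaa
```

If every entry in /var/log/messages yields the same offset, that at least suggests a single repeatable crash site rather than random corruption.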

So. What's this about an update, and it was working before? Well sir.
This server was working fine for the past 3 months with no changes. We
get to an upgrade cycle, upgrade several packages, and now it's problematic.

Here's what we updated:

Packages Installed:
      kernel-devel-2.6.32-131.12.1.el6.x86_64
      kernel-2.6.32-131.12.1.el6.x86_64

Packages Updated:
      curl-7.19.7-26.el6_1.1.x86_64
      libcurl-7.19.7-26.el6_1.1.x86_64
      selinux-policy-targeted-3.7.19-93.el6_1.7.noarch
      nss-softokn-3.12.9-3.el6.x86_64
      perl-Compress-Zlib-2.020-119.el6.x86_64
      subversion-1.6.11-2.el6_1.4.x86_64
      1:dbus-libs-1.2.24-5.el6_1.x86_64
      nss-3.12.9-12.el6_1.x86_64
      4:perl-5.10.1-119.el6.x86_64
      32:bind-utils-9.7.3-2.el6_1.P3.2.x86_64
      32:bind-libs-9.7.3-2.el6_1.P3.2.x86_64
      1:perl-Pod-Simple-3.13-119.el6.x86_64
      pixman-0.18.4-1.el6_0.1.x86_64
      ca-certificates-2010.63-3.el6_1.5.noarch
      apr-1.3.9-3.el6_1.2.x86_64
      facter-1.6.0-2.el6.noarch
      perl-Compress-Raw-Zlib-2.023-119.el6.x86_64
      perl-Proc-ProcessTable-0.45-1.el6.rf.x86_64
      3:perl-version-0.77-119.el6.x86_64
      rsyslog-4.6.2-3.el6_1.2.x86_64
      perl-IO-Compress-Base-2.020-119.el6.x86_64
      elfutils-libelf-0.152-1.el6.x86_64
      ruby-1.8.7.299-7.el6_1.1.x86_64
      ruby-libs-1.8.7.299-7.el6_1.1.x86_64
      nss-sysinit-3.12.9-12.el6_1.x86_64
      nss-softokn-freebl-3.12.9-3.el6.x86_64
      openssl-1.0.0-10.el6.x86_64
      selinux-policy-3.7.19-93.el6_1.7.noarch
      nspr-4.8.7-1.el6.x86_64
      krb5-libs-1.9-9.el6_1.1.x86_64
      kernel-headers-2.6.32-131.12.1.el6.x86_64
      1:dbus-1.2.24-5.el6_1.x86_64
      nss-util-3.12.9-1.el6.x86_64
      python-libs-2.6.6-20.el6.x86_64
      perl-IO-Compress-Zlib-2.020-119.el6.x86_64
      freetype-2.3.11-6.el6_1.6.x86_64
      1:perl-Pod-Escapes-1.04-119.el6.x86_64
      kernel-firmware-2.6.32-131.12.1.el6.noarch
      sudo-1.7.4p5-5.el6.x86_64
      4:perl-libs-5.10.1-119.el6.x86_64
      2:postfix-2.6.6-2.2.el6_1.x86_64
      python-2.6.6-20.el6.x86_64
      tzdata-2011h-3.el6.noarch
      4:perl-Time-HiRes-1.9721-119.el6.x86_64
      2:libpng-1.2.46-1.el6_1.x86_64
      system-config-firewall-base-1.2.27-3.el6_1.3.noarch
      12:dhclient-4.1.1-19.P1.el6_1.1.x86_64
      nss-softokn-freebl-3.12.9-3.el6.i686
      1:perl-Module-Pluggable-3.90-119.el6.x86_64

What omsa are we running? Good question:

Packages Installed:
      srvadmin-base-6.5.0-1.1.1.el6.x86_64
      srvadmin-xmlsup-6.5.0-1.141.2.el6.x86_64
      srvadmin-omacore-6.5.0-1.143.2.el6.x86_64
      srvadmin-deng-6.5.0-1.31.1.el6.x86_64
      srvadmin-isvc-6.5.0-1.52.2.el6.x86_64
      sysfsutils-2.1.0-6.1.el6.x86_64
      srvadmin-storelib-sysfs-6.5.0-1.1.1.el6.x86_64
      srvadmin-smcommon-6.5.0-1.201.1.el6.x86_64
      ipmitool-1.8.11-99.dell.1.117.1.el6.x86_64
      srvadmin-sysfsutils-6.5.0-1.1.el6.x86_64
      srvadmin-storage-6.5.0-1.201.1.el6.x86_64
      srvadmin-omcommon-6.5.0-1.142.2.el6.x86_64
      srvadmin-hapi-6.5.0-1.33.2.el6.x86_64
      libsysfs-2.1.0-6.1.el6.x86_64
      srvadmin-omilcore-6.5.0-1.396.1.el6.noarch
      srvadmin-storageservices-6.5.0-1.1.1.el6.x86_64
      libsmbios-2.2.26-6.1.el6.x86_64
      smbios-utils-bin-2.2.26-6.1.el6.x86_64
      srvadmin-storelib-6.5.0-1.326.1.el6.x86_64

What else have you done? We uninstalled OMSA by running yum remove on
the OMSA packages listed above, then reinstalled. Same problem.

BIOS=2.3.12 01/24/2011
iDRAC6=1.54

I am aware of the 3.0.0 BIOS being out, but policy prevents me from
upgrading it for several weeks. I don't think this is a BIOS-related
bug, as things were working great 2 weeks ago. One of the packages that
we have upgraded clearly doesn't get along with the current version of
OMSA. The question is, which?
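
One way to narrow the list down might be to see which packages own the libraries the crashing daemon actually links against, then compare that against the updated packages. This is only a sketch: the path to dsm_sa_datamgrd is my assumption (adjust it to wherever the binary really lives), and lib_paths is a made-up helper name.

```shell
#!/bin/bash
# Extract resolved library paths from `ldd` output read on stdin.
# Only lines of the form "libfoo.so => /path/libfoo.so (0x...)" are kept.
lib_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Assumed location of the crashing daemon; adjust to the real path.
bin=/opt/dell/srvadmin/sbin/dsm_sa_datamgrd

if [ -x "$bin" ] && command -v rpm >/dev/null 2>&1; then
    # Map each linked library to its owning package and de-duplicate.
    ldd "$bin" | lib_paths | xargs -r rpm -qf | sort -u
fi
```

Anything that shows up both here and in the updated list would be a first suspect. It won't catch a kernel-side change, so the new kernel stays on the list either way.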
