omreport segfault and OMSA with nagios many semaphore

shawn at systemtemplar.org shawn at systemtemplar.org
Thu Sep 22 03:03:57 CDT 2011


>  Hmm...
>
>  So I guess this is only happening to me?
>
>  Strange I have tried a firmware update in different version of OMSA
>  and CentOS but all failed.

Hello, I am having the exact same issue on a pair of R710s running
Scientific Linux 6.

I'm using the Nagios plugin check_openmanage, which is nothing more than
a Perl script that scrapes 'omreport' output and puts the results into a
Nagios-readable format. With the number of checks I'm doing, this amounts
to running omreport 20 times every 5 minutes.

Each server has the exact same packages (managed with Puppet), and I was
not having the problem until an update last week (details below). Since
then I've been running out of semaphores, to the point where I had to
set up the following cron.hourly script:

#!/bin/bash
## NOTE: I don't use semaphores for anything so I can happily clear them all. ymmv.
# Remove every semaphore array with a 0x0... key and 600 perms.
for i in $(ipcs -s | grep "0x0" | grep 600 | awk '{print $2}'); do ipcrm -s "$i"; done > /dev/null 2>&1
/opt/dell/srvadmin/sbin/srvadmin-services.sh restart > /dev/null 2>&1
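
Before clearing anything, it can help to see how fast the semaphore
arrays are actually piling up. Below is a minimal sketch, not part of my
cron job: the function name count_sem_arrays is made up for illustration,
and it assumes the usual Linux 'ipcs -s' layout where each array row
starts with a hex key.

```shell
#!/bin/bash
# Hypothetical helper: count semaphore arrays in 'ipcs -s' style output.
# Reads from stdin so it can be fed live output or a saved copy.
count_sem_arrays() {
    # Data rows start with a hex key such as 0x00000000; the banner,
    # column header, and blank lines do not, so they are skipped.
    awk '$1 ~ /^0x[0-9a-fA-F]+$/ { n++ } END { print n + 0 }'
}

# Example (run by hand):
#   ipcs -s | count_sem_arrays
```

Comparing that number with the fourth field of /proc/sys/kernel/sem
(SEMMNI, the maximum number of semaphore arrays) should show how close
the box is to running out.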

I was using a cron.daily script, but I started getting too many false
alerts, so I switched to hourly. I'm ready to turn off OMSA and switch to
ipmitool.

/var/log/messages has a boatload of these:
Sep 21 17:01:01 ndb02 kernel: dsm_sa_datamgrd[25547]: segfault at 8 ip 00007fe5af91ceaa sp 00007fe5a6bdbde8 error 4 in libdsm_sm_queue.so[7fe5af917000+a000]
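
To put a number on "a boatload", something like the following could
tally the segfaults per process and library. This is only a sketch: the
function name segfault_summary is made up, and it assumes every kernel
segfault line follows the same "proc[pid]: segfault at ... in lib[...]"
shape as the one above.

```shell
#!/bin/bash
# Hypothetical helper: summarize kernel segfault lines read from stdin.
# Prints one "count process library" line per process/library pair.
segfault_summary() {
    awk '/segfault at/ {
        proc = ""; lib = ""
        for (i = 1; i <= NF; i++) {
            # The field before "segfault" is "proc[pid]:".
            if ($i == "segfault" && i > 1) proc = $(i - 1)
            # The field after "in" is "lib[base+size]".
            if ($i == "in" && i < NF) lib = $(i + 1)
        }
        sub(/\[.*/, "", proc)   # drop the [pid]: suffix
        sub(/\[.*/, "", lib)    # drop the [base+size] suffix
        count[proc " " lib]++
    }
    END { for (k in count) print count[k], k }'
}

# Example (run by hand):
#   segfault_summary < /var/log/messages
```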

So. What's this about an update, and it was working before? Well sir. 
This server was working fine for the past 3 months with no changes. We 
get to an upgrade cycle, upgrade several packages, and now it's problematic.

Here's what we updated:

Packages Installed:
     kernel-devel-2.6.32-131.12.1.el6.x86_64
     kernel-2.6.32-131.12.1.el6.x86_64

Packages Updated:
     curl-7.19.7-26.el6_1.1.x86_64
     libcurl-7.19.7-26.el6_1.1.x86_64
     selinux-policy-targeted-3.7.19-93.el6_1.7.noarch
     nss-softokn-3.12.9-3.el6.x86_64
     perl-Compress-Zlib-2.020-119.el6.x86_64
     subversion-1.6.11-2.el6_1.4.x86_64
     1:dbus-libs-1.2.24-5.el6_1.x86_64
     nss-3.12.9-12.el6_1.x86_64
     4:perl-5.10.1-119.el6.x86_64
     32:bind-utils-9.7.3-2.el6_1.P3.2.x86_64
     32:bind-libs-9.7.3-2.el6_1.P3.2.x86_64
     1:perl-Pod-Simple-3.13-119.el6.x86_64
     pixman-0.18.4-1.el6_0.1.x86_64
     ca-certificates-2010.63-3.el6_1.5.noarch
     apr-1.3.9-3.el6_1.2.x86_64
     facter-1.6.0-2.el6.noarch
     perl-Compress-Raw-Zlib-2.023-119.el6.x86_64
     perl-Proc-ProcessTable-0.45-1.el6.rf.x86_64
     3:perl-version-0.77-119.el6.x86_64
     rsyslog-4.6.2-3.el6_1.2.x86_64
     perl-IO-Compress-Base-2.020-119.el6.x86_64
     elfutils-libelf-0.152-1.el6.x86_64
     ruby-1.8.7.299-7.el6_1.1.x86_64
     ruby-libs-1.8.7.299-7.el6_1.1.x86_64
     nss-sysinit-3.12.9-12.el6_1.x86_64
     nss-softokn-freebl-3.12.9-3.el6.x86_64
     openssl-1.0.0-10.el6.x86_64
     selinux-policy-3.7.19-93.el6_1.7.noarch
     nspr-4.8.7-1.el6.x86_64
     krb5-libs-1.9-9.el6_1.1.x86_64
     kernel-headers-2.6.32-131.12.1.el6.x86_64
     1:dbus-1.2.24-5.el6_1.x86_64
     nss-util-3.12.9-1.el6.x86_64
     python-libs-2.6.6-20.el6.x86_64
     perl-IO-Compress-Zlib-2.020-119.el6.x86_64
     freetype-2.3.11-6.el6_1.6.x86_64
     1:perl-Pod-Escapes-1.04-119.el6.x86_64
     kernel-firmware-2.6.32-131.12.1.el6.noarch
     sudo-1.7.4p5-5.el6.x86_64
     4:perl-libs-5.10.1-119.el6.x86_64
     2:postfix-2.6.6-2.2.el6_1.x86_64
     python-2.6.6-20.el6.x86_64
     tzdata-2011h-3.el6.noarch
     4:perl-Time-HiRes-1.9721-119.el6.x86_64
     2:libpng-1.2.46-1.el6_1.x86_64
     system-config-firewall-base-1.2.27-3.el6_1.3.noarch
     12:dhclient-4.1.1-19.P1.el6_1.1.x86_64
     nss-softokn-freebl-3.12.9-3.el6.i686
     1:perl-Module-Pluggable-3.90-119.el6.x86_64

What omsa are we running? Good question:

Packages Installed:
     srvadmin-base-6.5.0-1.1.1.el6.x86_64
     srvadmin-xmlsup-6.5.0-1.141.2.el6.x86_64
     srvadmin-omacore-6.5.0-1.143.2.el6.x86_64
     srvadmin-deng-6.5.0-1.31.1.el6.x86_64
     srvadmin-isvc-6.5.0-1.52.2.el6.x86_64
     sysfsutils-2.1.0-6.1.el6.x86_64
     srvadmin-storelib-sysfs-6.5.0-1.1.1.el6.x86_64
     srvadmin-smcommon-6.5.0-1.201.1.el6.x86_64
     ipmitool-1.8.11-99.dell.1.117.1.el6.x86_64
     srvadmin-sysfsutils-6.5.0-1.1.el6.x86_64
     srvadmin-storage-6.5.0-1.201.1.el6.x86_64
     srvadmin-omcommon-6.5.0-1.142.2.el6.x86_64
     srvadmin-hapi-6.5.0-1.33.2.el6.x86_64
     libsysfs-2.1.0-6.1.el6.x86_64
     srvadmin-omilcore-6.5.0-1.396.1.el6.noarch
     srvadmin-storageservices-6.5.0-1.1.1.el6.x86_64
     libsmbios-2.2.26-6.1.el6.x86_64
     smbios-utils-bin-2.2.26-6.1.el6.x86_64
     srvadmin-storelib-6.5.0-1.326.1.el6.x86_64

What else have you done? We uninstalled OMSA by running yum remove on
the OMSA packages listed above, then reinstalled. Same problem.

BIOS=2.3.12 01/24/2011
iDRAC6=1.54

I am aware of the 3.0.0 BIOS being out, but policy prevents me from
upgrading it for several weeks. I don't think this is a BIOS-related
bug, as things were working great 2 weeks ago. One of the packages that
we upgraded clearly doesn't get along with the current version of
OMSA. The question is, which?
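
One way to narrow it down might be to downgrade the updated packages a
batch at a time and watch whether the segfaults stop. The sketch below
only turns a yum-style package list into bare package names; the
function name pkg_names is made up, and it assumes each line has the
form [epoch:]name-version-release.arch like the lists above.

```shell
#!/bin/bash
# Hypothetical helper: reduce "[epoch:]name-version-release.arch" lines
# (read from stdin) to bare package names, one per line.
pkg_names() {
    sed -e 's/^[[:space:]]*//' \
        -e 's/^[0-9][0-9]*://' \
        -e 's/-[^-]*-[^-]*$//'
}

# Example (run by hand, assuming updated.txt holds half of the list and
# the older versions are still available in the repos):
#   yum downgrade $(pkg_names < updated.txt)
```

I have not tried this yet, and a kernel change would have to be tested
by booting the previous kernel rather than by downgrading.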


