OMSA v5.2 - dsm_sa_datamgr32d is stopped
David_Duncan at Dell.com
Mon Sep 17 14:48:56 CDT 2007
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of
Patrick_Boyd at dell.com
Sent: Monday, September 17, 2007 12:11 PM
To: frank at newspapersystems.com; linux-poweredge-Lists
Subject: RE: OMSA v5.2 - dsm_sa_datamgr32d is stopped
You've got a program with a semaphore leak. You need to restart the
box... and figure out which program is leaking semaphores.
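One way to start narrowing down the leaking program is to count the System V semaphore sets per owning user. The following is only a minimal sketch: the function name count_sem_sets_by_owner is made up for illustration, and it assumes the usual Linux `ipcs -s` column layout (key, semid, owner, perms, nsems).

```shell
#!/bin/sh
# Count System V semaphore sets per owning user.
# Reads `ipcs -s` output on stdin, so it also works on saved output.
# Assumption: data rows start with a hex key (0x...) and the owner
# is in the third column, as in the usual Linux `ipcs -s` layout.
count_sem_sets_by_owner() {
    awk '$1 ~ /^0x/ { n[$3]++ } END { for (u in n) print n[u], u }' | sort -rn
}

# Usage on a live system (left commented out here):
#   ipcs -s | count_sem_sets_by_owner
```

The user at the top of the list is the first place to look. Sets left behind by a program that has already exited can usually be removed one at a time with `ipcrm -s <semid>`, which may avoid a reboot, but check what owns them first.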
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of Frank Warnke
Sent: Monday, September 17, 2007 12:05 PM
To: linux-poweredge-Lists
Subject: Re: OMSA v5.2 - dsm_sa_datamgr32d is stopped
Some more information;
I decided to try one more time to uninstall (srvadmin-uninstall.sh) and
reinstall srvadmin (yum install srvadmin-all).
When I ran "srvadmin-services.sh start", I get something different;
# srvadmin-services.sh start
Starting mptctl:
Waiting for mptctl driver registration to complete:
[ OK ]
Starting Systems Management Device Drivers:
Starting dell_rbu: [ OK ]
Starting ipmi driver: Already started [ OK ]
Starting Systems Management Data Engine:
Starting dsm_sa_datamgr32d: [ OK ]
Starting dsm_sa_eventmgr32d: [ OK ]
Starting dsm_sa_snmp32d: [ OK ]
Starting DSM SA Shared Services: [ OK ]
Starting DSM SA Connection Service [FAILED]
# srvadmin-services.sh status
dcdbas (module) is stopped
dell_rbu (module) is stopped
dsm_sa_datamgr32d is stopped
dsm_sa_eventmgr32d is stopped
dsm_sa_snmp32d is stopped
dsm_om_shrsvc32d (pid 24981) is running
dsm_om_connsvc32d is stopped
I see these messages in /var/log/messages;
Sep 17 12:35:52 s-layout1 Server Administrator (Shared Library): Data Engine EventID: 0 A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded
Sep 17 12:35:52 s-layout1 last message repeated 5 times
Sep 17 12:35:55 s-layout1 snmpd[4552]: [smux_accept] accepted fd 11 from 127.0.0.1:47240
Sep 17 12:35:55 s-layout1 snmpd[4552]: accepted smux peer: oid SNMPv2-SMI::enterprises.674.10892.1, descr Systems Management SNMP MIB Plug-in Manager
Sep 17 12:35:56 s-layout1 Server Administrator (Shared Library): Data Engine EventID: 0 A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded
Sep 17 12:35:56 s-layout1 last message repeated 5 times
Sep 17 12:35:57 s-layout1 kernel: dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-2)
Sep 17 12:35:57 s-layout1 instsvcdrv: dcdbas device driver loaded
Sep 17 12:35:58 s-layout1 Server Administrator (Shared Library): Data Engine EventID: 0 A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded
Sep 17 12:35:58 s-layout1 last message repeated 2 times
Sep 17 12:35:59 s-layout1 snmpd[4552]: peer disconnected: SNMPv2-SMI::enterprises.674.10892.1
Sep 17 12:36:00 s-layout1 dataeng: dsm_sa_snmp32d shutdown succeeded
Sep 17 12:36:01 s-layout1 dataeng: dsm_sa_eventmgr32d shutdown succeeded
Sep 17 12:36:02 s-layout1 Server Administrator (Shared Library): Data Engine EventID: 0 A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded
Sep 17 12:36:02 s-layout1 instsvcdrv: dcdbas device driver unloaded
Sep 17 12:36:02 s-layout1 instsvcdrv: dell_rbu device driver unloaded
Sep 17 12:36:08 s-layout1 Server Administrator (Shared Library): Data Engine EventID: 0 A semaphore set has to be created but the system limit for the maximum number of semaphore sets has been exceeded
The kernel is 2.6.18-8.1.10.el5 #1 SMP Thu Aug 30 20:43:28 EDT 2007
x86_64 x86_64 x86_64 GNU/Linux
Thanks,
Frank
On Mon, 2007-09-17 at 10:57 -0400, Frank Warnke wrote:
> I have installed OMSA on two other PE1900's running RHEL 5 Server back
> in July 2007 without a problem. Last Friday I tried installing OMSA on
> a third PE1900 and ran into a problem.
>
> This one is also running RHEL 5 Server, but with all the OS updates
> released since July 2007.
>
> OMSA was installed via these steps;
>
> 1) wget -O- -q http://linux.dell.com/repo/hardware/bootstrap.cgi | bash
>
> 2) yum install srvadmin-all
>
> 3) srvadmin-services.sh start
>
>
> Starting srvadmin looks OK;
>
> # srvadmin-services.sh start
> Starting mptctl:
> Waiting for mptctl driver registration to complete:
> [ OK ]
>
> Starting Systems Management Device Drivers:
> Starting dell_rbu: [ OK ]
> Starting ipmi driver: Already started [ OK ]
> Starting Systems Management Data Engine:
> Starting dsm_sa_datamgr32d: [ OK ]
> Starting dsm_sa_eventmgr32d: [ OK ]
> Starting dsm_sa_snmp32d: [ OK ]
> Starting DSM SA Shared Services: [ OK ]
>
> Starting DSM SA Connection Service: [ OK ]
>
>
> However, logging in to OMSA via a web browser, does not show system
> information like Dell server model or the PERC 5/i with its drives.
>
> Running srvadmin status shows datamgr32d as stopped;
>
> # srvadmin-services.sh status
> dell_rbu (module) is running
> ipmi driver is running
> dsm_sa_datamgr32d is stopped
> dsm_sa_eventmgr32d (pid 8142) is running
> dsm_sa_snmp32d (pid 8152) is running
> dsm_om_shrsvc32d (pid 8182) is running
> dsm_om_connsvc32d (pid 8256 8255) is running
>
>
> Here is what a srvadmin restart looks like;
>
> # srvadmin-services.sh restart
> Shutting down DSM SA Shared Services: [ OK ]
>
>
> Shutting down DSM SA Connection Service: [ OK ]
>
>
> Stopping Systems Management Data Engine:
> Stopping dsm_sa_snmp32d: [ OK ]
> Stopping dsm_sa_eventmgr32d: [ OK ]
> Stopping dsm_sa_datamgr32d: Not started [FAILED]
> Stopping Systems Management Device Drivers:
> Stopping dell_rbu: [ OK ]
> Starting mptctl:
> Waiting for mptctl driver registration to complete:
> [ OK ]
>
> Starting Systems Management Device Drivers:
> Starting dell_rbu: [ OK ]
> Starting ipmi driver: Already started [ OK ]
> Starting Systems Management Data Engine:
> Starting dsm_sa_datamgr32d: [ OK ]
> Starting dsm_sa_eventmgr32d: [ OK ]
> Starting dsm_sa_snmp32d: [ OK ]
> Starting DSM SA Shared Services: [ OK ]
>
> Starting DSM SA Connection Service: [ OK ]
>
> The srvadmin status is still the same;
>
> # srvadmin-services.sh status
> dell_rbu (module) is running
> ipmi driver is running
> dsm_sa_datamgr32d is stopped
> dsm_sa_eventmgr32d (pid 9700) is running
> dsm_sa_snmp32d (pid 9710) is running
> dsm_om_shrsvc32d (pid 9740) is running
> dsm_om_connsvc32d (pid 9814 9813) is running
>
>
> I have Googled and checked system logs as well as the archives but so
> far I have not been able to solve this. Any ideas on how to proceed
> to troubleshoot this would be much appreciated.
>
> Thanks,
> Frank
Frank,
Is this PowerEdge running a Java application server, like Tomcat or
Resin? Have you modified the number of semaphores or tuned any related
kernel parameters?
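For reference, the current semaphore limits can be read from /proc/sys/kernel/sem, which holds four numbers in the order SEMMSL, SEMMNS, SEMOPM, SEMMNI. The last one, SEMMNI, is the maximum number of semaphore sets, which is the limit the Data Engine messages complain about. A small sketch that labels the fields (the function name show_sem_limits is made up for illustration):

```shell
#!/bin/sh
# Label the four kernel.sem fields.
# Input: the contents of /proc/sys/kernel/sem on stdin
# (four whitespace-separated numbers: SEMMSL SEMMNS SEMOPM SEMMNI).
show_sem_limits() {
    awk '{ printf "SEMMSL=%s SEMMNS=%s SEMOPM=%s SEMMNI=%s\n", $1, $2, $3, $4 }'
}

# Usage on a live system (left commented out here):
#   show_sem_limits < /proc/sys/kernel/sem
```

Comparing SEMMNI with the number of sets listed by `ipcs -s` shows how close the system is to the limit. If the values differ from the distribution defaults, /etc/sysctl.conf is the usual place to look for a kernel.sem entry.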
Check out the Linux Community Web: http://linux.dell.com/
RedHat http://www.dell.com/redhat
SuSE http://www.dell.com/suse
Novell http://www.dell.com/novell
Oracle http://www.dell.com/oracle
VMWare http://www.dell.com/vmware