Dell OMSA / Debian Squeeze problems

Rob Donovan hikerman2005-dellcom at yahoo.com
Tue Sep 13 13:50:47 CDT 2011


I just installed Dell OMSA from the 6.5 repository at
http://linux.dell.com/repo/community/deb/OMSA_6.5/ under Debian Squeeze
on a Dell R815 with four 6128 processors.  I've run into a few issues:

1) Running "sudo update-rc.d dsm_om_connsvc defaults" does not enable
the service in the usual runlevel.  The LSB header in
/etc/init.d/dsm_om_connsvc specifies:

# Default-Start: 3 4 5
# Default-Stop: 1 2

This conflicts with the runlevels implied by the "defaults" option to
update-rc.d (stop in 0 1 6, start in 2 3 4 5).  On a new installation
of Debian Squeeze, which uses dependency-based booting, the "defaults"
option is ignored and the header is followed.  The upshot is that
dsm_om_connsvc doesn't start on reboot into the usual runlevel 2.  The
solution seems to be to edit the header in dsm_om_connsvc to read

# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6

then run the update-rc.d command again.
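
The header edit above can be scripted.  This is only a sketch: the
helper name fix_lsb_runlevels is made up for illustration, it assumes
GNU sed (for -i), and it assumes the header lines start with
"# Default-Start:" and "# Default-Stop:" as quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of OMSA): rewrite the LSB runlevel
# headers of an init script in place.
fix_lsb_runlevels() {
    script="$1"
    sed -i \
        -e 's/^# Default-Start:.*/# Default-Start: 2 3 4 5/' \
        -e 's/^# Default-Stop:.*/# Default-Stop: 0 1 6/' \
        "$script"
}

# Usage (as root), followed by re-registering the service:
#   fix_lsb_runlevels /etc/init.d/dsm_om_connsvc
#   update-rc.d -f dsm_om_connsvc remove
#   update-rc.d dsm_om_connsvc defaults
```

Keeping a backup copy of the original init script before editing it
is probably wise, since a package upgrade may replace the file.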

This makes it start but introduces a problem at stop time: the daemon
binary doesn't remove its pid file, and as a result the stop() function
in /etc/init.d/dsm_om_connsvc spends 30s in a pointless wait loop every
time "/etc/init.d/dsm_om_connsvc stop" is called, including on every
shutdown.  This is bad news if you're trying to do a quick shutdown on
battery backup.  My solution was to rewrite the stop() function in
/etc/init.d/dsm_om_connsvc to read as follows.  Stopping now takes
1-2s.  I wait 1s after issuing killproc because without that wait I
found the daemon was still running.

#########################################################################
## stop() modified to use 1s steps, always wait 1s after killproc,
## to keep waiting only if STATUSVAL is 0, and to remove the pid file.
#########################################################################
stop() {
    # Check if the daemon is running
    STATUS ${PROGRAM_DAEMON} >/dev/null
    if [ $? == 3 ];
    then
        echo -n "${PROGRAM_NAME} is already stopped"
        echo
        return 2
    fi
    echo -n $"Shutting down ${PROGRAM_NAME}: "
    killproc ${PROGRAM_DAEMON}
    sleep 1

    STATUS ${PROGRAM_DAEMON} >/dev/null
    STATUSVAL=$?

    #if the process is still running wait for it to close down
    COUNTER=0
    while [ ${STATUSVAL} == 0 ] && [ ${COUNTER} -lt 10 ]
    do
        let COUNTER=${COUNTER}+1
        #The service is NOT completely stopped yet.
        #Wait 1 second and then check the status again
        sleep 1
        STATUS ${PROGRAM_DAEMON} >/dev/null
        STATUSVAL=$?
    done

    #if after 10 seconds it is still not stopped
    #kill the process again
    if [ ${STATUSVAL} == 0 ]
    then
        killproc ${PROGRAM_DAEMON} >/dev/null
        sleep 1
    fi

    STATUS ${PROGRAM_DAEMON} >/dev/null
    STATUSVAL=$?

    RETVAL=0

    #the daemon doesn't remove its pid file when it stops
    #so if it's dead and the pid file is there, remove it
    #in this case, the lock file will still exist, so reset STATUSVAL to 2

    PID_FILE="/var/run/${DAEMON}.pid"
    if [ $STATUSVAL -eq 1 ] && [ -f "${PID_FILE}" ];
    then
        # test rm's exit status; its (empty) output says nothing about failure
        if rm -f "${PID_FILE}";
        then
            STATUSVAL=2
        else
            # failed to clear pid file
            RETVAL=1
        fi
    fi

    if [ ${STATUSVAL} == 1 ] || [ ${STATUSVAL} == 0 ]
    then
        RETVAL=1
    fi

    # remove the lockfile
    if [ ${STATUSVAL} == 2 ];
    then
        # test rm's exit status rather than its output
        if ! rm -f "${PROGRAM_LOCK_FILE}";
        then
            # failed to clear lock file
            RETVAL=1
        fi
    fi

    # check for complete success
    if [ $RETVAL -eq 0  ];
    then
        # log the success
        if [ -f /lib/lsb/init-functions ];
        then
            LOG_SUCCESS ""
            echo
        else
            echo -en \\033[45G
            echo
        fi
    else
        # log the error
        if [ -f /lib/lsb/init-functions ];
        then
            LOG_FAILURE
            echo
        else
            echo -en \\033[45G
            echo
        fi
    fi

    echo
    [ $RETVAL -eq 0 ] && rm -f ${PROGRAM_LOCK_FILE}
    return $RETVAL
}
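
The bounded wait in stop() above can also be pulled out into a small
helper.  This is a minimal sketch, not what the OMSA script does: the
name wait_until_stopped is invented here, and it assumes the status
command returns 0 while the daemon is still running, as STATUS does in
the function above.

```shell
#!/bin/sh
# Hypothetical helper: poll a status command once per second until it
# stops returning 0 (0 = "still running"), or give up after max_tries
# polls.  Returns 0 if the command reported "stopped", 1 on timeout.
wait_until_stopped() {
    max_tries="$1"; shift
    tries=0
    while "$@" >/dev/null 2>&1; do
        [ "$tries" -ge "$max_tries" ] && return 1
        tries=$((tries + 1))
        sleep 1
    done
    return 0
}

# Usage inside an init script, after killproc:
#   wait_until_stopped 10 STATUS ${PROGRAM_DAEMON} || killproc ${PROGRAM_DAEMON}
```

Because the helper returns as soon as the status command reports
anything other than "running", a stale pid file (status 1) no longer
causes the full wait.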


2) ipmievd was not starting at boot because /dev/ipmi0 was not available
when needed (though it was present after boot).  I solved the problem by
adding the lines

ipmi_msghandler
ipmi_devintf
ipmi_si

to /etc/modules.  This was part of the how-tos for installing SARA's
port of Dell OMSA under Lenny, but seems to be absent from the current
installer.
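
For anyone scripting this, a sketch along the following lines should
add the three modules without duplicating entries on a second run.
The helper name add_module_once is invented for illustration; the
module names and /etc/modules are the ones given above.

```shell
#!/bin/sh
# Hypothetical helper: append a module name to a modules file unless an
# identical line is already present.
add_module_once() {
    modules_file="$1"
    module="$2"
    grep -qxF "$module" "$modules_file" 2>/dev/null \
        || echo "$module" >> "$modules_file"
}

# Usage (as root): register the modules for the next boot and load
# them now, then check that the device node has appeared.
#   for m in ipmi_msghandler ipmi_devintf ipmi_si; do
#       add_module_once /etc/modules "$m"
#       modprobe "$m"
#   done
#   ls -l /dev/ipmi0
```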

I should probably add that I do see an error in syslog (about 5 minutes
after boot) that reads "sisa instsvcdrv: Failed to unload module
ipmi_si (while preparing to disable kipmi0 thread) : ipmi driver may be
in use".  I don't know whether this is caused by the above fix, nor
whether it's important.  It looks to be related to problem 4 in
http://en.community.dell.com/support-forums/servers/f/177/t/19350169.aspx
but I'm not seeing any delay when I access "System | Alert Management |
Platform Events" through the web interface or run "omreport system
platformevents".

Rob Donovan



