PE Hardware monitoring best practices for Linux

Troels Arvin troels at arvin.dk
Tue Jun 2 13:23:19 CDT 2009


SA wrote:
> Can anyone give me advice, hopefully based on experience, for best
> proactive hardware monitoring practices for PowerEdge servers under
> Linux?

"Best practice" is a strange concept.

But I find it easy to install OpenManage and then monitor at least the
following SNMP object:

  MIB-Dell-10892::systemStateGlobalSystemStatus.1
also known as:
  .1.3.6.1.4.1.674.10892.1.200.10.1.2.1

We use Nagios' check_snmp to watch it, using the following command:

define command{
    command_name    check_snmp_dell_systemStateGlobalSystemStatus
    command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -C $ARG1$ -o .1.3.6.1.4.1.674.10892.1.200.10.1.2.1 -w 3 -c 6:5
}
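
For readers unfamiliar with the values behind this OID, here is a minimal
Python sketch of the mapping the thresholds appear to aim for: 3 is healthy,
4 is a warning, 5 and 6 are critical. The status names are the Dell status
enumeration as commonly documented for the 10892 MIB (an assumption; verify
against your own MIB file). The function name nagios_state is hypothetical,
and treating values 1 and 2 as UNKNOWN is a choice made for this sketch; it
may differ from how check_snmp evaluates the thresholds in the command above.

```python
# Dell status enumeration (assumed from the 10892 MIB; verify locally).
DELL_STATUS = {
    1: "other",
    2: "unknown",
    3: "ok",
    4: "nonCritical",
    5: "critical",
    6: "nonRecoverable",
}

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_state(value):
    """Map a systemStateGlobalSystemStatus value to a Nagios exit code.

    Hypothetical helper for illustration only.
    """
    if value in (5, 6):
        return CRITICAL
    if value == 4:
        return WARNING
    if value == 3:
        return OK
    # 1 (other), 2 (unknown), or anything unexpected: sketch-specific choice.
    return UNKNOWN


if __name__ == "__main__":
    for value, name in sorted(DELL_STATUS.items()):
        print(value, name, nagios_state(value))
```

A helper like this could be the core of a custom check if the check_snmp
range syntax on a given plugin version turns out to behave differently.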

Then, if we get an alarm, we log into the OpenManage web interface
at https://servername:1311 and investigate further.

Sometimes we get a warning because a battery is performing a self-test.
Other than that, we don't get false alarms from monitoring
systemStateGlobalSystemStatus.

-- 
Regards,
Troels Arvin <troels at arvin.dk>
http://troels.arvin.dk/



More information about the Linux-PowerEdge mailing list