Detecting lights on Dell

andrew2 at one.net andrew2 at one.net
Wed Dec 12 13:54:03 CST 2007


Terry Gliedt wrote:
> Davide Ferrari wrote:
>> On Tuesday 11 December 2007 15:18:13 Terry Gliedt wrote:
>>> We've recently moved a number of Dell PowerEdge servers (1425,
>>> 1850, 1950, 2950) to a remote location. I'd like to run a cron job
>>> to detect that an orange light is on, meaning 'come pay attention
>>> to me'. I have one such machine for 'testing'.
>> 
>> I think that you're taking the wrong approach. I mean, external
>> lights are meant to inform people physically there, and they respond
>> to hardware status changes. But there could be LOTS of meanings for
>> a blinking yellow LED. So the best way to operate is to check the
>> hardware remotely with a specific tool like Dell OpenManage, which
>> gives you finer-grained control over your hardware status.
> 
> I understand about the lights. I was hoping to use IPMI to simply find
> out that something needed attention (the equivalent of an orange
> light) so I'd know to run diagnostics. At the moment I have an SC1425
> with a memory parity error and I was looking for some ipmitool
> subcommand which returns something I could scan for when "something
> was wrong".

I do something along those lines, although my alerts also specify exactly
what the problem is.  Basically, I use Nagios to run the following check via
NRPE:

Check_ipmi.pl:

#!/usr/bin/perl

# Nagios plugin for IPMI sensors status checking.
#
# Especially useful on Dell PowerEdge servers, and others that
# implement the Intelligent Platform Management Interface (IPMI).
#
# (C) Chris Wilson <check_ipmi at qwirx.com>, 2005-06-04
# Released under the GNU General Public License (GPL)

use warnings;
use strict;

open(OUTPUT, "<", "/home/nagios/ipmioutput")
        or die "Cannot open /home/nagios/ipmioutput: $!";

my %found;
my %bad;

sub trim ($) {
        my ($v) = @_;
        $v =~ s/^ +//;
        $v =~ s/ +$//;
        return $v;
}

while (my $line = <OUTPUT>)
{
        chomp $line;
        unless ($line =~ m'^(.*) \| (.*) \| (\w+)$')
        {
                die "Bad format in ipmitool output: $line";
        }

        my $name  = trim $1;
        my $value = trim $2;
        my $state = trim $3;
        $name =~ tr| |_|;

        my $counter = 1;
        my $uname = "$name";
        while ($found{$uname}) {
                $uname = $name . $counter++;
        }

        next if $state eq "ns";

        if ($state ne "ok") {
                $bad{$uname} = $state;
        }

        $found{$uname} = $value;
}


if (keys %bad) {
        print "IPMI critical: ";
        my @bad;
        foreach my $name (sort keys %bad) {
                push @bad, "$name is $bad{$name}";
        }
        print join(", ", @bad) . " ";
} else {
        print "IPMI ok ";
}

my @out;

foreach my $name (sort keys %found) {
        next unless $name =~ m|Fan| or $name =~ m|Temp|;
        push @out, "$name = $found{$name}";
}

print "(" . join(", ", @out) . ")\n";

close(OUTPUT);

if (%bad) { exit 2 } else { exit 0 }






NOTE:  I didn't write that script, but it works like a charm.  Also note that
it reads from a file called /home/nagios/ipmioutput.  I generate that file
out of cron as follows:

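#!/bin/sh
# Write to a temporary file first and rename it only if ipmitool
# succeeded, so the check never reads a partial or empty file.
/usr/local/bin/ipmitool sdr > /home/nagios/ipmitemp &&
    mv /home/nagios/ipmitemp /home/nagios/ipmioutput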
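
One thing to watch with this split: the check only ever reads the cached
file, so it cannot tell when the cron job has stopped refreshing it.  A rough
sketch of a freshness guard follows.  The function name is_fresh and the
10-minute threshold are made up for illustration, and it assumes a find that
supports -mmin (GNU and BSD find do, but it is not in POSIX):

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): succeed only if
# FILE exists and was modified within the last MAX_AGE minutes.
is_fresh() {
    file=$1
    max_age=$2
    [ -f "$file" ] || return 1
    # find prints the path only when the file is OLDER than max_age minutes,
    # so empty output means the file is fresh enough.
    [ -z "$(find "$file" -mmin +"$max_age" -print)" ]
}

# Possible use at the top of a wrapper around the check (assumed wiring):
#   is_fresh /home/nagios/ipmioutput 10 || {
#       echo "IPMI unknown: ipmioutput is missing or stale"
#       exit 3    # 3 is the Nagios plugin code for UNKNOWN
#   }
```

The threshold should be comfortably larger than the cron interval, otherwise
a single slow ipmitool run would trip the guard.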


You could have the check_ipmi script invoke ipmitool directly, but I've
found ipmitool to be too slow for that, so I run it out of cron and have
Nagios just check the output.  All of this presupposes Nagios, but it
shouldn't be too hard to fit into whatever monitoring system you might be
using.
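
For the plain cron job asked about originally, the same idea can be
sketched without Nagios at all.  This is only an illustration, not a tested
replacement for the Perl check: the function name ipmi_problems is made up,
and it assumes the same "name | value | state" lines that the Perl script
parses, treating any state other than "ok" or "ns" as a problem:

```shell
#!/bin/sh
# Hypothetical helper: read "name | value | state" lines on stdin and print
# every sensor whose state is neither "ok" nor "ns" (the same rule the Perl
# check applies).  Returns 1 if any problem was printed, 0 otherwise.
ipmi_problems() {
    awk -F'|' '
        NF >= 3 {
            name = $1
            state = $NF
            # Strip the padding spaces around each field
            gsub(/^ +| +$/, "", name)
            gsub(/^ +| +$/, "", state)
            if (state != "ok" && state != "ns") {
                print name " is " state
                bad = 1
            }
        }
        END { exit bad + 0 }
    '
}

# Possible use from a cron script (assumed wiring):
#   /usr/local/bin/ipmitool sdr | ipmi_problems
```

Since this prints nothing when every sensor is ok, and cron normally mails
whatever a job prints, the cron job by itself can serve as the "come pay
attention to me" signal.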

Andrew




More information about the Linux-PowerEdge mailing list