Need PERC RAID status on 64-bit RHES4 Linux

Jeff Potter jpotter-dell at codepuppy.com
Wed Jun 27 21:40:38 CDT 2007


> I'm running RHES4 64-bit on Dell 1850's and Dell 2850's with PERC
> controllers. The Dell RPMs I found to install MegaPR to query the RAID
> status are only 32-bit aware. It may be possible to install these, but
> I have not figured out the magic. I just want to have Nagios tell me
> when a drive in the array fails! Currently, the only way I can get
> array status is by booting into the DRAC and then going into the RAID
> setup to view the status of the array. It's absurd that I can't do this
> from the command-line. What am I doing wrong? I can give more clues if
> necessary.

Look at using OMSA (Dell OpenManage Server Administrator) for this instead.

Here's what we do on our Dell systems in general. I know this works
on RHEL5/64-bit on the 1850's, and I know it works on RHEL4/32-bit on
the 1850's. You don't necessarily need the firmware steps, but OMSA
5.2.0 will complain about too-old firmware on the PERC controller.

wget -q -O - http://linux.dell.com/repo/software/bootstrap.cgi | bash
wget -q -O - http://repo.fwupdate.com/repo/firmware/bootstrap.cgi | bash
yum -y install firmware-addon-dell
yum -y install $(bootstrap_firmware -b)
# (may need to reboot for next command to work)
update_firmware --yes
yum -y install srvadmin-base srvadmin-storageservices
chkconfig --level 345 ipmi on #Rhel4 only; not needed in Rhel5
service ipmi start #Rhel4 only; not needed in Rhel5
srvadmin-services.sh  start
echo "nagios  ALL = NOPASSWD: /usr/bin/omreport" >> /etc/sudoers
perl -pi -e "s:Defaults\s+requiretty:#Defaults    requiretty:gs" /etc/sudoers
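
One caveat with the echo line above: it appends a duplicate rule each time the steps are re-run. A minimal sketch of an idempotent variant follows. The name add_sudoers_rule is a hypothetical helper, not part of OMSA or sudo, and it takes the file path as an argument so it can be tried on a scratch copy before touching /etc/sudoers.

```shell
# add_sudoers_rule FILE RULE
# Hypothetical helper (not shipped with sudo or OMSA): append RULE to FILE
# only if an identical line is not already present.
add_sudoers_rule() {
    file=$1
    rule=$2
    # -q: quiet, -x: match the whole line, -F: treat RULE as a fixed string
    grep -qxF -- "$rule" "$file" 2>/dev/null || printf '%s\n' "$rule" >> "$file"
}

# Example call (commented out so that sourcing this sketch changes nothing):
# add_sudoers_rule /etc/sudoers "nagios  ALL = NOPASSWD: /usr/bin/omreport"
```

Running visudo -c afterwards checks the resulting file for syntax errors.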


To check RAID status in Nagios, use Gunther Schlegel's check_perc
plugin. I'm attaching the copy we use, which has minor modifications:
this version reports a 'Resynching' (sic) disk state as a warning
rather than a critical error, and it calls omreport through sudo so
that it works as the 'nagios' user on systems running IPMI 2.0
(e.g. PE 1950's).
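
For context on how the plugin arrives at its exit code: it starts from UNKNOWN and keeps the most severe status seen across controllers, disks and batteries, using the standard Nagios codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A rough shell sketch of that merge rule is below. The name merge_status is illustrative; it mirrors the script's addresult sub and is not something shipped with Nagios.

```shell
# Standard Nagios plugin exit codes.
OK=0; WARNING=1; CRITICAL=2; UNKNOWN=3

# merge_status OLD NEW
# Print the status to keep: NEW replaces OLD when OLD is still UNKNOWN
# (nothing checked yet) or when NEW is more severe than OLD.
merge_status() {
    old=$1
    new=$2
    if [ "$old" -eq "$UNKNOWN" ] || [ "$new" -gt "$old" ]; then
        echo "$new"
    else
        echo "$old"
    fi
}

# Illustrative run: a healthy controller, one rebuilding disk, a healthy battery.
result=$UNKNOWN
for s in "$OK" "$WARNING" "$OK"; do
    result=$(merge_status "$result" "$s")
done
echo "$result"   # prints 1 (WARNING)
```

The upshot is that a single degraded component is enough to raise the overall result, and a later healthy component never lowers it again.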



best,

Jeff




#!/usr/bin/perl -wT
#
# Nagios Plugin to check storage devices ( e.g. raid controller )
# in Dell server systems using "omreport" from Dell OpenManage.
#
# Requires Dell OpenManage 4.3 or later.
#
# (C) 2005 Riege Software International GmbH
# Mollsfeld 10
# 40670 Meerbusch
# Germany
#
# Published under the General Public License, Version 2.
#
# Author: Gunther Schlegel <schlegel at riege.com>
# Atof.net modified: added 'Resynching' into allowed OK list
#
# V0.0.1  20050801 gs   new script
# V0.1.0  20050818 gs   initial release, checks controllers, disks,
#                       virtual disks, batteries.
# V0.2.0  20050819 gs   detect regenerating virtual disks and report
#                       warning instead of critical.
# V0.3.0  20050822 gs   detect noncritical degraded controllers and
#                       report warning instead of critical
# V0.4.0  20051102 gs   detect noncritical degraded hard disk and report
#                       warning instead of critical
# V0.5.0  20051103 gs   detect noncritical rebuilding-message and report
#                       warning instead of critical
# V0.5.1  20060516 gs   add /usr/lib/nagios/plugins to default module
#                       search path
# V0.5.2  20060609 gs   fix option inconsistency
# V0.6.0  20060609 gs   add OpenManage 5.0.0 compatibility
# V0.6.1  20060712 gs   fix: detect OM5 Non-Critical messages correctly
# V0.7.0  20061019 gs   add OpenManage 5.1.0 / PERC 5 / SAS compatibility
# V0.7.1  20061024 gs   fix: battery matching
#                       enh: use different omreport output format -- it is way faster


# Modules
use strict;
use Getopt::Long;
use File::Basename;
use lib qw(/usr/local/nagios/libexec /usr/lib/nagios/plugins);
use utils qw (%ERRORS);

# untaint Environment
$ENV{'BASH_ENV'}='';
$ENV{'ENV'}='';
$ENV{'PATH'}='/bin:/usr/bin';

# variables
my ($debug,$help)=(0,0);
my $om='/usr/bin/omreport';
my $omcmd="/usr/bin/sudo $om storage";
my $omfmt='-fmt ssv';
my $result=$ERRORS{'UNKNOWN'};

my @controllers;
my %controller;
my $I;
my $J;
my @disks;
my %disk;
my @messages;
my @batteries;
my %battery;
my $omversion;
my $dummy;
my @dummy;
my $match;

# Process command line
GetOptions ('VERBOSE+' => \$debug,'HELP|?' => \$help);
$help and usage();

# Main
if ( -x $om ) {
         @dummy=`$om about -fmt cdv`;
         chomp @dummy;

         ($omversion)=($dummy[2] =~ /.*?;(\d+\.\d+\.\d+)?;/);
         print "OpenManage version: $1\n" if $debug;


         @controllers=`$omcmd controller $omfmt`;
         print "Got $#controllers controller lines\n" if $debug > 1;
         print @controllers if $debug > 2;
         chomp @controllers;

         foreach $I (@controllers) {
                 undef (%controller);
                 if (($controller{'id'},$controller{'status'},$controller{'type'},$controller{'state'}) = ($I =~ /^(\d+?);(.*?);(.*?);.*?;(.*?);/)) {
                         print "Checking Controller No. $controller{'id'}: $controller{'type'}\n" if $debug;

                         if ( $controller{'status'} ne 'Ok' ) {
                                 if ( $controller{'status'} eq 'Noncritical' or $controller{'status'} eq 'Non-Critical' ) {
                                         $result=addresult($result,$ERRORS{'WARNING'});
                                 } else {
                                         $result=addresult($result,$ERRORS{'CRITICAL'});
                                 }
                                 push @messages, "Ctrl $controller{'id'} ($controller{'type'} is $controller{'status'}($controller{'state'}))";
                         } else {
                                 $result=addresult($result,$ERRORS{'OK'});
                         }

                         undef @disks;
                         @disks=`$omcmd adisk controller=$controller{'id'} $omfmt`;
                         $dummy=$#disks;
                         print "Got $dummy physical disk lines\n" if $debug > 1;
                         # no need to check virtual disks if there are no harddisks
                         push (@disks,`$omcmd vdisk controller=$controller{'id'} $omfmt`) if ( grep /^(\d+:){1,2}\d+;.*/, @disks );
                         print 'Got '.($#disks-$dummy)." logical disk lines\n" if $debug > 1;
                         print @disks if $debug > 2;
                         chomp @disks;

                         foreach $J (@disks) {
                                 undef (%disk);

                                 $match=0;
                                 if ( $omversion =~ /^4\./ ) {
                                         (($disk{'id'},$disk{'status'},$disk{'state'},$disk{'progress'}) = ($J =~ /^(\d+:\d+|\d+);(\w+?);.+?;(\w+?);(.*?);/)) && ($match=1);
                                 } elsif ( $omversion =~ /^5\./ ) {
                                         (($disk{'id'},$disk{'status'},$disk{'state'},$disk{'predicted'},$disk{'progress'}) = ($J =~ /^(\d+:\d+:\d|\d+:\d+|\d+);(\w+?);.+?;(\w+?);(.*?);(.*?);/)) && ($match=1);
                                 }

                                 if ($match) {
                                         if ( $disk{'id'} =~ /:/ ) {
                                                 $disk{'type'}='physical';
                                         } else {
                                                 $disk{'type'}='virtual';
                                         }
                                         print "Status of $disk{'type'} disk $disk{'id'}: $disk{'status'}, state: $disk{'state'}\n" if $debug;

                                         if ( ($disk{'status'} ne 'Ok') or ($disk{'state'} ne 'Online' and $disk{'state'} ne 'Ready') ) {
                                                 if ( $disk{'status'} =~ /Noncritical|Ok/ and $disk{'state'} =~ /Regenerating|Rebuilding|Resynching/ ) {
                                                         push @messages,"Ctrl $controller{'id'} Disk $disk{'id'} is $disk{'status'}($disk{'state'}, $disk{'progress'})";
                                                         $result=addresult($result,$ERRORS{'WARNING'});
                                                 } elsif ( $disk{'status'} eq 'Noncritical' and $disk{'state'} eq 'Degraded' ) {
                                                         push @messages,"Ctrl $controller{'id'} Disk $disk{'id'} is $disk{'status'}($disk{'state'})";
                                                         $result=addresult($result,$ERRORS{'WARNING'});
                                                 } else {
                                                         push @messages,"Ctrl $controller{'id'} Disk $disk{'id'} is $disk{'status'}($disk{'state'})";
                                                         $result=addresult($result,$ERRORS{'CRITICAL'});
                                                 }
                                         }

                                         if ($disk{'predicted'}) { # only available with OM version 5 or newer
                                                 if ($disk{'predicted'} eq 'Yes') {
                                                         push @messages,"Predicted fail on Ctrl $controller{'id'} Disk $disk{'id'}";
                                                         $result=addresult($result,$ERRORS{'WARNING'});
                                                 }
                                         }
                                 }

                         }

                         print "Checking Batteries:\n" if $debug;
                         undef @batteries;
                         @batteries=`$omcmd battery controller=$controller{'id'} $omfmt`;
                         print "Got $#batteries battery lines\n" if $debug > 1;
                         print @batteries if $debug > 2;
                         chomp @batteries;

                         foreach $J (@batteries) {
                                 undef (%battery);
                                 if (($battery{'id'},$battery{'status'},$battery{'name'},$battery{'state'},$battery{'chargecount'},$battery{'chargemax'}) = ($J =~ /^(\d+?);(\w+?);(.+?);(\w+?);(.*?);(.*?)/)) {
                                         print "Status of battery $battery{'id'}: $battery{'status'}, state: $battery{'state'}\n" if $debug;

                                         if ( $battery{'status'} ne 'Ok' or $battery{'state'} ne 'Ready' ) {
                                                 push @messages,"Ctrl $controller{'id'} Batt $battery{'id'} is $battery{'status'}($battery{'state'})";
                                                 $result=addresult($result,$ERRORS{'CRITICAL'});
                                         }
                                         if ( $battery{'chargecount'} =~ /^\d+$/ and $battery{'chargemax'} =~ /^\d+$/ ) {
                                                 if ( $battery{'chargecount'} >= $battery{'chargemax'} ) {
                                                         push @messages,"Ctrl $controller{'id'} Batt $battery{'id'} charge max reached";
                                                         $result=addresult($result,$ERRORS{'WARNING'});
                                                 }
                                         }
                                 }
                         }
                 }
         }
} else {
         push @messages,"Error: $om not found\n\n";
         usage();
}


# Script end
print "\nResult: $result\n" if $debug;
writemessages(@messages);
exit $result;

# Subs

sub addresult {
         my $oldresult=shift @_;
         my $newresult=shift @_;

         # status codes are numeric, so compare them numerically
         if ( $oldresult == $ERRORS{'UNKNOWN'} or $newresult > $oldresult ) {
                 return $newresult;
         }

         return $oldresult;
}

sub writemessages {
         my @messages = @_;

         unshift @messages, '[' if $#messages >= 0;

         foreach (keys %ERRORS) {
                 unshift @messages, "$_" if $ERRORS{$_} == $result;
         }

         push @messages, ']' if $#messages > 0;

         print 'STORAGE: ',substr ((join " ", @messages),0,71),"\n";
}

sub usage {
         writemessages(@messages);
         print basename($0)." [--verbose] [--help]\n";
         exit $ERRORS{'UNKNOWN'};
}

sub exitmessage {
         my $result=shift @_;

         print join(' ', @messages), "\n";
         exit $result;
}

# vim: autoindent number ts=4



More information about the Linux-PowerEdge mailing list