[Linux-PowerEdge] OMSA 7.1.0 SNMP bug with (non-)certified disks

Russell Kackley rkackley at naoj.org
Thu Nov 8 15:45:38 CST 2012


Hi Trond,

Sorry it took so long to reply to you, but here are some results from my
tests of OMSA 7.1.0 on some of our systems. We have several Dell PowerEdge
servers, but we have updated only two of them to run OMSA
7.1.0. Those two servers are running Ubuntu 12.04. We are seeing problems
similar to what you have reported. I was hoping that the recent update of
OMSA 7.1.0 would have corrected the problem, but I installed the OMSA
package updates yesterday and it looks like the problem is still with us.

One of the servers is a PE R815 that has all non-certified disks. Some are
connected via a PERC H200 controller and some are connected via a PERC H800
controller. "omreport storage pdisk controller=<n>" reports that all of the
disks are "Certified: No" and "Status: Non-Critical", which is correct.
However, using SNMP to access the arrayDiskDellCertified table tells us
that the disks are Certified (1), which is incorrect:

$ snmpwalk -v2c -c public localhost 1.3.6.1.4.1.674.10893.1.20.130.4.1.36
SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.1 = INTEGER: 1

I also looked at the arrayDiskComponentStatus and
arrayDiskRollUpStatus tables via SNMP. Both of those report correct
results, i.e., status is Non-critical (4):

$ snmpwalk -v2c -c public localhost 1.3.6.1.4.1.674.10893.1.20.130.4.1.24
SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.24.1 = INTEGER: 4

$ snmpwalk -v2c -c public localhost 1.3.6.1.4.1.674.10893.1.20.130.4.1.23
SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.23.1 = INTEGER: 4
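
To make the numbers above easier to read, here is a minimal Python sketch that
parses snmpwalk output of this form and attaches labels to the values. The
labels for column 36 come from the MIB description quoted later in this thread
(1 = Certified, 0 = Not Certified, 99 = Unknown); the status labels cover only
the two values seen in these tests (3 = Ok, 4 = Non-critical). The function
names are hypothetical and are not part of OMSA or check_openmanage.

```python
import re

# Value labels. Column 36 follows the MIB text quoted in this thread;
# the status table lists only the two values observed in these tests.
CERTIFIED = {1: "Certified", 0: "Not Certified", 99: "Unknown"}
STATUS = {3: "Ok", 4: "Non-critical"}

# Matches the tail of a line such as
#   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.1 = INTEGER: 1
# capturing the table column (36), the disk index (1) and the value (1).
LINE_RE = re.compile(r"\.(\d+)\.(\d+) = INTEGER: (\d+)\s*$")


def parse_walk(text):
    """Return {(column, disk_index): value} for each INTEGER line in text."""
    rows = {}
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            column, index, value = (int(g) for g in match.groups())
            rows[(column, index)] = value
    return rows


def describe(column, value):
    """Translate a raw value into a label for the columns used in this post."""
    if column == 36:
        table = CERTIFIED
    elif column in (23, 24):
        table = STATUS
    else:
        return str(value)
    return table.get(value, f"unrecognized ({value})")
```

Feeding it the R815 output above yields "Certified" for column 36 and
"Non-critical" for columns 23 and 24, which is the mismatch with omreport
described here.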

The other server is a PE 2950 with a PERC 6/I controller. I'm not 100% sure
if the disks are certified or not, but I'm guessing that they are. I don't
think that the PERC 6/I controller is able to determine whether the disk is
certified or not, because omreport tells me that the disks in this machine
are "Certified: Not Applicable". omreport also reports "Status: Ok", which
is correct. Looking at arrayDiskDellCertified via SNMP, it tells us that
the disks are "Not Certified" (0), which I think is incorrect:

$ snmpwalk -v2c -c public localhost 1.3.6.1.4.1.674.10893.1.20.130.4.1.36
SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.1 = INTEGER: 0

arrayDiskComponentStatus and arrayDiskRollUpStatus both return correct
results, i.e., Ok (3):

$ snmpwalk -v2c -c public localhost 1.3.6.1.4.1.674.10893.1.20.130.4.1.24
SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.24.1 = INTEGER: 3

$ snmpwalk -v2c -c public localhost 1.3.6.1.4.1.674.10893.1.20.130.4.1.23
SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.23.1 = INTEGER: 3

Do you think you can modify check_openmanage to cope with these Dell OMSA
problems? Or, even better, has anyone from Dell indicated that they might
correct these problems?
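
One possible shape for such a workaround, written as a Python sketch rather
than in the plugin's own language: flip the 0/1 value only when the agent
reports an OMSA version known to be affected. This is an assumption-laden
illustration, not how check_openmanage actually handles it. The function name,
the AFFECTED_VERSIONS set, and the idea that only 7.1.0 is affected on every
OS are all hypothetical. It also cannot tell a "Not Applicable" controller
apart from a certified disk, since both show up as the same integer.

```python
# Assumption: only this release reverses the flag. Whether that holds on
# every OS and controller is exactly the open question in this thread.
AFFECTED_VERSIONS = {"7.1.0"}


def certified_from_snmp(raw_value, omsa_version):
    """Interpret arrayDiskDellCertified, compensating for the reversed flag.

    Returns True (certified), False (not certified), or None when the
    value is 99 (Unknown) or anything else outside 0/1.
    """
    if raw_value not in (0, 1):
        return None
    certified = raw_value == 1
    if omsa_version in AFFECTED_VERSIONS:
        # The MIB says 1 = Certified, but the affected release sends the opposite.
        certified = not certified
    return certified
```

With this, the R815 value of 1 under OMSA 7.1.0 would be read as not
certified, matching what omreport says for that machine.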

Anyway, thanks for your work on check_openmanage. I find it very useful for
monitoring our Dell servers.



On Fri, Sep 28, 2012 at 9:38 AM, Trond Hasle Amundsen <t.h.amundsen at usit.uio.no> wrote:

> Trond Hasle Amundsen <t.h.amundsen at usit.uio.no> writes:
>
> > With OMSA 7.1.0, the boolean value from SNMP for whether a physical disk
> > is certified or not is reversed. Example for a server with one
> > non-certified disk and three certified:
> >
> > OMSA 7.0.0:
> >
> >   $ snmpwalk -v2c -c public hostname 1.3.6.1.4.1.674.10893.1.20.130.4.1.36
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.1 = INTEGER: 1
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.2 = INTEGER: 1
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.3 = INTEGER: 0
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.4 = INTEGER: 1
> >
> > OMSA 7.1.0:
> >
> >   $ snmpwalk -v2c -c public hostname 1.3.6.1.4.1.674.10893.1.20.130.4.1.36
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.1 = INTEGER: 0
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.2 = INTEGER: 0
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.3 = INTEGER: 1
> >   SNMPv2-SMI::enterprises.674.10893.1.20.130.4.1.36.4 = INTEGER: 0
> >
> > There are no changes in the MIB, which still states that:
> >
> >     -- 1.3.6.1.4.1.674.10893.1.20.130.4.1.36
> >     arrayDiskDellCertified OBJECT-TYPE
> >             SYNTAX INTEGER
> >             ACCESS read-only
> >             STATUS mandatory
> >             DESCRIPTION
> >                     "Indicates if array disk is certified by Dell.
> >                     Value: 1 - Certified, 0 - Not Certified, 99 - Unknown"
> >             ::= { arrayDiskEntry 36 }
> >
> > So this is clearly a bug.
>
> I'm wondering if this applies to all OS-es or just Linux (RHEL in my
> case). Would any of you with Windows (or Ubuntu etc.) servers care to
> test?
>
> Whether or not all OS-es are affected has implications for how this bug
> should best be handled by the check_openmanage Nagios plugin.
>
> Regards,
> --
> Trond H. Amundsen <t.h.amundsen at usit.uio.no>
> Center for Information Technology Services, University of Oslo



-- 
Russell Kackley
Subaru Telescope
Hilo, Hawaii