jay at ffsi.com
Wed Jul 20 11:21:21 CDT 2005
I'm running Debian sarge (2.4.29 customized) on a PE2650. I set up
basic SNMP monitoring using the 4.120 .debs found at
http://debian.marlow.dk/dists/woody/dell/pool/. Anyway, it worked fine
for a couple of weeks, but then this morning I started getting errors from
a cron'd script that checks certain OIDs. After doing an snmpwalk to see
what was happening, I saw that instead of 6 temp probe readings, I was
only getting 4. And instead of 5 fan readings, I was only getting 3.
Both CPU temperature probes are gone, as are Fan3 and Fan4 (the CPU fans).
They are not reading zero, they are simply not in the list. I logged
into the DRAC and checked the hardware logs, and I found these entries:
Wed Jul 20 08:10:36 2005 CPU Status 1 processor sensor CPU missing
Wed Jul 20 08:10:36 2005 CPU Status 2 processor sensor CPU missing
After trying a few different things, I rebooted the machine (it is not
in production right now) to see if that resolved the problem. It did
not. So it would appear that the sensors associated with the CPUs (both
temp and fans) are simply not registering. They do not show up in an
snmpwalk or in the DRAC sensors list.
The machine runs fine, and I have 4 logical (HT enabled) processors
showing in `top`.
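
For anyone wanting to catch this kind of failure, where readings vanish
from the walk rather than read zero, a row-count check along the
following lines might help. This is only a rough sketch, not the cron'd
script mentioned above: the function names are made up for illustration,
the expected counts (6 temp probes, 5 fans) are the ones reported in this
post, and the OIDs to walk are left for the caller to supply from the
Dell MIB.

```python
import subprocess

# Row counts this post reports for a healthy walk (assumption: they are
# specific to this PE2650 and will differ on other hardware).
EXPECTED = {"temperature": 6, "fan": 5}


def count_rows(walk_output):
    """Count 'OID = value' lines, ignoring net-snmp's 'No Such ...' replies."""
    return sum(
        1
        for line in walk_output.splitlines()
        if " = " in line and "No Such" not in line
    )


def missing_sensors(counts, expected=EXPECTED):
    """Return {name: (expected, found)} for every table that came up short."""
    return {
        name: (want, counts.get(name, 0))
        for name, want in expected.items()
        if counts.get(name, 0) < want
    }


def walk(host, community, oid):
    """Run the net-snmp snmpwalk CLI on one subtree and return its stdout."""
    result = subprocess.run(
        ["snmpwalk", "-v", "1", "-c", community, host, oid],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

A cron job could call walk() once per reading column, pass the results
through count_rows(), and send mail whenever missing_sensors() returns
anything. With the counts seen here (4 temp probes, 3 fans) it would flag
both tables.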