[Ipmitool-devel] Is the BMC robust to recover from system hangs? ipmitool unresponsive
rpnabar at gmail.com
Mon Aug 30 10:31:40 CDT 2010
On Mon, Aug 30, 2010 at 10:23 AM, Jarrod B Johnson <jbjohnso at us.ibm.com> wrote:
> Don't know much about Dell specifically, however I'll offer some
Very much appreciate your helping out! I haven't any
> If the Broadcom part has the tg3 driver, you may be out of luck depending
> on the failure state. For example, BCM5704 chips fundamentally cannot
> provide BMC access while executing PXE. On the other hand, bnx2 managed
> chips tend to fare better, there generally is at least one way to make it
> work correctly, though drivers and nic firmware matter *greatly* still. Not
> as resilient as I would like, but with precautions in how you manage
> firmware and drivers, it's workable.
Luckily no tg3 for this server. It does have the bnx2 driver. Do you have
any specific driver / firmware version comments about which combinations do
work and which don't?
> You'll want to check your tg3/bnx2/whatever driver version and NIC
> firmware version, depending on your investigation.
I have version 1.9.3 of bnx2. Not sure how to get the NIC firmware version
on a running system.
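
For reference, one common way to read the NIC firmware version on a running Linux system is `ethtool -i <interface>`, which prints `driver`, `version` and `firmware-version` fields among others. The sketch below wraps that call and parses the output; the interface name `eth0` and the helper names are assumptions for illustration, and it presumes `ethtool` is installed and the driver reports a firmware string.

```python
import subprocess


def parse_ethtool_info(text):
    """Parse the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as bus-info contain colons.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def nic_driver_info(interface="eth0"):
    """Run `ethtool -i <interface>` and return its fields as a dict.

    "eth0" is only a placeholder; use the interface shared with the BMC.
    """
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ethtool_info(result.stdout)
```

Calling `nic_driver_info("eth0")` and reading the `"driver"`, `"version"` and `"firmware-version"` keys gives the driver/firmware combination to report when comparing notes on which pairings keep the BMC reachable.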
> Shared nics can work great, but some implementations can be picky about
> what drivers and firmware are in place.
This being an HPC cluster, shared NICs were the more feasible option. The cost
of a dedicated out-of-band network and switches was deemed too expensive and
messy. No single server is critical, but the utility of the BMC+IPMI is
the ability to debug crashes without having to walk to the server room each
time. Or so I thought! :)
> Also, newer is not always better, sometimes a developer without caring
> about the IPMI access provided by some nics will unwittingly break it
> somehow in the driver, and it won't get fixed until some server vendor or
> other industrious administrator stumbles across it.
Absolutely. Agreed. Unfortunately there's no easy way of knowing what works
and what doesn't other than posting on a list like this and hoping someone
else has been burnt before! :)