2650 + new BIOS + 2.6.10-ac11 and it *still* crashes

Kevin M. Myer kevin_myer at iu13.org
Tue Mar 15 15:47:36 CST 2005


Yes, we have been having problems with 2650s and RHEL.  Ironically enough, a
server just crashed moments ago.

Here are the symptoms:

The server remains alive to ICMP pings.  If you portscan it, it shows that
ports are open.  But if you attempt to connect to any of the services running
on the machine, you get nothing.  So it's alive...but it's dead in terms of
usefulness.
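
The "ports open, services silent" state can be told apart from a closed port
or a healthy service with a small probe.  What follows is only a minimal
sketch in Python, assuming a protocol where the server speaks first (SSH or
SMTP, for example); the function name probe_service and the three state
labels are made up for illustration and do not come from any existing tool.
A listening socket can complete the TCP handshake in the kernel even when the
daemon never accepts the connection, which would be consistent with a box
that answers pings and portscans while userland is stuck.

```python
import socket


def probe_service(host, port, timeout=3.0):
    """Classify a TCP service as 'no_connect', 'open_silent' or 'responding'.

    Assumption: the service sends a banner first (e.g. SSH, SMTP).
    Protocols where the client speaks first (e.g. HTTP) would need
    a request to be sent before reading.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or the connect itself timed out.
        return "no_connect"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(64)
        except socket.timeout:
            # Handshake completed, but the service never said anything.
            return "open_silent"
        except OSError:
            return "no_connect"
    # Empty data means the peer closed without sending anything;
    # treated as 'no_connect' here for simplicity.
    return "responding" if data else "no_connect"
```

Run from another machine against, say, port 22 of the affected server, a
result of "open_silent" would match the symptoms described above.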

You can't get a console prompt, either through a monitor, or through a serial
connection.

The kicker has been that absolutely nothing is ever logged.  No oops, panic,
or warning.  When you reboot the server (thank goodness for RAC cards), and go
back and comb through the logs, there's absolutely nothing logged to indicate
a problem.  Nothing in ESM either.

We are running RHEL 3, Update 4.  Several items are common to the affected
machines: most servers are attached to a Dell/EMC CX300 SAN with single QLogic
2340 HBAs, they are running iptables, and we are using the bonding driver with
the tg3 NIC driver in active-backup mode.  Kernel is either 2.4.21-27.0.1smp
or 2.4.21-27.0.2smp, and OpenManage is installed.  BIOS and FW are current.

I'm at a loss as to how you troubleshoot a problem where not one iota of data
is logged to indicate there is a problem, and where the IP stack still seems
to be partially functioning but the server is unresponsive to anything but
ICMP pings (including logins over a serial console).
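
One way to get at least a timestamp out of a hang that logs nothing is a
heartbeat written to local disk: after the reboot, the last line shows roughly
when userland stopped making progress.  The following is a rough sketch only,
assuming a Python 3 interpreter is available on the host; the function names
and the log path in the usage note are hypothetical.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamped line to `path` and force it to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    load1 = os.getloadavg()[0]  # 1-minute load average
    with open(path, "a") as f:
        f.write(f"{stamp} load1={load1:.2f}\n")
        f.flush()
        # fsync so the line is not left sitting in the page cache
        # when the machine wedges.
        os.fsync(f.fileno())
    return stamp


def last_heartbeat(path):
    """Return the last line written to `path`, or None if it is empty."""
    with open(path) as f:
        lines = f.read().splitlines()
    return lines[-1] if lines else None
```

Calling write_heartbeat("/var/log/heartbeat.log") once a minute from cron, and
reading last_heartbeat() after the next forced reboot, would at least narrow
down when the hang began and what the load looked like just before it.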

Are we alone with these problems?

Kevin

-- 
Kevin M. Myer
Senior Systems Administrator
Lancaster-Lebanon Intermediate Unit 13
(717) 560-6140




More information about the Linux-PowerEdge mailing list