Dead 2400 with "stuck on TLB IPI wait (CPU#1)" message
dhubbard at dino.hostasaurus.com
Tue Jun 4 13:13:00 CDT 2002
Now that it's back up, I've investigated further. The
kernel.log makes it look like the problem was network
card related, I think, though I can't be sure since I've
never seen this before. Here's the useful output from
dmesg on the network card:
eepro100.c:v1.09j-t 9/29/99 Donald Becker
eepro100.c: $Revision: 18.104.22.168 $ 2000/05/31 Modified by Andrey V.
Savochkin <saw at saw.sw.com.sg> and others
eepro100.c: VA Linux custom, Dragan Stancevic <visitor at valinux.com>
Intel PCI EtherExpress Pro100 82557, 00:A0:C9:D7:CB:83, I/O at 0xece0, IRQ
And here's what was in kernel.log just before the
"stuck on TLB IPI wait" junk started:
Jun 4 12:38:19 mail kernel: eth0: Transmit timed out: status f048 0c00 at
36775027/36775090 command 000ca000.
Jun 4 12:38:19 mail kernel: eth0: Tx ring dump, Tx queue 36775090 /
Jun 4 12:38:19 mail kernel: eth0: 0 200ca000.
Jun 4 12:38:19 mail kernel: eth0: 1 000ca000.
There were hundreds of those lines. Is this indicative
of a bad NIC, bad driver, or just bad luck? :-)
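
For anyone wanting to see how often the timeouts occur before
deciding between NIC and driver, here is a minimal sketch that
counts the "Transmit timed out" lines per interface and status
word. It assumes syslog-format lines like the ones quoted above;
the function name count_tx_timeouts and the sample list are
hypothetical and only for illustration, not part of any tool.

```python
import re
from collections import Counter

# Matches syslog lines such as:
#   Jun 4 12:38:19 mail kernel: eth0: Transmit timed out: status f048 0c00 at
# The pattern is an assumption based on the log lines quoted in this message.
TIMEOUT_RE = re.compile(
    r"kernel: (?P<iface>eth\d+): Transmit timed out: "
    r"status (?P<status>[0-9a-fA-F]{4} [0-9a-fA-F]{4})"
)


def count_tx_timeouts(lines):
    """Count transmit-timeout events, keyed by (interface, status word)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("iface"), match.group("status"))] += 1
    return counts


# Sample input copied from the kernel.log excerpt above.
sample = [
    "Jun 4 12:38:19 mail kernel: eth0: Transmit timed out: status f048 0c00 at",
    "Jun 4 12:38:19 mail kernel: eth0: Tx ring dump, Tx queue 36775090 /",
    "Jun 4 12:38:19 mail kernel: eth0: 0 200ca000.",
]

for (iface, status), n in count_tx_timeouts(sample).items():
    print(iface, status, n)
```

To run it against a real log, pass an open file object instead of
the sample list, since the function only iterates over lines.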
> -----Original Message-----
> From: Seth Mos [mailto:knuffie at xs4all.nl]
> Sent: Tuesday, June 04, 2002 1:59 PM
> To: Hubbard, David
> Cc: 'linux-poweredge at dell.com'
> Subject: Re: Dead 2400 with "stuck on TLB IPI wait (CPU#1)" message
> On Tue, 4 Jun 2002, Hubbard, David wrote:
> > stuck on TLB IPI wait (CPU#1)
> > Has anyone seen that one before? It seems to have
> > something to do with SMP and the kernel. I have a dead
> > 2400 right now that went down with that message on the
> > console; remote support is supposed to be attempting to
> > reboot it for me.
> There is not much else you can do about this. This can
> happen. Anything special that the machine might have been
> doing at the time?