2.4.17-18 and PE2x50 lockups

Rechenberg, Andrew arechenberg at shermfin.com
Thu Nov 7 10:57:00 CST 2002


I'm not sure if you saw my message about the PowerEdge 6600, but I am
having the EXACT same issue.  Lockup, no network, no console.

This looks to me like a kernel and/or network driver issue.  Are you
using the tg3 or bcm5700 driver for your NIC?  (lsmod will show you the
loaded kernel modules, if you didn't know already.)
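
If you'd rather script the check than eyeball the lsmod output, here is
a minimal sketch in Python.  It assumes a Linux box with a readable
/proc/modules and a Python 3 interpreter; the function name
loaded_candidates and the CANDIDATE_DRIVERS tuple are made up for
illustration, and it only looks for the two driver names mentioned
above.

```python
from pathlib import Path

# Driver names taken from the question above; any other module is ignored.
CANDIDATE_DRIVERS = ("tg3", "bcm5700")


def loaded_candidates(modules_text, candidates=CANDIDATE_DRIVERS):
    """Return the candidate driver names that appear as loaded modules.

    modules_text is the content of /proc/modules or the output of lsmod;
    in both, the module name is the first whitespace-separated field.
    """
    loaded = {
        line.split()[0]
        for line in modules_text.splitlines()
        if line.strip()
    }
    return [name for name in candidates if name in loaded]


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")  # only present on Linux
    if proc_modules.exists():
        found = loaded_candidates(proc_modules.read_text())
        print("Broadcom driver(s) loaded:", ", ".join(found) or "none")
    else:
        print("/proc/modules not found; run lsmod by hand instead")
```

This only tells you which of the two modules is loaded, not which one is
actually bound to the onboard NIC.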

I have an issue open with Dell and they are getting back to me today
about it, but I also wanted to run my problem by the list.
Check out the message labelled 'Hard lockups on PE 6600' for the details
of my box.

At least it looks as if I'm not the only one having issues.

Thanks,
Andy.

-----Original Message-----
From: Ed Martin [mailto:ed.martin at iop.org] 
Sent: Thursday, November 07, 2002 11:30 AM
To: linux-poweredge at dell.com
Subject: 2.4.17-18 and PE2x50 lockups


Hi,

Recently I've had a lot of trouble with a new PE2650.  I've installed
RHL 7.3 on it, and for a day all was fine.  I then updated the kernel to
2.4.18-17.  Since then I've had a series of lockups/hangs/freezes.
There is no discernible problem if you look at the main unit - all
lights and indicators are normal.  But the box gives no network
response, and the console is completely unresponsive.  To recover I
have to power cycle the box.  The lockup occurs within an hour of boot
- usually within 10 minutes.  There is no error that I can see in
/var/log/messages, and the server isn't doing anything in particular.
RHL was installed via OMA v7.2.2.

These symptoms are identical to a problem I had recently on a 2550.
That box had run fine for several months.  Recently I upgraded it from
RHL v7.2 to v7.3, flashed the BIOS and the PERC3/DC, and switched the
network lead from an Intel E100 card to the onboard Broadcom.  I also
installed the latest errata kernel - 2.4.18-17.  This server then
experienced several lockups - from 1 minute after boot to a few hours.
The lockups went away when I switched back to the Intel E100 card from
the Broadcom.  All other updates were left in place.

I plan to leave the 2650 running 2.4.18-10, to see if that is stable.
I may also install an Intel 1000T NIC and see if this works, which
would indicate that the problem is again linked to the Broadcom.  In
the meantime, however, I wanted to post my experience here.  I am a bit
confused, as I understood that the 7.x 2.4.18-17 kernel was
particularly heavily tested!

Any ideas anyone?  I'd like to resolve this problem, or I won't dare
update any of the other 2550/2650s we've got.

Yours,

Ed


---
Ed Martin
Head of Systems Technology and Support
IOP Publishing Ltd
Dirac House, Temple Back
Bristol  BS1 6BE
ddi: +44 (0)117 930 1102
www:  http://www.iop.org







