Hard lockups on PE 6600 (Rechenberg, Andrew)

Montz, James C. (James Tower) JCMontz at jamestower.com
Tue Nov 12 11:29:00 CST 2002


Did you simply modify /etc/modules.conf and replace
alias eth0 tg3
with
alias eth0 bcm5700?

Also, if this is the case, does this then work properly when using basplnx?
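
For reference, the edit being asked about could be sketched roughly as below. This is only an illustration, not necessarily what was done on the box in question: the function name swap_nic_alias is made up, and it assumes the alias line has the plain "alias eth0 tg3" form quoted above. It keeps a .bak copy of the file before rewriting it.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): swap the driver named in an
# "alias <iface> <driver>" line of a modules.conf-style file.
# Usage: swap_nic_alias FILE IFACE OLD_DRIVER NEW_DRIVER
swap_nic_alias() {
    file=$1
    iface=$2
    old=$3
    new=$4

    # Keep a backup so the change can be reverted if the box misbehaves.
    cp "$file" "$file.bak" || return 1

    # Rewrite only the line that aliases IFACE to OLD_DRIVER; all other
    # lines pass through unchanged.
    sed "s/^alias[[:space:]][[:space:]]*$iface[[:space:]][[:space:]]*$old[[:space:]]*\$/alias $iface $new/" \
        "$file.bak" > "$file"
}
```

It would be called as something like "swap_nic_alias /etc/modules.conf eth0 tg3 bcm5700", run as root. The new alias only takes effect once the old module is unloaded and the new one loaded (or after a reboot), and it assumes a bcm5700 module is actually installed for the running kernel.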

-----Original Message-----
From: Rechenberg, Andrew [mailto:arechenberg at shermfin.com] 
Sent: Tuesday, November 12, 2002 11:11 AM
To: Jeff Parker; linux-poweredge at dell.com
Subject: RE: Hard lockups on PE 6600 (Rechenberg, Andrew)

You said you tried both NICs, but did you try both NIC modules?

In my case, the 6600 with a fresh 7.3 install used the tg3 module
instead of the bcm5700 module and I believe that either the module, or
the combination of the module and the other hardware in the box, was
causing the lockups.  I have 2 PERC3 cards in the box (a DC and a QC)
and it may have been an issue with IRQs and kernel modules.  I'm not
really sure.

All I know is that I changed the kernel module and the box has been up
and stable for 4 days with 2.4.18-17.7.xbigmem with HEAVY disk activity
and moderate to heavy network activity:

[root at mybox ~]# uptime
 12:01pm  up 4 days,  5:23, 309 users,  load average: 3.65, 3.53, 3.71
[root at mybox ~]# uname -a
Linux mybox.shermfin.com 2.4.18-17.7.xbigmem #1 SMP Tue Oct 8 12:07:59
EDT 2002 i686 unknown
[root at cinshrcub01 ~]# free
             total       used       free     shared    buffers
Mem:       7483280    7472596      10684          0      80408
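
One way to confirm which of the two drivers is actually loaded after such a change is to look at /proc/modules, where the first field of each line is the module name. The helper below is a minimal sketch with a made-up name (loaded_nic_driver); it only checks for the two modules discussed in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print tg3 and/or bcm5700 if
# they appear as module names in a /proc/modules-style listing.
# Usage: loaded_nic_driver [FILE]   (FILE defaults to /proc/modules)
loaded_nic_driver() {
    modules_file=${1:-/proc/modules}

    # The module name is the first whitespace-separated field on each line.
    awk '$1 == "tg3" || $1 == "bcm5700" { print $1 }' "$modules_file"
}
```

Run with no arguments it reads the live module list; no output means neither module is loaded.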

A friend of mine had the same problem with some 2650s he just received
and he did the same thing and his boxes are now stable as well.

Hope this helps.  Let me know if you need any more info.


-----Original Message-----
From: Jeff Parker [mailto:parkerjeff at spcollege.edu] 
Sent: Tuesday, November 12, 2002 10:55 AM
To: linux-poweredge at dell.com
Cc: Rechenberg, Andrew
Subject: RE: Hard lockups on PE 6600 (Rechenberg, Andrew)

I have a 6650 that does not seem to like more than 30 days of uptime as
well. I am currently not using 4/8 gigs of DRAM because I cannot keep
the box up for 24 hours with the bigmem kernel (SMP seems better). I
tried the BC NICs, and now I am using an E1000 Fiber NIC with the same
result.

I have collected every bit of data from my box possible, and the only
concrete problem that I can find is processes hanging that cannot be
killed (-9). My point of blame is beginning to go to the Perc3-DC card
because of this (who knows if I am on the right track). I am tempted to
bust out a Perc2!

I was wondering if anyone else was getting tons of "hung" processes on
their servers before kernelling? My top load average was 695.

One note in defense of Linux: I have two 6650s running Win2K and they do
not seem to work correctly either.

My 6650:
4 x 1600 MHz Xeons
8 GB DDR
5 x 73 GB HD
2 Intel E1000 Fiber NICs

Jeffrey J. Parker
Sr. Network Design & Security Engineer
St. Petersburg College, Seminole Campus
9200 113th Street, North
Seminole, Florida 33772
Phone: 727.394.6036  
Fax:   727.394.6264
Cell:  727.224.5722
