tg3 stability reports? (was: Re: recent kernel upgrades.. security.)

Steven Kehlet steven.kehlet at conexant.com
Wed Mar 19 18:51:00 CST 2003


> Would you be willing to elaborate on your "NFS issues galore"?

Absolutely.  I just found this list, but thanks to everyone who already
responded to my inquiry on tg3 stability! :-)

I've actually been dealing with a couple of different NFS-related issues,
on desktops and servers, on RedHat 7.2, 7.3, and 8.0.  We have a fleet
(~22) of PE2650s in the back room (for batch job processing), and
several Precision 530ns for desktops.  I would imagine other people may
not see the same severity of problems I'm seeing because our environment
is heavily NFS-driven: I run Oracle on Linux over NFS, and /usr/local,
our project directories, and just about everything else are NFS mounted.
We use numerous Network Appliance NFS filers to serve data.

On heavily loaded (NFS traffic) RH 7.3 systems (2.4.18-4), the NFS
performance is spotty and erratic.  I see tons of "kernel: nfs: server
xxx not responding, still trying" errors in /var/log/messages, followed
by variable amounts of time (usually 1-60 seconds), then "server xxx OK".  At
its worst my Oracle server stalled for 1.5 hours in the middle of a
long-running report.  While this is slowing things down, at least
nothing is dying because of it :-), as NFS does pick up eventually and
things continue.
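
For anyone wanting to put numbers on stalls like these, here is a
minimal sketch that pairs each "not responding" message with the next
"OK" for the same server and reports the gap in seconds.  The syslog
prefix layout ("Mon DD HH:MM:SS host kernel: ..."), the `year` default,
the matching "lockd: server xxx OK" line, and the function name
`stall_durations` are all assumptions for illustration; adjust the
pattern to whatever your /var/log/messages actually contains.

```python
import re
from datetime import datetime

# Pattern for the two kernel messages quoted above. The syslog prefix
# layout is an assumption; syslog timestamps carry no year, so one is
# supplied by the caller.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<svc>nfs|lockd): server (?P<server>\S+) "
    r"(?P<state>not responding, still trying|OK)\s*$"
)


def stall_durations(lines, year=2003):
    """Return (service, server, seconds) for each completed stall."""
    pending = {}  # (service, server) -> time of first unanswered message
    stalls = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
        key = (m.group("svc"), m.group("server"))
        if m.group("state") == "OK":
            start = pending.pop(key, None)
            if start is not None:
                stalls.append((key[0], key[1], (ts - start).total_seconds()))
        else:
            # Repeated "not responding" lines belong to the same stall,
            # so keep only the earliest timestamp.
            pending.setdefault(key, ts)
    return stalls


if __name__ == "__main__":
    with open("/var/log/messages") as log:
        for svc, server, secs in stall_durations(log):
            print(f"{svc} {server} stalled for {secs:.0f}s")
```

Stalls that never receive an "OK" are left out of the result, and a
log that spans a year boundary would need extra handling.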

NetApp has a bug on file similar to this issue, claiming the bug is
actually in the Linux IP fragmentation code, and that switching to tcp
mounts will help.  At first, using tcp mounts did seem to help,
because the barrage of "server not responding" messages went away, but
sadly they were replaced by random hangs, accompanied by "kernel:
lockd: server xx.xx.xx.xx not responding, still trying" messages.
Great--now it's lockd.  Argh :-).

I've wanted to try later kernels, but given the widespread reports of
problems with the tg3 driver, I felt I'd be trading one set of problems
for another :-).  Also, I'm speculating that since RedHat is just
patching the same old 2.4.18 kernel, there probably aren't really any
bug fixes for the NFS code between, say 2.4.18-4 and 2.4.18-26 (maybe
I'm mistaken here, please let me know if so).  What I need is fixes to
the NFS client-side code which I'm thinking will only come with an
upgrade to a later kernel version (e.g. 2.4.2x).

On the desktop side, our Precisions came with RedHat 8.0 (2.4.18-14),
and we would experience random hangs periodically throughout the day
while accessing files over NFS.  I tried upgrading to a stock 2.4.20
kernel, but then reading files was preceded by a 1-2 second pause (I've
seen other reports of this with 2.4.20).  I then installed RH's
2.4.18-24 and my pauses went away, but other users are still
complaining.  I tried converting those users to tcp, but then the NFS
performance dropped through the floor, so I had to switch them back to
udp :-).
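
When flipping machines between tcp and udp like this, it is easy to
lose track of which mounts use which transport.  The sketch below
reads fstab-style lines (device, mount point, type, options) and
reports the transport requested for each NFS mount.  It recognizes the
`tcp`, `udp`, and `proto=` mount options; the function name
`nfs_transports` and the sample paths are hypothetical, and whether
/proc/mounts on a given kernel lists the transport at all is something
to check rather than assume.

```python
def nfs_transports(mount_lines):
    """Map each NFS mount point to 'tcp', 'udp', or 'unspecified'."""
    result = {}
    for line in mount_lines:
        fields = line.split()
        # Skip comments, blank lines, and anything too short to be an entry.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        _device, mount_point, fstype, options = fields[:4]
        if fstype != "nfs":
            continue
        transport = "unspecified"  # no explicit option: the client default applies
        for opt in options.split(","):
            if opt in ("tcp", "udp"):
                transport = opt
            elif opt.startswith("proto="):
                transport = opt.split("=", 1)[1]
        result[mount_point] = transport
    return result


if __name__ == "__main__":
    with open("/etc/fstab") as fstab:
        for mount_point, transport in sorted(nfs_transports(fstab).items()):
            print(f"{mount_point}: {transport}")
```

A mount reported as "unspecified" simply has no transport option set,
so it uses whatever the client defaults to.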

I've also tried a bunch of other things (e.g. changing NFS block
sizes), but it's hard to remember everything.  If it weren't my job, at
this point I'd just say "oh well" and wait 6-12 months for Linux's NFS
to get better :-).  But I've gotten enough positive responses to my
query about tg3 in 2.4.18-26, so I'll try upgrading one of my 2650s to
it and report back here.  Thanks again everyone.

Steve




On Wed, 2003-03-19 at 15:35, David C. Kovar wrote:
> Good afternoon,
> 
> Would you be willing to elaborate on your "NFS issues galore"? I'm
> trying to pin down a fairly troubling NFS error (detailed in an earlier
> message) as well. I'd like to get some more insight on how stable NFS is
> in general on RH 7.3.
> 
> Thank you very much.
> 
> -David
> 
> On Wed, 2003-03-19 at 09:38, Steven Kehlet wrote:
> > > Of course this should also have the 'fixed' tg3 code from jeff garzik
> > > (released in -26) so if you are upgrading please be aware that you
> > 
> > Does anyone have any feedback on the stability of the tg3 driver now?
> > (RH kernels >= 2.4.18-26).  
> > 
> > I'm battling NFS issues galore with the stock RH7.3 kernel on a fleet of
> > PE2650 systems and would like to upgrade kernels, but have been scared
> > away up to this point by all the negative talk on the tg3 driver.
> > 
> > TIA!
> > 
> > Steve
> > 
> > 
> > 
> > 
> > 
> > 
> > On Tue, 2003-03-18 at 18:24, jason andrade wrote:
> > > Hi,
> > > 
> > > I'd strongly recommend that people out there using Red Hat Linux with
> > > multi user access systems consider upgrading to the latest RH kernel as
> > > there appears to be an exploit being circulated that takes advantage of the
> > > newest security hole to be discovered..
> > > 
> > > It is specifically vulnerable to people with local (account/shell/process)
> > > access to the machine and is not a remote exploit.
> > > 
> > > Of course this should also have the 'fixed' tg3 code from jeff garzik
> > > (released in -26) so if you are upgrading please be aware that you
> > > will not have the bcm5700 module available anymore and will need to
> > > modify your modules.conf appropriately.
> > > 
> > > 
> > > Currently this appears to be fixed by (find your closest RH mirror)
> > > 
> > > http://www.redhat.com/mirrors/
> > > 
> > > 2.4.18-27.7.x (Red Hat 7.1/7.2/7.3)
> > > 2.4.18-27.8.x (Red Hat 8.0)
> > > 
> > > I have not seen whether the Red Hat 2.1 AS/AW kernels are vulnerable
> > > to this so it'd be good if someone from Red Hat could clarify this..
> > > 
> > > 
> > > Generic Kernels
> > > 
> > > 2.4.21preX - i have not seen a 2.4.21pre6 patch released yet.
> > > 2.2.25
> > > 
> > > see:   http://www.spinics.net/lists/kernel/msg162986.html
> > > 
> > > 
> > > I have not seen any updates from Mandrake, SuSE or Slackware yet
> > > so if you are running on any of those platforms you will probably
> > > want to contact them directly or wait for a few days to see if
> > > they are announcing/releasing updates.
> > > 
> > > 
> > > regards,
> > > 
> > > -jason
> > > 
> > > _______________________________________________
> > > Linux-PowerEdge mailing list
> > > Linux-PowerEdge at dell.com
> > > http://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > > Please read the FAQ at http://lists.us.dell.com/faq or search the list archives at http://lists.us.dell.com/htdig/
> > 



