Random server reboot after update to CentOS 5.3
Steven A. Kaskinen
steve at providentadvisors.com
Fri May 22 08:53:49 CDT 2009
Are you sure it's due to the 9000 MTU, or is it due to how much traffic you push through the interface?
Back in the CentOS/RHEL 5.0 era we had a very similar issue with the stock bnx2 driver, which caused kernel panics in bnx_poll when the network load was too high. If it's the same issue (it wouldn't be the first time I've seen a bug reintroduced), then setting the MTU higher might trigger it by pushing the throughput up, rather than through something intrinsic to the MTU size itself.
Here were the bug reports on that one:
https://bugzilla.redhat.com/show_bug.cgi?id=242060
http://bugs.centos.org/view.php?id=2103
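One rough way to tell load apart from MTU is to watch the interface's throughput at both MTU settings and see whether the panics track the traffic level. The following is only a sketch: it assumes the usual /proc/net/dev layout (two header lines, then one line per interface with received bytes in the first column and transmitted bytes in the ninth), and the function names and the eth1 default are made up for illustration.

```python
import time

def parse_net_dev(text):
    """Return {iface: (rx_bytes, tx_bytes)} from the contents of /proc/net/dev."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        # Older kernels may print "eth0:1234" with no space, so split on the colon.
        name, data = line.split(":", 1)
        fields = data.split()
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters

def rate_mbit(before, after, seconds):
    """Turn two (rx_bytes, tx_bytes) samples into (rx, tx) rates in Mbit/s."""
    rx = (after[0] - before[0]) * 8 / seconds / 1e6
    tx = (after[1] - before[1]) * 8 / seconds / 1e6
    return rx, tx

def sample(iface="eth1", interval=5.0, path="/proc/net/dev"):
    """Sample the counters twice, `interval` seconds apart, and return the rates."""
    with open(path) as f:
        before = parse_net_dev(f.read())[iface]
    time.sleep(interval)
    with open(path) as f:
        after = parse_net_dev(f.read())[iface]
    return rate_mbit(before, after, interval)
```

Calling sample("eth1") during a heavy copy, once at MTU 1500 and once at MTU 9000, gives a comparable pair of numbers. Counter wrap-around is not handled here.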
For us it was an NFS server. We increased its throughput by raising the TCP buffers (net.core.rmem_default, net.core.wmem_default, net.core.rmem_max, net.core.wmem_max) and the number of NFS threads (the default of 8 is way too low; we went up to 128), and by adding the rsize and wsize options to the clients' mount options (proto=tcp,rsize=32768,wsize=32768). All of that pushed the server harder, and it then started rebooting daily during any heavy file copy.
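For reference, those settings could be written out roughly as below. This is a sketch, not the configuration we actually ran: the buffer size is a placeholder (no values are given in this message), and RPCNFSDCOUNT in /etc/sysconfig/nfs is assumed to be where the thread count is set on CentOS/RHEL. The helper names are made up for illustration.

```python
# Placeholder only: the message does not say what buffer sizes were used.
ASSUMED_BUFFER_BYTES = 262144

# The four sysctl keys named in the message.
SYSCTL_KEYS = [
    "net.core.rmem_default",
    "net.core.wmem_default",
    "net.core.rmem_max",
    "net.core.wmem_max",
]

def sysctl_lines(buffer_bytes=ASSUMED_BUFFER_BYTES):
    """Lines in /etc/sysctl.conf style, one per buffer setting."""
    return ["%s = %d" % (key, buffer_bytes) for key in SYSCTL_KEYS]

def nfsd_threads_line(count=128):
    """Thread-count line; assumes the RPCNFSDCOUNT variable in /etc/sysconfig/nfs."""
    return "RPCNFSDCOUNT=%d" % count

def client_mount_options(size=32768):
    """NFS client mount options as quoted in the message."""
    return "proto=tcp,rsize=%d,wsize=%d" % (size, size)

if __name__ == "__main__":
    print("\n".join(sysctl_lines()))
    print(nfsd_threads_line())
    print(client_mount_options())
```

Keeping the values in one place makes it easy to back them out again if the tuning turns out to be what tips the driver over.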
We were able to fix it by using the updated driver available from the Dell hardware repo. Hopefully the same is still true with 5.3, as I'm looking to upgrade to it to fix various autofs issues I'm having.
Do you have TSO enabled on the bnx ports as well? We usually disable that right away, as it's caused us all sorts of trouble. In general I've found the bnx2 drivers to have a pretty poor track record, and I now mostly resort to putting Intel NICs in our boxes.
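Checking and disabling TSO can be scripted around ethtool, along these lines. This is a sketch under assumptions: it relies on `ethtool -k` printing a "tcp segmentation offload" line (newer ethtool versions hyphenate the name), the function names are made up, and disabling needs root and does not survive a reboot.

```python
import subprocess

def tso_enabled(ethtool_output):
    """Return True/False from `ethtool -k` output, or None if no TSO line is found."""
    for line in ethtool_output.splitlines():
        key, sep, value = line.partition(":")
        # Older ethtool prints "tcp segmentation offload",
        # newer versions print "tcp-segmentation-offload".
        if sep and key.strip().replace("-", " ") == "tcp segmentation offload":
            return value.strip().startswith("on")
    return None

def check_tso(iface):
    """Run `ethtool -k <iface>` and report whether TSO is on."""
    out = subprocess.check_output(["ethtool", "-k", iface])
    return tso_enabled(out.decode())

def disable_tso(iface):
    """Run `ethtool -K <iface> tso off` (requires root, not persistent)."""
    subprocess.check_call(["ethtool", "-K", iface, "tso", "off"])
```

For example, check_tso("eth0") followed by disable_tso("eth0") if it returns True.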
Steve
> -----Original Message-----
> From: linux-poweredge-bounces at lists.us.dell.com [mailto:linux-
> poweredge-bounces at lists.us.dell.com] On Behalf Of Robert von Bismarck
> Sent: Friday, May 22, 2009 8:15 AM
> To: linux-poweredge at lists.us.dell.com
> Subject: RE: Random server reboot after update to CentOS 5.3
>
> Sounds scary,
>
> Have you tried installing the Broadcom drivers + firmware from Dell's
> support site, or is it a pure CentOS kernel?
>
> BTW : I can confirm no panics with heavy NFS loads on CentOS 5.2 :
>
> System : PE 1950 III 4Gb RAM
> NIC : Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (from
> lspci)
> Kernel : 2.6.18-92.el5 #1 SMP x86_64
> Bnx2 driver : bnx2 v1.6.9 (December 8, 2007) (from centos)
> MTU : 1500 on eth0 and 9000 on eth1
>
> No issues at all mounting a lot of heavy traffic maildirs over NFS from
> a Netapp for nearly two months. Same holds true for a MySQL DB on a
> Netapp over iSCSI.
>
> Cheers,
>
> Robert
>
>
>
> > -----Original Message-----
> > From: linux-poweredge-bounces at lists.us.dell.com
> > [mailto:linux-poweredge-bounces at lists.us.dell.com] On Behalf
> > Of Matt Bernstein
> > Sent: Friday, May 22, 2009 13:16
> > To: Peter Hopfgartner
> > Cc: Sergio Segala; linux-poweredge at lists.us.dell.com
> > Subject: Re: Random server reboot after update to CentOS 5.3
> >
> > On May 21 Peter Hopfgartner wrote:
> >
> > > We upgraded a Dell Poweredge PE 1950 Server the 8th of May. Since
> > > then the server rebooted 3 times without external cause (it is
> > > located in a server farm with redundant power supply etc.).
> > > Looking at the servers monitoring infrastructure with Dell's own
> > > OpenManage tools, I get strange errors:
> >
> > What's your MTU?
> >
> > I've seen the combination of bnx2 + 9000-byte MTU + CentOS
> > 5.3 panic the kernel. (PE 6950, R710, R805).
> >
> > Either dropping the MTU back to 1500 or dropping the kernel back to
> > CentOS 5.2 stops the panics happening.
> >
> > It might be related to
> > <https://bugzilla.redhat.com/show_bug.cgi?id=482747>. I don't know.
> >
> > If anyone else knows of a proper fix, I'm all ears.
> >
> > Matt
> >
> > _______________________________________________
> > Linux-PowerEdge mailing list
> > Linux-PowerEdge at lists.us.dell.com
> > https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> > Please read the FAQ at http://lists.us.dell.com/faq
> >