Returning Network stability problems on R710 servers and BCM5709

Big Wave Dave bigwavedave at gmail.com
Tue Feb 9 15:25:02 CST 2010


We had to do the same thing on R410s.  We ran into a situation where
one of the interfaces would stop working and ethtool would show
"no link".  Restarting the network service didn't fix anything; only a
reboot resolved the problem.  Moving to the Broadcom-provided driver
has solved this for us.
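
For anyone who wants to catch this state from a monitoring script, here is
a minimal sketch in Python. It assumes the usual "Link detected: yes/no"
line that `ethtool <iface>` prints; the exact output seen on the R410s is
not quoted in this thread, so the sample text and the function name
link_detected are hypothetical and only there for illustration.

```python
import re
from typing import Optional


def link_detected(ethtool_output: str) -> Optional[bool]:
    """Return True/False from the 'Link detected:' line, or None if it is absent."""
    match = re.search(
        r"^\s*Link detected:\s*(yes|no)\s*$",
        ethtool_output,
        re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return None
    return match.group(1).lower() == "yes"


# Hypothetical sample, not captured from the affected hosts.
SAMPLE = "Settings for eth0:\n\tLink detected: no\n"

print(link_detected(SAMPLE))  # prints False
```

In practice the text would come from running `ethtool eth0` on the host. A
None result means the line was missing, which is worth treating separately
from a real "no".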

Dave

On Wed, Feb 3, 2010 at 6:30 AM, Carlson, Timothy S
<Timothy.Carlson at pnl.gov> wrote:
> I've moved away from the RHEL/CentOS driver and have gone directly to the bnx2 driver from Broadcom.
>
> dmesg | grep bnx
> Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v1.9.20b (July 9, 2009)
> bnx2: eth0: using MSI
>
> That driver seems stable for me. I was seeing things similar to your problem, and this driver fixed them right up.
>
> http://www.broadcom.com/support/ethernet_nic/driver-sla.php?driver=NX2-Linux
>
> You'll need to download that driver and rebuild it from the SRPM. You'll also need to rebuild the driver for each kernel update, which is a pain.
>
> Tim
>
> -----Original Message-----
> From: linux-poweredge-bounces at dell.com [mailto:linux-poweredge-bounces at dell.com] On Behalf Of James Sparenberg
> Sent: Wednesday, February 03, 2010 1:08 AM
> To: linux-poweredge at dell.com
> Subject: Returning Network stability problems on R710 servers and BCM5709
>
> All,
>
>   I'm referencing an earlier thread from last Sept.
>
> http://lists.us.dell.com/pipermail/linux-poweredge/2009-September/040252.html
>
>   In it there was a discussion of stability problems with the Broadcom BCM5709 on a Dell R610, where new connections would lose connectivity while existing connections, or all connections over a different protocol, still passed.
>
> For example, just now I lost the ability to ping eth0 or get NIS authentication on that IP, and I also lost the ability to get TFTP connections via the eth1 address.  However, at the same time DHCP is running against eth1, and SNMP, NTP, and HTTP over port 10000 (webmin) were merrily working quite well on eth0.
>
> OS CentOS 5.4
>
> kernel   2.6.18-164.10.1.el5 SMP x86_64
> Kernel module bnx2
>
> modinfo output
>
> filename:       /lib/modules/2.6.18-164.10.1.el5.centos.plus/kernel/drivers/net/bnx2.ko
> version:        1.9.3
> license:        GPL
> description:    Broadcom NetXtreme II BCM5706/5708/5709/5716 Driver
> author:         Michael Chan <mchan at broadcom.com>
> srcversion:     1040A42F87B8BE8A019736C
> alias:          pci:v000014E4d0000163Csv*sd*bc*sc*i*
> alias:          pci:v000014E4d0000163Bsv*sd*bc*sc*i*
> alias:          pci:v000014E4d0000163Asv*sd*bc*sc*i*
> alias:          pci:v000014E4d00001639sv*sd*bc*sc*i*
> alias:          pci:v000014E4d000016ACsv*sd*bc*sc*i*
> alias:          pci:v000014E4d000016AAsv*sd*bc*sc*i*
> alias:          pci:v000014E4d000016AAsv0000103Csd00003102bc*sc*i*
> alias:          pci:v000014E4d0000164Csv*sd*bc*sc*i*
> alias:          pci:v000014E4d0000164Asv*sd*bc*sc*i*
> alias:          pci:v000014E4d0000164Asv0000103Csd00003106bc*sc*i*
> alias:          pci:v000014E4d0000164Asv0000103Csd00003101bc*sc*i*
> depends:
> vermagic:       2.6.18-164.10.1.el5.centos.plus SMP mod_unload gcc-4.1
> parm:           disable_msi:Disable Message Signaled Interrupt (MSI) (int)
> parm:           enable_entropy:Allow bnx2 to populate the /dev/random entropy pool (int)
> module_sig:     883f3504b47af9bd3b84a368dd51f2112b6b90a0ed1bac15e1b94720602336594dc65775db83c460991575cc8694cf9c03aca6e623e0950281e5094
>
> So you can see that the version I have exceeds the version said to be stable in the prior thread.  BTW this chassis is about 1 month old, so it should have the latest BIOS (though I haven't verified that).
>
> The ironic part: the same model running the same version/kernel of CentOS (kickstart install, so all my boxes are the same) is running load tests in our lab, pushing millions of sessions and billions of packets (soaking four 1G NICs) without a hitch while testing out equipment, yet this box, which has relatively low throughput, is the one that locks up.
>
> Any thoughts or suggestions would be appreciated.  So far there is nothing in the normal logs, so I'm going to turn on some additional logging.
>
> James Sparenberg
>
> _______________________________________________
> Linux-PowerEdge mailing list
> Linux-PowerEdge at dell.com
> https://lists.us.dell.com/mailman/listinfo/linux-poweredge
> Please read the FAQ at http://lists.us.dell.com/faq


