linux + Xen + bnx2 + bonding

G.Bakalarski at icm.edu.pl
Tue Mar 20 10:22:46 CDT 2012


Dear ALL.

We have strange problems using the bonding module on our farm of Dell R815s.

Hardware:

Two stacked Juniper EX4200 switches (from DELL ;) )
A bunch of R815s, each with two 1 Gbit ports connected (one to each physical switch enclosure)
The R815s have Broadcom 5709 NICs.


Software:

Xen 4.1.2
Linux with kernel 3.2.0 (2.6.32-5-amd64-xen from Debian also tested)
In dom0 we have a bonded interface, VLANs on top of it, and bridges on top of
the VLANs, with the virtual interfaces for the domUs attached to the bridges.

Example:
#> ip a

 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
16: vif-pub-dom1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master eth-pub state UP qlen 32
    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcff:ffff:feff:ffff/64 scope link
       valid_lft forever preferred_lft forever
17: vif-mon-dom1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master eth-mon state UP qlen 32
    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fcff:ffff:feff:ffff/64 scope link
       valid_lft forever preferred_lft forever
42: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::16fe:b5ff:feca:4ed5/64 scope link
       valid_lft forever preferred_lft forever
44: eth-pub: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
    inet6 2001:6a0:0:21::d0:20/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::16fe:b5ff:feca:4ed5/64 scope link
       valid_lft forever preferred_lft forever
45: bond0.21@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master eth-pub state UP
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::16fe:b5ff:feca:4ed5/64 scope link
       valid_lft forever preferred_lft forever
46: eth-mon: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
    inet6 2001:6a0:1021::2:2000/112 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::16fe:b5ff:feca:4ed5/64 scope link
       valid_lft forever preferred_lft forever
47: bond0.402@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master eth-mon state UP
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::16fe:b5ff:feca:4ed5/64 scope link
       valid_lft forever preferred_lft forever
48: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
49: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether 14:fe:b5:ca:4e:d5 brd ff:ff:ff:ff:ff:ff
50: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 14:fe:b5:ca:4e:d9 brd ff:ff:ff:ff:ff:ff
51: eth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 14:fe:b5:ca:4e:db brd ff:ff:ff:ff:ff:ff
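
The layering in the listing above (eth0/eth1 enslaved to bond0, VLAN devices on bond0, bridges holding a VLAN device plus a domU vif) can be read back mechanically from the header lines. The following is only a rough sketch, not a tool we actually use: the SAMPLE text is a subset of the output above with wrapped lines rejoined, and the helper names parse_links and members are made up for illustration.

```python
import re

# Header lines taken from the `ip a` output above (subset, wrapped lines rejoined).
SAMPLE = """\
16: vif-pub-dom1: <BROADCAST,MULTICAST,PROMISC,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master eth-pub state UP qlen 32
42: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
44: eth-pub: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
45: bond0.21@bond0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master eth-pub state UP
48: eth0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
49: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
"""

# "<index>: <name>[@<lower device>]: <FLAGS> rest-of-line"
HEADER = re.compile(
    r"^\s*\d+:\s+(?P<name>[^:@\s]+)(?:@(?P<lower>[^:\s]+))?:\s+<[^>]*>(?P<rest>.*)$"
)
MASTER = re.compile(r"\bmaster\s+(\S+)")


def parse_links(text):
    """Map each interface name to its lower device (VLAN parent) and master."""
    links = {}
    for line in text.splitlines():
        m = HEADER.match(line)
        if not m:
            continue  # address lines and other detail lines are ignored
        master = MASTER.search(m.group("rest"))
        links[m.group("name")] = {
            "lower": m.group("lower"),
            "master": master.group(1) if master else None,
        }
    return links


def members(links, master):
    """Return the sorted names of all interfaces attached to `master`."""
    return sorted(n for n, info in links.items() if info["master"] == master)


if __name__ == "__main__":
    links = parse_links(SAMPLE)
    for top in ("bond0", "eth-pub"):
        print(top, "<-", ", ".join(members(links, top)))
```

Run against the full output, this should list eth0 and eth1 under bond0, and one VLAN device plus one vif under each bridge.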


--------------------------------------------------------

bridges:
#> brctl show
bridge name	bridge id		STP enabled	interfaces
eth-mon		8000.14feb5ca4ed5	no		bond0.402
							vif-mon-dom1
eth-pub		8000.14feb5ca4ed5	no		bond0.21
							vif-pub-dom1

----------------------------------------------------------

bonding details:


#> cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: fast
Min links: 0
Aggregator selection policy (ad_select): stable
Active Aggregator Info:
	Aggregator ID: 3
	Number of ports: 2
	Actor Key: 17
	Partner Key: 21
	Partner Mac Address: 2c:21:72:9e:b0:80

Slave Interface: eth0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 14:fe:b5:ca:4e:d5
Aggregator ID: 3
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 14:fe:b5:ca:4e:d7
Aggregator ID: 3
Slave queue ID: 0
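
For watching many hosts, the state above can be checked by a small script. This is a minimal sketch under assumptions: it treats a slave as healthy only if its MII status is up, its aggregator ID matches the active aggregator, and its link failure count is 0. The function names parse_bond and unhealthy_slaves are hypothetical, and the SAMPLE text is a shortened copy of the output above.

```python
# Shortened copy of /proc/net/bonding/bond0 from the output above.
SAMPLE = """\
Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up

802.3ad info
Active Aggregator Info:
\tAggregator ID: 3
\tNumber of ports: 2

Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Aggregator ID: 3

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Aggregator ID: 3
"""


def parse_bond(text):
    """Split the file into bond-level keys and one dict per slave."""
    bond = {"slaves": {}}
    section = bond  # keys go here until the first "Slave Interface" line
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Slave Interface":
            section = bond["slaves"].setdefault(value, {})
        else:
            section[key] = value
    return bond


def unhealthy_slaves(bond):
    """Return slaves that are down, outside the active aggregator, or have failed before."""
    active = bond.get("Aggregator ID")
    bad = []
    for name, slave in bond["slaves"].items():
        if (
            slave.get("MII Status") != "up"
            or slave.get("Aggregator ID") != active
            or slave.get("Link Failure Count") != "0"
        ):
            bad.append(name)
    return bad


if __name__ == "__main__":
    # On a real host, read open("/proc/net/bonding/bond0").read() instead.
    print(unhealthy_slaves(parse_bond(SAMPLE)) or "all slaves look healthy")
```

For the state shown above the script reports nothing wrong, which matches what we see: the bond looks fine right up to the moment a slave drops.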
--------------------------------

The switch is configured with link aggregation + LACP.

---------------------------------------------------
In dmesg we can see:


[617897.820090] ------------[ cut here ]------------
[617897.820106] WARNING: at /mnt/linux-2.6-3.2.6/debian/build/source_amd64_none/net/sched/sch_generic.c:255 dev_watchdog+0xe9/0x148()
[617897.820111] Hardware name: PowerEdge R815
[617897.820115] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 0 timed out
[617897.820119] Modules linked in: bonding bnx2 xt_physdev xen_netback
xen_blkback ebt_ip ebt_ip6 ebtable_filter ebtables bridge xen_evtchn xenfs
dm_round_robin dm_multipath scsi_dh ipmi_si ipmi_devintf ipmi_msghandler 8021q
garp stp snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr psmouse
serio_raw evdev joydev sp5100_tco tpm_tis tpm dcdbas tpm_bios amd64_edac_mod
edac_core edac_mce_amd k10temp acpi_power_meter button processor thermal_sys
xfs dm_mod nfs lockd fscache auth_rpcgss nfs_acl sunrpc xt_tcpudp xt_state
ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables
ip6t_REJECT nf_conntrack_ipv6 nf_conntrack nf_defrag_ipv6 ip6table_filter
ip6_tables x_tables usbhid hid sg sr_mod cdrom sd_mod ses crc_t10dif enclosure
ahci libahci libata lpfc scsi_transport_fc scsi_tgt ohci_hcd ehci_hcd
megaraid_sas usbcore usb_common scsi_mod [last unloaded: bnx2]
[617897.820257] Pid: 0, comm: swapper/0 Tainted: G        W    3.2.0-1-amd64 #1
[617897.820261] Call Trace:
[617897.820264]  <IRQ>  [<ffffffff81046465>] ? warn_slowpath_common+0x78/0x8c
[617897.820283]  [<ffffffff81046511>] ? warn_slowpath_fmt+0x45/0x4a
[617897.820290]  [<ffffffff81291f15>] ? netif_tx_lock+0x40/0x72
[617897.820297]  [<ffffffff81292076>] ? dev_watchdog+0xe9/0x148
[617897.820305]  [<ffffffff81051af4>] ? run_timer_softirq+0x19a/0x261
[617897.820311]  [<ffffffff81291f8d>] ? netif_tx_unlock+0x46/0x46
[617897.820318]  [<ffffffff8104ba54>] ? __do_softirq+0xb9/0x177
[617897.820326]  [<ffffffff8120a0ab>] ? __xen_evtchn_do_upcall+0x1b5/0x1f2
[617897.820334]  [<ffffffff8133e8ec>] ? call_softirq+0x1c/0x30
[617897.820342]  [<ffffffff8100f875>] ? do_softirq+0x3c/0x7b
[617897.820348]  [<ffffffff8104bcbc>] ? irq_exit+0x3c/0x9a
[617897.820354]  [<ffffffff8120b675>] ? xen_evtchn_do_upcall+0x27/0x32
[617897.820360]  [<ffffffff8133e93e>] ? xen_do_hypervisor_callback+0x1e/0x30
[617897.820363]  <EOI>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[617897.820374]  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1000
[617897.820382]  [<ffffffff8100663a>] ? xen_safe_halt+0xc/0x13
[617897.820389]  [<ffffffff81014448>] ? default_idle+0x47/0x7f
[617897.820395]  [<ffffffff8100d25f>] ? cpu_idle+0xaf/0xf2
[617897.820402]  [<ffffffff81687b38>] ? start_kernel+0x3b8/0x3c3
[617897.820408]  [<ffffffff8168963b>] ? xen_start_kernel+0x586/0x58c
[617897.820412] ---[ end trace a7919e7f17c0a757 ]---


and


[617897.820427] bnx2 0000:01:00.1: eth1: DEBUG: intr_sem[0] PCI_CMD[00180006]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: PCI_PM[19002008] PCI_MISC_CFG[92000088]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: EMAC_TX_STATUS[00000008] EMAC_RX_STATUS[00000000]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: RPM_MGMT_PKT_CTRL[40000088]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: HC_STATS_INTERRUPT_STATUS[01fe0001]
[617897.824071] bnx2 0000:01:00.1: eth1: <--- start MCP states dump --->
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: MCP_STATE_P0[0003610e] MCP_STATE_P1[0003610e]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: MCP mode[0000b880] state[80000000] evt_mask[00000500]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: pc[0800c6c8] pc[0800d7d4] instr[ac620038]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: shmem states:
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: drv_mb[0103000f] fw_mb[0000000f] link_status[0000006f] drv_pulse_mb[00001d40]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: dev_info_signature[44564903] reset_type[01005254] condition[0003610e]
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: 000003cc: 44444444 44444444 44444444 00000a28
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: 000003dc: 000cffff 00000000 ffff0000 00000000
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: 000003ec: 00000000 00000000 00000000 00000000
[617897.824071] bnx2 0000:01:00.1: eth1: DEBUG: 0x3fc[0000ffff]
[617897.824071] bnx2 0000:01:00.1: eth1: <--- end MCP states dump --->
[617898.236955] bnx2 0000:01:00.1: eth1: NIC Copper Link is Down
[617898.268155] bonding: bond0: link status definitely down for interface eth1, disabling it
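
To correlate these events across hosts, the three interesting messages (TX watchdog timeout, NIC link down, bonding disabling the slave) can be pulled out of dmesg with a few regular expressions. This is an illustrative sketch only: the patterns are written against the exact message texts quoted above and may not match other kernel or driver versions, and the names PATTERNS and scan are invented here.

```python
import re

# The three key lines from the dmesg excerpts above (wrapped lines rejoined).
SAMPLE = """\
[617897.820115] NETDEV WATCHDOG: eth1 (bnx2): transmit queue 0 timed out
[617898.236955] bnx2 0000:01:00.1: eth1: NIC Copper Link is Down
[617898.268155] bonding: bond0: link status definitely down for interface eth1, disabling it
"""

LINE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")

PATTERNS = {
    "tx_timeout": re.compile(
        r"NETDEV WATCHDOG: (?P<dev>\S+) \([^)]+\): transmit queue \d+ timed out"
    ),
    "link_down": re.compile(r"(?P<dev>\S+): NIC Copper Link is Down"),
    "bond_disable": re.compile(
        r"link status definitely down for interface (?P<dev>\S+), disabling it"
    ),
}


def scan(text):
    """Return (timestamp, kind, device) tuples for every recognised message."""
    events = []
    for line in text.splitlines():
        m = LINE.match(line)
        if not m:
            continue  # continuation lines without a timestamp are skipped
        for kind, pattern in PATTERNS.items():
            hit = pattern.search(m.group("msg"))
            if hit:
                events.append((float(m.group("ts")), kind, hit.group("dev")))
    return events


if __name__ == "__main__":
    for ts, kind, dev in scan(SAMPLE):
        print(f"{ts:.6f} {dev} {kind}")
```

In the excerpt above the link-down message follows the watchdog timeout by roughly 0.4 s, so on this host the sequence is timeout first, then link loss, then the bond dropping eth1.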


The network is very unstable. The symptoms range from timeouts and IPv6
multicast (neighbour discovery) traffic apparently being filtered or dropped
(some hosts do not respond to neighbour solicitations), through one VLAN or
the other not responding, up to both physical interfaces going down completely.

The time to failure also varies: from an immediate "no connection", through
problems after 1.5 hours, up to strange behaviour after 5 days.

Without bonding (i.e. with only eth0 up) the network was much more stable:
1-2 months without problems. However, the same messages in dmesg were also
present without bonding.

bnx2 version:
2.1.11

The servers are not heavily loaded and do not have high network throughput.

ANY HELP HIGHLY APPRECIATED!

GB


