Channel bonding problems with 1950s and 2748 switch
Thomas_Chenault at Dell.com
Thu Jan 31 19:28:15 CST 2008
> It's my understanding that bonding modes 0, 1, and 2 require link
> aggregation on the switch
Modes 0 (balance-rr) and 2 (balance-xor) require link aggregation on the
switch. Link aggregation should not be configured on the switch for mode
1 (active-backup), mode 5 (balance-tlb), or mode 6 (balance-alb).
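The switch requirements above can be written down as a small lookup table. This is only a sketch: the entries for modes 3 (broadcast) and 4 (802.3ad) are not discussed in this reply and are filled in from the kernel's bonding.txt, and the names SWITCH_CONFIG and switch_config_for are made up for illustration.

```python
# Switch-side configuration expected by each Linux bonding mode.
# Modes 0, 1, 2, 5 and 6 follow the reply above; modes 3 and 4 are
# taken from bonding.txt and are not part of the original message.
SWITCH_CONFIG = {
    0: ("balance-rr", "static link aggregation"),
    1: ("active-backup", "none"),
    2: ("balance-xor", "static link aggregation"),
    3: ("broadcast", "static link aggregation"),
    4: ("802.3ad", "LACP (802.3ad) aggregation"),
    5: ("balance-tlb", "none"),
    6: ("balance-alb", "none"),
}


def switch_config_for(mode: int) -> str:
    """Return a one-line summary of what the switch needs for a mode."""
    name, config = SWITCH_CONFIG[mode]
    return f"mode {mode} ({name}): switch config = {config}"


if __name__ == "__main__":
    for mode in sorted(SWITCH_CONFIG):
        print(switch_config_for(mode))
```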
> I never seem to get more than the throughput a single card would
> provide
This is normally attributable to the testing procedure used and the way
most of the bonding modes work. In balance-xor (with the traditional
hashing policy), balance-alb, and balance-tlb, a single slave interface
is selected to communicate with any given host. This means that the data
rate through the bond to any single host will never exceed the capacity
of a single interface. If testing is conducted such that the bond is
used to communicate with multiple hosts simultaneously, it can be shown
that the total throughput of the bond exceeds the capacity of a single
slave.
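A rough illustration of why one peer always lands on one slave: bonding.txt describes the traditional (layer2) policy as (source MAC XOR destination MAC) modulo slave count. The sketch below simplifies this to the last byte of each MAC address, which is an assumption on my part, and the MAC addresses and the function name layer2_slave are hypothetical.

```python
def layer2_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Pick a slave index from the last byte of each MAC address.

    Simplified model of the layer2 hash: (src XOR dst) % slave_count.
    """
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % slave_count


if __name__ == "__main__":
    local = "02:00:00:00:00:01"  # hypothetical bond MAC
    peers = ["02:00:00:00:00:10", "02:00:00:00:00:11"]  # hypothetical hosts
    for peer in peers:
        # Every frame to the same peer hashes to the same slave, so one
        # peer can never use more than one link's worth of bandwidth.
        print(peer, "-> slave", layer2_slave(local, peer, 2))
```

With two or more peers the hash values differ, which is why aggregate throughput only shows up when testing against several hosts at once.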
The balance-rr mode can exceed the capacity of a single interface in
communication with a single host. In many applications this potential is
never truly realized. The problem is that TCP expects segments to arrive
in order and this rarely happens when they are being transmitted from
multiple interfaces. The outbound data rate of the bond may be higher,
but often this is due to retransmissions rather than useful
communications.
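The reordering effect can be shown with a toy model. This is a simplified sketch, not a model of the bonding driver or of TCP: packets are striped across links round-robin, each link adds a fixed delay (the delay values are invented), and we count how many packets arrive after a later-numbered packet has already been received.

```python
def count_reordered(num_packets: int, link_delays: tuple) -> int:
    """Count packets that arrive after a higher-numbered packet.

    Packet i is sent at time i on link (i % number of links) and
    arrives at time i + that link's delay.
    """
    arrivals = []
    for seq in range(num_packets):
        delay = link_delays[seq % len(link_delays)]
        arrivals.append((seq + delay, seq))
    arrivals.sort()  # order in which the receiver sees the packets

    reordered = 0
    highest_seen = -1
    for _, seq in arrivals:
        if seq < highest_seen:
            reordered += 1  # a later packet overtook this one
        else:
            highest_seen = seq
    return reordered


if __name__ == "__main__":
    print("equal delays:  ", count_reordered(8, (0.0, 0.0)))
    print("unequal delays:", count_reordered(8, (0.0, 2.5)))
```

Even a small difference in per-link latency is enough to make packets overtake each other, and a TCP receiver answers each such event with duplicate ACKs that can trigger retransmission.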
In recent versions of the bonding driver, 2.6.3 and later, an
xmit_hash_policy option exists. The purpose of this option is to allow
balance-xor mode to exceed the capacity of a single slave in
communication with a single host. This is effective only if the traffic
can be broken into two or more separate streams, for example separate
TCP connections when the layer3+4 policy is used.
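As a sketch of how a port-aware policy spreads streams, the following follows the layer3+4 formula as I recall it from bonding.txt of this era: ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xffff)) modulo slave count. Treat the exact formula as an assumption and check it against your kernel's documentation; the addresses, ports and the function name layer34_slave are hypothetical.

```python
import ipaddress


def layer34_slave(src_ip: str, dst_ip: str,
                  src_port: int, dst_port: int, slave_count: int) -> int:
    """Pick a slave index from IP addresses and TCP/UDP ports."""
    ip_bits = (int(ipaddress.IPv4Address(src_ip))
               ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_bits) % slave_count


if __name__ == "__main__":
    # Two connections between the same pair of hosts, differing only in
    # source port, can hash to different slaves.
    for sport in (40000, 40001):
        slave = layer34_slave("10.0.0.1", "10.0.0.2", sport, 5001, 2)
        print(f"source port {sport} -> slave {slave}")
```

A single connection still maps to a single slave, so a one-stream benchmark will not show any gain from this option.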
The bonding.txt file included in the Linux kernel sources and with most
distributions provides more detailed information about each of the
bonding modes, their uses, and limitations.
Thomas
-----Original Message-----
From: linux-poweredge-bounces at dell.com
[mailto:linux-poweredge-bounces at dell.com] On Behalf Of Stephen Childs
Sent: Wednesday, January 30, 2008 2:50 AM
To: linux-poweredge-Lists
Subject: Channel bonding problems with 1950s and 2748 switch
Hi,
I have spent a few fruitless days trying to get channel bonding to work
on our new cluster. The machines are the latest revision PE 1950s and
the switch is the Dell PowerConnect 2748 switch. The network driver in
the 1950s is as follows:
eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz
I have tried every channel bonding mode by now I think, both with link
aggregation groups set up on the switch and without. I never seem to get
more than the throughput a single card would provide (~940 Mbps or 990
with jumbo frames and other optimisations). When I look at the output of
netstat, I see that the transmit traffic seems to be evenly distributed,
but the receive traffic is mainly being processed by one card.
It's my understanding that bonding modes 0, 1, and 2 require link
aggregation on the switch, the 2748 doesn't seem to support the LACP
protocol needed for mode 4, and modes 5 and 6 shouldn't require switch
support.
As a sanity check I connected the two machines with two crossover cables
and in this configuration I was able to get ~1.96 Gbps using the
round-robin mode. (However, balance-alb didn't seem to provide any
increase in throughput.)
Does anyone else have experience with this switch and NIC? Any tips?
Stephen
--
Dr. Stephen Childs,
Research Fellow, EGEE Project,    phone: +353-1-8961797
Computer Architecture Group,      email: Stephen.Childs @ cs.tcd.ie
Trinity College Dublin, Ireland   web:   http://www.cs.tcd.ie/Stephen.Childs
_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge at dell.com
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq