RHEL 5.3 & Multipathing question
emsearcy at gmail.com
Thu Feb 18 16:24:05 CST 2010
On Feb 17, 2010, at 4:13 AM, Brian O'Mahony wrote:
> I am setting up a RHEL5.3 machine connected to an iSCSI SAN (PS6000). The machine is a PE2850. I have two onboard network ports and an Intel NIC with two ports. One from each is connected to our lan, and the other on each is connected to our SAN network, which is segregated from everything else.
> This is my first time setting up access for something with fault tolerance (previously it was all just for testbeds etc). I *had* originally set up the two SAN nics as a bond with the failover set to active-passive.
> As I was reading more documentation, I came across multipathing, and I am wondering if it is needed in my case. The machine is going to be the only machine connected to the LUN presented by the PS6000. The LUN is 500Gb, and this will be chopped down further using the OS (either ext3 or ext4) into 10x50Gb logical volumes.
As near as I can tell, bonding and dm-multipath don't go together. With dm-multipath you need at least two devices (paths), and with active-backup bonding I don't think you will have two distinct devices. If you've configured your client drivers to scan for volumes on, say, bond0, you should only "see" one device, in my experience.
> Is multipath really needed and/or necessary in this case? Why?
So, in the case where you're using bonding, I'd say it's either/or: the bond presents a single path, so dm-multipath has nothing to choose between. As for whether you'd be better off using dm-multipath *instead*, one point is that even the bonding modes that provide load balancing (like mode=5, balance-tlb) usually can't split traffic destined for the same IP. I suppose you might be able to share different volumes on different IPs/MACs from your iSCSI server, though?
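To illustrate why a load-balancing bond tends to pin all traffic for one peer to one slave, here is a minimal Python sketch. It assumes a simplified version of the layer2 transmit hash that the kernel bonding documentation describes for balance-xor and 802.3ad (source MAC XOR destination MAC, modulo slave count). mode=5 balances per peer in a different way, but the effect on a single target is the same. The MAC addresses and the function name pick_slave are made up for the example.

```python
def pick_slave(src_mac: str, dst_mac: str, slave_count: int) -> int:
    """Simplified layer2-style hash: XOR the last byte of each MAC,
    then take the result modulo the number of slaves."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % slave_count


BOND_MAC = "00:11:22:33:44:55"    # hypothetical MAC of bond0
TARGET_A = "00:aa:bb:cc:dd:01"    # hypothetical iSCSI target port
TARGET_B = "00:aa:bb:cc:dd:02"    # hypothetical second target port

# Every frame sent to the same target hashes to the same slave,
# no matter how many frames are sent.
slaves_for_a = {pick_slave(BOND_MAC, TARGET_A, 2) for _ in range(1000)}
slave_for_b = pick_slave(BOND_MAC, TARGET_B, 2)

print("slaves used for target A:", slaves_for_a)
print("slave used for target B:", slave_for_b)
```

With these example addresses, target A always lands on one slave and target B on the other, which is the "different volumes on different IPs/MACs" case: the load is only spread when there is more than one peer.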
In terms of failover, I don't actually know how fault-tolerant bonding is when used with iSCSI. I think it would depend on what you set your miimon interval to, and on whether TCP would ensure reliable delivery of the iSCSI traffic across an outage of that length. By that point the gratuitous ARP should have updated the layer 2 forwarding (MAC) table in the switch and the ARP table on the iSCSI server, and communication would resume.
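As a rough way to reason about that window, the Python sketch below reads the miimon and down-delay values out of the text that /proc/net/bonding/bond0 reports and adds them up as a worst-case link-failure detection time. The sample text is illustrative, not taken from a real machine, and the 120-second figure is an assumption: I believe it is the open-iscsi default for node.session.timeo.replacement_timeout, but check your own iscsid.conf.

```python
import re

# Illustrative excerpt in the format of /proc/net/bonding/bond0.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
"""


def bond_timing(status_text):
    """Extract mode, active slave, miimon and down delay from the status text."""
    def field(name):
        pattern = r"^" + re.escape(name) + r":\s*(.+)$"
        match = re.search(pattern, status_text, re.MULTILINE)
        return match.group(1).strip() if match else None

    return {
        "mode": field("Bonding Mode"),
        "active": field("Currently Active Slave"),
        "miimon_ms": int(field("MII Polling Interval (ms)") or 0),
        "downdelay_ms": int(field("Down Delay (ms)") or 0),
    }


def worst_case_detect_ms(info):
    """A link that drops just after a poll goes unnoticed for up to one
    miimon interval; downdelay is then added before the slave is disabled.
    A miimon of 0 means MII monitoring is off, so this estimate is meaningless."""
    return info["miimon_ms"] + info["downdelay_ms"]


REPLACEMENT_TIMEOUT_S = 120  # assumed iSCSI session replacement timeout

info = bond_timing(SAMPLE)
detect_ms = worst_case_detect_ms(info)
print("mode:", info["mode"], "| active slave:", info["active"])
print("worst-case detection (ms):", detect_ms)
print("within iSCSI timeout:", detect_ms < REPLACEMENT_TIMEOUT_S * 1000)
```

On a live system you would pass in open("/proc/net/bonding/bond0").read() instead of SAMPLE. The estimate covers only link detection on the host; it says nothing about how long the switch takes to relearn the MAC address.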
At one point I was using mode=1 bonding with AoE, but I haven't run iSCSI over bonding. Since AoE runs directly on Ethernet, below the IP layer, I'd assume that iSCSI would work even better with mode=1 bonding, since TCP/IP should guarantee delivery (though the AoE driver might have been doing its own error detection/correction for lost packets). But I thought I'd throw in my 2c anyhow, since the last email seemed to be more about why you'd want redundancy/load-balancing rather than addressing the active-passive bonding comment.
More information about the Linux-PowerEdge