Issue with Multiple balance-tlb Bonds



cjhsa
31-Jan-2012, 18:06
Hello,

I have several SLES 11 SP1 servers that require multiple bonded
interfaces. In this example, eth0 and eth2 form bond0, and eth1 and eth3
form bond1. The hardware is HP DL380 G7, using the onboard ports. There
are two servers: one uses balance-tlb bonding on both bonds, the other
uses balance-tlb on bond0 and active-backup on bond1.

FYI:

# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.20.210.0     0.0.0.0         255.255.255.0   U         0 0          0 bond0
10.20.200.0     0.0.0.0         255.255.255.0   U         0 0          0 bond1
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         10.20.210.1     0.0.0.0         UG        0 0          0 bond0

# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)

Bonding Mode: transmit load balancing
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 50
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 44:1e:a1:02:18:f8

Slave Interface: eth3
MII Status: up
Link Failure Count: 0
Permanent HW addr: 44:1e:a1:02:18:fc
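
To compare bond0 and bond1 side by side, the status file can be condensed to one line per bond. This is only a minimal sketch, assuming the file layout shown above; the function name `bond_summary` is made up for illustration.

```shell
#!/bin/sh
# bond_summary: hypothetical helper, not part of any bonding tool.
# Prints the mode, the active slave and the slave list found in a
# /proc/net/bonding/<bond> style file (layout as in the output above).
bond_summary() {
    awk -F': ' '
        /^Bonding Mode:/           { mode = $2 }
        /^Currently Active Slave:/ { active = $2 }
        /^Slave Interface:/        { slaves = slaves (slaves ? "," : "") $2 }
        END { printf "mode=%s active=%s slaves=%s\n", mode, active, slaves }
    ' "$1"
}
```

Called as `bond_summary /proc/net/bonding/bond1` on the output above, it should print `mode=transmit load balancing active=eth1 slaves=eth1,eth3`. Running it for both bonds makes any difference in mode or active slave easy to spot.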


The problem I'm seeing is that ping over bond1 drops packets when the
bond uses balance-tlb, and it's reproducible:

249 packets transmitted, 239 received, 4% packet loss, time 248005ms
rtt min/avg/max/mdev = 0.163/0.567/44.163/2.836 ms
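
The summary line only gives a total (249 - 239 = 10 lost). To see which probes went missing, the per-packet lines can be scanned for gaps in the sequence numbers. This is a rough sketch, assuming iputils-style reply lines that contain `bytes from` and `icmp_seq=N` with numbering starting at 1; the helper name `missing_seqs` is made up for illustration.

```shell
#!/bin/sh
# missing_seqs: hypothetical helper. Reads ping output on stdin and prints,
# one per line, every icmp_seq between 1 and $1 that never got a reply.
missing_seqs() {
    awk -v sent="$1" '
        # Only count real echo replies, not error lines that mention icmp_seq.
        /bytes from/ && match($0, /icmp_seq=[0-9]+/) {
            seen[substr($0, RSTART + 9, RLENGTH - 9) + 0] = 1
        }
        END {
            for (i = 1; i <= sent; i++)
                if (!(i in seen)) print i
        }
    '
}
```

Usage would be along the lines of `ping -c 249 HOST | missing_seqs 249`, where HOST stands for a peer on the bond1 network. If the losses really cluster at the start of the run, the printed numbers should be small and consecutive.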

When it occurs, it always drops 10 packets very near the beginning of
the test. It is reproducible IF you wait a while before you test again.

Oddly, I don't see any similar behavior on bond0, which uses
balance-tlb on both servers, nor do I have any issue if I change bond1
to use active-backup bonding.

Any thoughts are greatly appreciated.


--
cjhsa
------------------------------------------------------------------------
cjhsa's Profile: http://forums.novell.com/member.php?userid=94268
View this thread: http://forums.novell.com/showthread.php?t=451588

cjhsa
01-Feb-2012, 14:56
Anyone? Could this be a routing problem? bond0 carries the default
route, while bond1 is supposed to serve only the 200 VLAN. What could
cause the second bond to always drop 10 packets on a ping test after an
extended period of inactivity?
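
One way to rule routing in or out is to check which interface the routing table selects for a given destination. The sketch below applies longest-prefix matching to `netstat -rn` style rows, assuming one route per line with Destination in column 1, Genmask in column 3 and Iface in the last column, and contiguous netmasks. The name `route_iface` is made up for illustration; on a live host, `ip route get ADDRESS` asks the kernel directly.

```shell
#!/bin/sh
# route_iface: hypothetical helper. Reads routing rows on stdin and prints
# the interface of the most specific route that matches the address in $1.
route_iface() {
    awk -v dst="$1" '
        function ip2int(ip,   o) {
            split(ip, o, ".")
            return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
        }
        # Network part of ip under mask, octet by octet (contiguous masks only).
        function net(ip, mask,   a, m, i, r) {
            split(ip, a, "."); split(mask, m, ".")
            r = 0
            for (i = 1; i <= 4; i++)
                r = r * 256 + (a[i] - a[i] % (256 - m[i]))
            return r
        }
        # Skip header lines; keep the matching row with the largest mask.
        $1 ~ /^[0-9]+\./ && net(dst, $3) == ip2int($1) {
            if (best == "" || ip2int($3) > bestmask) {
                bestmask = ip2int($3)
                best = $NF
            }
        }
        END { if (best != "") print best }
    '
}
```

Fed the four routes from the first post, an address in 10.20.200.0/24 should resolve to bond1 and anything outside the two local subnets to bond0 via the default route. If that holds on the affected server, the routing table itself is unlikely to explain the drops.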


