So. Had an interesting problem happen. On a CentOS 5.5 system (kernel 2.6.18-194.26.1.el5), we tried to set up a three-NIC LACP (aka 802.3ad) bonded interface, bond1. The configuration parameters were set to what appear to be the proper values, and yet things weren't working: the wrong bonding (etherchannel) mode was being picked. Relevant configuration:
$ cat /etc/modprobe.conf
alias eth0 igb
alias eth1 igb
alias eth2 igb
alias eth3 igb
alias scsi_hostadapter megaraid_sas
alias bond1 bonding
options bond1 mode=802.3ad miimon=100
alias bond0 bonding
options bond0 miimon=100 mode=1 max_bonds=2
options ib_ipoib send_queue_size=4096 recv_queue_size=4096
alias ib1 ib_ipoib
alias ib0 ib_ipoib

$ cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
IPADDR=XXX.XXX.XXX.XXX
NETMASK=255.255.252.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

$ cat /etc/sysconfig/network-scripts/ifcfg-eth[1-3]
# Intel Corporation 82576 Gigabit Network Connection
DEVICE=eth1
HWADDR=12:34:56:78:9A:BC
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
# Intel Corporation 82576 Gigabit Network Connection
DEVICE=eth2
HWADDR=12:34:56:78:9A:BC
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
# Intel Corporation 82576 Gigabit Network Connection
DEVICE=eth3
HWADDR=12:34:56:78:9A:BC
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
No errors of note were being logged in /var/log/messages, but on the switch, we were seeing this:
sh etherchannel summary | include Po57
57 Po57(SD) LACP Gi9/24(I) Gi9/25(I) Gi9/26(I)
This indicates that the port channel is down (the D flag) and the constituent interfaces are running in independent, stand-alone mode (the I flag) rather than bundled into the channel. Quite confusing.
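For anyone who doesn't have the flag letters memorized, here is a minimal sketch that spells them out. It assumes the flag legend that Cisco IOS prints at the top of `show etherchannel summary`; the function names `describe_flag` and `decode_summary_line` are made up for illustration.

```shell
#!/bin/sh
# Translate one flag letter from the "show etherchannel summary" legend.
describe_flag() {
    case "$1" in
        D) echo "down" ;;
        U) echo "in use" ;;
        S) echo "Layer2" ;;
        R) echo "Layer3" ;;
        P) echo "bundled in port-channel" ;;
        I) echo "stand-alone" ;;
        s) echo "suspended" ;;
        H) echo "hot-standby (LACP only)" ;;
        *) echo "unknown" ;;
    esac
}

# For every "Name(FLAGS)" token on a summary line, print the name
# followed by its flags in words. Tokens without flags are skipped.
decode_summary_line() {
    for token in $1; do
        case "$token" in
            *\(*\))
                name=${token%%\(*}      # text before the first "("
                flags=${token#*\(}      # text after the first "("
                flags=${flags%\)}       # drop the trailing ")"
                desc=""
                while [ -n "$flags" ]; do
                    rest=${flags#?}             # everything but the first letter
                    flag=${flags%"$rest"}       # the first letter
                    desc="${desc:+$desc, }$(describe_flag "$flag")"
                    flags=$rest
                done
                echo "$name: $desc"
                ;;
        esac
    done
}
```

Feeding it the line above, `decode_summary_line "57 Po57(SD) LACP Gi9/24(I) Gi9/25(I) Gi9/26(I)"`, reports Po57 as "Layer2, down" and each Gi9/2x port as "stand-alone".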
What’s more, on the system itself:
$ cat /sys/class/net/bond1/bonding/mode
active-backup 1
This is wrong given the settings (bond1 is configured for mode=802.3ad), but there is no clear reason why.
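This kind of mismatch is easy to miss, so it may be worth checking for it explicitly. Below is a rough sketch that compares the mode requested on a bond's `options` line with what sysfs reports. The function names are hypothetical, it only understands the one-line-per-entry `modprobe.conf` layout shown earlier, and the name-to-number mapping follows the kernel bonding documentation.

```shell
#!/bin/sh
# Map a bonding mode given as a name or a number to its numeric form.
mode_to_number() {
    case "$1" in
        0|balance-rr)    echo 0 ;;
        1|active-backup) echo 1 ;;
        2|balance-xor)   echo 2 ;;
        3|broadcast)     echo 3 ;;
        4|802.3ad)       echo 4 ;;
        5|balance-tlb)   echo 5 ;;
        6|balance-alb)   echo 6 ;;
        *)               echo unknown ;;
    esac
}

# Print the mode= value from the "options <bond> ..." line.
# $1 = bond name, $2 = modprobe-style config file
configured_mode() {
    awk -v bond="$1" '$1 == "options" && $2 == bond {
        for (i = 3; i <= NF; i++)
            if ($i ~ /^mode=/) { sub(/^mode=/, "", $i); print $i }
    }' "$2"
}

# Print the numeric mode the kernel reports; sysfs shows "name number".
# $1 = path to a bonding/mode file
running_mode() {
    awk '{ print $2 }' "$1"
}

# Compare configured and running modes; returns non-zero on a mismatch.
# $1 = bond name, $2 = config file, $3 = sysfs mode file
check_bond_mode() {
    want=$(mode_to_number "$(configured_mode "$1" "$2")")
    have=$(running_mode "$3")
    if [ "$want" = "$have" ]; then
        echo "$1: OK (mode $have)"
    else
        echo "$1: MISMATCH (configured $want, running $have)"
        return 1
    fi
}
```

On this system it would be called as `check_bond_mode bond1 /etc/modprobe.conf /sys/class/net/bond1/bonding/mode`.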
Further debugging is required, but in the interim, the following worked:
ifdown bond1
echo 4 > /sys/class/net/bond1/bonding/mode
ifup bond1
$ cat /sys/class/net/bond1/bonding/mode
802.3ad 4

And on the switch level:

sh etherchannel summary | include Po57
57 Po57(SU) LACP Gi9/24(P) Gi9/25(P) Gi9/26(P)
Presto-chango, we now have a working LACP port channel. Still, something odd is afoot.
One theory is that /etc/modprobe.conf also contains a definition for bond0 (an InfiniBand pair), which is set up for active-backup (aka 'mode=1'). Perhaps the startup/init scripts are doing something weird and mistakenly picking up the bond0 settings for bond1? This is pure conjecture and needs to be confirmed by tracing through the scripts.
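If that theory holds, one thing that might be worth trying is moving the per-bond options out of /etc/modprobe.conf and into the interface file itself, using the `BONDING_OPTS` variable that Red Hat's channel bonding documentation describes. This is an untested sketch: it assumes the initscripts on this CentOS 5.5 box honor `BONDING_OPTS`, and the only change from the ifcfg-bond1 shown earlier is the last line.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond1
# Sketch only: BONDING_OPTS is the one added line, and it is assumed
# (not verified here) that these initscripts apply it per bond.
DEVICE=bond1
IPADDR=XXX.XXX.XXX.XXX
NETMASK=255.255.252.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="mode=802.3ad miimon=100"
```

With this in place the `options bond1 ...` line in /etc/modprobe.conf would presumably be dropped, so that the mode is stated in only one place.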