So. Had an interesting problem happen. On a CentOS 5.5 system (kernel 2.6.18-194.26.1.el5), we tried to set up a 3-way LACP (aka 802.3ad) bonded NIC interface (bond1). The configuration parameters were set to what appear to be the proper values, and yet things weren’t working: the wrong bonding (etherchannel) mode was being picked. Relevant configuration(s):

$ cat /etc/modprobe.conf
alias eth0 igb
alias eth1 igb
alias eth2 igb
alias eth3 igb
alias scsi_hostadapter megaraid_sas
alias bond1 bonding
options bond1 mode=802.3ad miimon=100
alias bond0 bonding
options bond0 miimon=100 mode=1 max_bonds=2
options ib_ipoib send_queue_size=4096 recv_queue_size=4096
alias ib1 ib_ipoib
alias ib0 ib_ipoib

$ cat /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
IPADDR=XXX.XXX.XXX.XXX
NETMASK=255.255.252.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

$ cat /etc/sysconfig/network-scripts/ifcfg-eth[1-3]

# Intel Corporation 82576 Gigabit Network Connection

DEVICE=eth1
HWADDR=12:34:56:78:9A:BC
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none

# Intel Corporation 82576 Gigabit Network Connection

DEVICE=eth2
HWADDR=12:34:56:78:9A:BC
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none

# Intel Corporation 82576 Gigabit Network Connection

DEVICE=eth3
HWADDR=12:34:56:78:9A:BC
USERCTL=no
ONBOOT=yes
MASTER=bond1
SLAVE=yes
BOOTPROTO=none

No errors of note were being logged in /var/log/messages, but on the switch, we were seeing this:

sh etherchannel summary | include Po57
57     Po57(SD)        LACP      Gi9/24(I)      Gi9/25(I)      Gi9/26(I)

This indicates that the port channel is down and the constituent interfaces are running in independent mode. Quite confusing.
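For reference, the letters in parentheses can be decoded mechanically. The following is a minimal sketch, assuming the usual Cisco IOS legend for this command (S = layer 2, R = layer 3, U = in use, D = down, P = bundled in the port channel, I = stand-alone, s = suspended); decode_flags is a made-up helper name, and the legend printed by your own switch should be treated as authoritative:

```shell
# Decode the flag letters from a 'show etherchannel summary' entry,
# e.g. "Po57(SD)" or "Gi9/24(I)". The legend below is an assumption
# based on the usual IOS output; check the header your switch prints.
decode_flags() {
    entry=$1
    flags=${entry#*\(}      # drop everything up to and including "("
    flags=${flags%\)}       # drop the trailing ")"
    out=""
    while [ -n "$flags" ]; do
        rest=${flags#?}             # everything after the first letter
        f=${flags%"$rest"}          # the first letter itself
        case $f in
            S) m="layer2" ;;
            R) m="layer3" ;;
            U) m="in-use" ;;
            D) m="down" ;;
            P) m="bundled" ;;
            I) m="stand-alone" ;;
            s) m="suspended" ;;
            *) m="unknown($f)" ;;
        esac
        out="${out:+$out }$m"
        flags=$rest
    done
    echo "$entry: $out"
}

decode_flags "Po57(SD)"
decode_flags "Gi9/24(I)"
```

Run against the output above, this reports the port channel as a layer 2 channel that is down, with each member port stand-alone rather than bundled.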

What’s more, on the system itself:

$ cat /sys/class/net/bond1/bonding/mode
active-backup 1

This is wrong given the settings in /etc/modprobe.conf (mode=802.3ad for bond1), and there is no clear cause for it.

Further debugging is required, but in the interim, the following worked:

ifdown bond1
echo 4 > /sys/class/net/bond1/bonding/mode
ifup bond1

Confirmed with:

$ cat /sys/class/net/bond1/bonding/mode
802.3ad 4

And on the switch level:

sh etherchannel summary | include Po57
57     Po57(SU)        LACP      Gi9/24(P)      Gi9/25(P)      Gi9/26(P)

Presto-chango, we now have a working LACP port channel. Still, something odd is afoot.
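Until the root cause is found, the check and the workaround above could be wrapped in a small script. This is only a sketch of the same three commands: the function names and the SYSFS_NET variable are hypothetical (the latter exists only so the logic can be dry-run against a scratch directory), and it assumes the mode file reads as a name followed by a number, as shown earlier.

```shell
#!/bin/sh
# Sketch: force a bond into the wanted mode if sysfs reports another one.
# SYSFS_NET is a hypothetical override for dry runs; on a real system it
# stays at /sys/class/net.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Print the mode name (first field) that sysfs reports for a bond.
current_mode() {
    read -r name _num < "$SYSFS_NET/$1/bonding/mode" && echo "$name"
}

# ensure_mode BOND MODE_NAME MODE_NUMBER
# Repeats the manual workaround: ifdown, write the mode number, ifup.
ensure_mode() {
    bond=$1
    want=$2
    num=$3
    if [ "$(current_mode "$bond")" = "$want" ]; then
        echo "$bond already in $want"
        return 0
    fi
    ifdown "$bond"
    echo "$num" > "$SYSFS_NET/$bond/bonding/mode"
    ifup "$bond"
    echo "$bond now reports: $(current_mode "$bond")"
}

# Usage (as root): ensure_mode bond1 802.3ad 4
```

This only papers over the symptom after boot; it does nothing to explain why bond1 comes up in active-backup in the first place.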

One theory is that /etc/modprobe.conf also contains definitions for bond0 (an InfiniBand pair), which is set up for active-backup (aka ‘mode=1’); perhaps the startup/init scripts are doing some weird thing and mis-picking up the bond0 settings? This is pure conjecture, in need of tracing through the scripts.
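One way to test that conjecture would be to give bond1 its options per interface rather than through /etc/modprobe.conf. The sketch below assumes the initscripts on this release honour the BONDING_OPTS variable (it is documented for RHEL/CentOS 5, but has not been verified on this system); everything except the last line is copied from the existing ifcfg-bond1.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond1 (sketch, untested on this system)
DEVICE=bond1
IPADDR=XXX.XXX.XXX.XXX
NETMASK=255.255.252.0
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
# Assumption: the init scripts apply these options to bond1 itself when
# it is brought up, regardless of how the bonding module was loaded.
BONDING_OPTS="mode=802.3ad miimon=100"
```

If bond1 then comes up in 802.3ad after a reboot (check /sys/class/net/bond1/bonding/mode or /proc/net/bonding/bond1), that would point at the module being loaded once with the bond0 options, max_bonds=2 included, and both bonds inheriting mode=1. If it still comes up as active-backup, the theory is wrong and the scripts need tracing after all.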