History log of /drivers/net/bonding/bond_alb.c
Revision Date Author Comments
b99215cdc6e191f5649687536d4fb0faa3d7f56e 13-May-2012 David S. Miller <davem@davemloft.net> bonding: Fix LACPDU rx_dropped commit.

I applied the wrong version of Jiri's bonding fix in commit
13a8e0c8cdb43982372bd6c65fb26839c8fd8ce9 ("bonding: don't increase
rx_dropped after processing LACPDUs")

I applied v3, which introduces warnings I asked him to fix,
instead of v4 which properly takes care of those issues.

This inter-diffs such that the warnings are now gone.

Signed-off-by: David S. Miller <davem@davemloft.net>
e404decb0fb017be80552adee894b35307b6c7b4 29-Jan-2012 Joe Perches <joe@perches.com> drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages

alloc failures use dump_stack so emitting an additional
out-of-memory message is an unnecessary duplication.

Remove the allocation failure messages.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b924551bed09f61b64f21bffe241afc5526b091a 18-Jan-2012 Jiri Bohac <jbohac@suse.cz> bonding: fix enslaving in alb mode when link down

bond_alb_init_slave() is called from bond_enslave() and sets the slave's MAC
address. This is done differently for TLB and ALB modes.
bond->alb_info.rlb_enabled is used to discriminate between the two modes but
this flag may be uninitialized if the slave is being enslaved prior to calling
bond_open() -> bond_alb_initialize() on the master.

It turns out all the callers of alb_set_slave_mac_addr() pass
bond->alb_info.rlb_enabled as the hw parameter.

This patch cleans up the unnecessary parameter of alb_set_slave_mac_addr() and
makes the function decide based on the bonding mode instead, which fixes the
above problem.
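
A minimal sketch of the idea, not the driver code: the decision is taken from the configured mode, which is known before the bond is opened, rather than from a flag that is only set in the open path. The struct, enum and function names below are hypothetical; the values 5 and 6 follow the usual numbering of the tlb and alb modes.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bonding mode constants. */
enum model_bond_mode { MODEL_MODE_TLB = 5, MODEL_MODE_ALB = 6 };

struct model_bond {
	enum model_bond_mode mode; /* configured before the bond is opened */
	bool rlb_enabled;          /* only meaningful after the open path ran */
};

/*
 * Return true when the slave's MAC must be programmed into the hardware
 * (ALB), false when updating the software copy is enough (TLB).
 * The possibly uninitialized rlb_enabled flag is deliberately ignored.
 */
static bool model_mac_needs_hw_update(const struct model_bond *bond)
{
	return bond->mode == MODEL_MODE_ALB;
}
```

With this shape the caller no longer passes a flag, so an enslave that happens before open cannot feed a stale value into the decision.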

Reported-by: Narendra K <Narendra_K@Dell.com>
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f515e6b77045b4b1f54617d9fbf4a22b95a58757 09-Jan-2012 Maxim Uvarov <maxim.uvarov@oracle.com> bond_alb: don't disable softirq under bond_alb_xmit

No need to lock soft irqs under bond_alb_xmit()
which already has softirq disabled.

Changes:
1. add non-bh/bh version to tlb_clear_slave()

2. represent BH and non BH hash table locks
_lock_rx_hashtbl_bh/_unlock_rx_hashtbl_bh
_lock_rx_hashtbl/_unlock_rx_hashtbl
_lock_tx_hashtbl_bh/_unlock_tx_hashtbl_bh
_lock_tx_hashtbl/_unlock_tx_hashtbl

Signed-off-by: Maxim Uvarov <maxim.uvarov@oracle.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e6d265e8504ab4a3368b8645d318b344ee88b280 28-Oct-2011 Jay Vosburgh <fubar@us.ibm.com> bonding: eliminate bond_close race conditions

This patch resolves two sets of race conditions.

Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> reported the
first, as follows:

The bond_close() calls cancel_delayed_work() to cancel delayed works.
It, however, cannot cancel works that were already queued in workqueue.
The bond_open() initializes work->data, and process_one_work() refers
get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when
work->data has been initialized. Thus, a panic occurs.

He included a patch that converted the cancel_delayed_work calls
in bond_close to flush_delayed_work_sync, which eliminated the above
problem.

His patch is incorporated, at least in principle, into this
patch. In this patch, we use cancel_delayed_work_sync in place of
flush_delayed_work_sync, and also convert bond_uninit in addition to
bond_close.

This conversion to _sync, however, opens new races between
bond_close and three periodically executing workqueue functions:
bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon.

The race occurs because bond_close and bond_uninit are always
called with RTNL held, and these workqueue functions may acquire RTNL to
perform failover-related activities. If bond_close or bond_uninit is
waiting in cancel_delayed_work_sync, deadlock occurs.

These deadlocks are resolved by having the workqueue functions
acquire RTNL conditionally. If the rtnl_trylock() fails, the functions
reschedule and return immediately. For the cases that are attempting to
perform link failover, a delay of 1 is used; for the other cases, the
normal interval is used (as those activities are not as time critical).
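
The conditional acquisition described above can be sketched in user space as follows. This is only a model under stated assumptions: a C11 atomic_flag stands in for RTNL, and the struct and function names are hypothetical, not the driver's.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for RTNL; the real lock is a kernel mutex. */
static atomic_flag model_rtnl = ATOMIC_FLAG_INIT;

static bool model_rtnl_trylock(void)
{
	/* test_and_set returns the previous state: false means we got it */
	return !atomic_flag_test_and_set(&model_rtnl);
}

static void model_rtnl_unlock(void)
{
	atomic_flag_clear(&model_rtnl);
}

struct model_monitor {
	unsigned long interval; /* normal rescheduling delay */
	bool failover_pending;  /* work that needs the lock */
	int failovers_done;
};

/* One monitor pass; returns the delay to use when rescheduling. */
static unsigned long model_monitor_run(struct model_monitor *m)
{
	if (!m->failover_pending)
		return m->interval;

	/* Never block here: the closer may hold the lock while waiting for us. */
	if (!model_rtnl_trylock())
		return 1; /* time critical, so retry almost immediately */

	m->failovers_done++;
	m->failover_pending = false;
	model_rtnl_unlock();
	return m->interval;
}
```

Because the monitor returns instead of sleeping on the lock, a caller that holds the lock while waiting for the monitor to finish can always make progress.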

Additionally, the bond_mii_monitor function now stores the delay
in a variable (mimicking the structure of activebackup_arp_mon).

Lastly, all of the above renders the kill_timers sentinel moot,
and therefore it has been removed.

Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a0db2dad0935e798973bb79676e722b82f177206 23-Sep-2011 Andy Gospodarek <andy@greyhouse.net> bonding: properly stop queuing work when requested

During a test where a pair of bonding interfaces using ARP monitoring
were both brought up and torn down (with an rmmod) repeatedly, a panic
in the timer code was noticed. I tracked this down and determined that
any of the bonding functions that ran as workqueue handlers and requeued
more work might not properly exit when the module was removed.

There was a flag protected by the bond lock called kill_timers that is
set when the interface goes down or the module is removed, but many of
the functions that monitor link status now unlock the bond lock to take
rtnl first. There is a chance that another CPU running the rmmod could
get the lock and set kill_timers after the first check has passed.

This patch does not allow any function to queue work that will make
itself run unless kill_timers is not set. I also noticed while doing
this work that bond_resend_igmp_join_requests did not have a check for
kill_timers, so I added the needed call there as well.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Reported-by: Liang Zheng <lzheng@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc0e40700656b09d93b062ef6c818aa45429d09a 20-Jul-2011 Jiri Pirko <jpirko@redhat.com> bonding: do vlan cleanup

Now that all devices are cleaned up, bond can be cleaned up as well

- remove bond->vlgrp
- remove bond_vlan_rx_register
- substitute necessary occurrences of vlan_group_get_device

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9fe0617d9b6d21f700ee9e658e1c9fe3be2fb402 25-May-2011 Neil Horman <nhorman@tuxdriver.com> bonding: prevent deadlock on slave store with alb mode (v3)

This soft lockup was recently reported:

[root@dell-per715-01 ~]# echo +bond5 > /sys/class/net/bonding_masters
[root@dell-per715-01 ~]# echo +eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.
bonding bond5: master_dev is not up in bond_enslave
[root@dell-per715-01 ~]# echo -eth1 > /sys/class/net/bond5/bonding/slaves
bonding: bond5: doing slave updates when interface is down.

BUG: soft lockup - CPU#12 stuck for 60s! [bash:6444]
CPU 12:
Modules linked in: bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc
be2d
Pid: 6444, comm: bash Not tainted 2.6.18-262.el5 #1
RIP: 0010:[<ffffffff80064bf0>] [<ffffffff80064bf0>]
.text.lock.spinlock+0x26/00
RSP: 0018:ffff810113167da8 EFLAGS: 00000286
RAX: ffff810113167fd8 RBX: ffff810123a47800 RCX: 0000000000ff1025
RDX: 0000000000000000 RSI: ffff810123a47800 RDI: ffff81021b57f6f8
RBP: ffff81021b57f500 R08: 0000000000000000 R09: 000000000000000c
R10: 00000000ffffffff R11: ffff81011d41c000 R12: ffff81021b57f000
R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000282
FS: 00002b3b41ef3f50(0000) GS:ffff810123b27940(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b3b456dd000 CR3: 000000031fc60000 CR4: 00000000000006e0

Call Trace:
[<ffffffff80064af9>] _spin_lock_bh+0x9/0x14
[<ffffffff886937d7>] :bonding:tlb_clear_slave+0x22/0xa1
[<ffffffff8869423c>] :bonding:bond_alb_deinit_slave+0xba/0xf0
[<ffffffff8868dda6>] :bonding:bond_release+0x1b4/0x450
[<ffffffff8006457b>] __down_write_nested+0x12/0x92
[<ffffffff88696ae4>] :bonding:bonding_store_slaves+0x25c/0x2f7
[<ffffffff801106f7>] sysfs_write_file+0xb9/0xe8
[<ffffffff80016b87>] vfs_write+0xce/0x174
[<ffffffff80017450>] sys_write+0x45/0x6e
[<ffffffff8005d28d>] tracesys+0xd5/0xe0

It occurs because we are able to change the slave configuration of a bond while
the bond interface is down. The bonding driver initializes some data structures
only after its ndo_open routine is called. Among them are the alb tx and rx
hash locks. So if we add or remove a slave without first
opening the bond master device, we run the risk of trying to lock/unlock a
spinlock that has garbage for data in it, which results in the soft lockup above.

Note that sometimes this works, because in many cases an unlocked spinlock has
the raw_lock parameter initialized to zero (meaning that the kzalloc of the
net_device private data is equivalent to calling spin_lock_init), but that's not
true in all cases, and we aren't guaranteed that condition, so we need to pass
the relevant spinlocks through the spin_lock_init function.

Fix it by moving the spin_lock_init calls for the tx and rx hashtable locks to
the ndo_init path, so they are ready for use by the bond_store_slaves path.

Change notes:
v2) Based on conversation with Jay and Nicolas it seems that the ability to
enslave devices while the bond master is down should be safe to do. As such
this is an outlier bug, and so instead we'll just initialize the errant spinlocks
in the init path rather than the open path, solving the problem. We'll also
remove the warnings about the bond being down during enslave operations, since
it should be safe

v3) Fix spelling error

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: jtluka@redhat.com
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: nicolas.2p.debian@gmail.com
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0693e88e6ccf615d9674548d8b924cdd9a1c976c 07-May-2011 Michał Mirosław <mirq-linux@rere.qmqm.pl> net: bonding: factor out rlock(bond->lock) in xmit path

Pull read_lock(&bond->lock) and BOND_IS_OK() to bond_start_xmit() from
mode-dependent xmit functions.

netif_running() is always true in hard_start_xmit.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
3aba891dde3842d89ad022237b99c1ed308040b0 19-Apr-2011 Jiri Pirko <jpirko@redhat.com> bonding: move processing of recv handlers into handle_frame()

Now that bonding uses rx_handler, all traffic going into the bond
device goes through bond_handle_frame. So there's no need to go back into
bonding code later via ptype handlers. This patch converts
original ptype handlers into "bonding receive probes". These functions
are called from bond_handle_frame and they are registered per-mode.

Note that vlan packets are also handled because they are always untagged
thanks to vlan_untag()

Note that this also allows arpmon for eth-bond-bridge-vlan topology.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
77c8e2c01542649f7a02fef8eb3b3d0e7fed6bbd 11-Apr-2011 Peter Pan(潘卫平) <panweiping3@gmail.com> bonding:fix two typos

replace relpy with reply.
replace premanent with permanent.

Signed-off-by: Weiping Pan(潘卫平) <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
38dbaf0afb518e462de7afca552acad048237a73 08-Apr-2011 Peter Pan(潘卫平) <panweiping3@gmail.com> bonding:set save_load to 0 when initializing

It is unnecessary to set save_load to 1 here,
as the tx_hashtbl is just kzalloced.

Signed-off-by: Weiping Pan(潘卫平) <panweiping3@gmail.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e364a3416d81c7717dd642dc9b3ab132b7885f66 28-Feb-2011 Amerigo Wang <amwang@redhat.com> bonding: use the correct size for _simple_hash()

Clearly it should be the size of ->ip_dst here.
Although this is harmless, it still reads odd.

Signed-off-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b30532515f0a62bfe17207ab00883dd262497006 20-Jan-2011 Neil Horman <nhorman@tuxdriver.com> bonding: Ensure that we unshare skbs prior to calling pskb_may_pull

Recently reported oops:

kernel BUG at net/core/skbuff.c:813!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond0/broadcast
CPU 8
Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
scsi_transport_sas dm_mod [last unloaded: microcode]

Modules linked in: sit tunnel4 cpufreq_ondemand acpi_cpufreq freq_table bonding
ipv6 dm_mirror dm_region_hash dm_log cdc_ether usbnet mii serio_raw i2c_i801
i2c_core iTCO_wdt iTCO_vendor_support shpchp ioatdma i7core_edac edac_core bnx2
ixgbe dca mdio sg ext4 mbcache jbd2 sd_mod crc_t10dif mptsas mptscsih mptbase
scsi_transport_sas dm_mod [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.32-71.el6.x86_64 #1 BladeCenter HS22
-[7870AC1]-
RIP: 0010:[<ffffffff81405b16>] [<ffffffff81405b16>]
pskb_expand_head+0x36/0x1e0
RSP: 0018:ffff880028303b70 EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff880c6458ec80 RCX: 0000000000000020
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880c6458ec80
RBP: ffff880028303bc0 R08: ffffffff818a6180 R09: ffff880c6458ed64
R10: ffff880c622b36c0 R11: 0000000000000400 R12: 0000000000000000
R13: 0000000000000180 R14: ffff880c622b3000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000038653452a4 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffff8806649c2000, task ffff880c64f16ab0)
Stack:
ffff880028303bc0 ffffffff8104fff9 000000000000001c 0000000100000000
<0> ffff880000047d80 ffff880c6458ec80 000000000000001c ffff880c6223da00
<0> ffff880c622b3000 0000000000000000 ffff880028303c10 ffffffff81407f7a
Call Trace:
<IRQ>
[<ffffffff8104fff9>] ? __wake_up_common+0x59/0x90
[<ffffffff81407f7a>] __pskb_pull_tail+0x2aa/0x360
[<ffffffffa0244530>] bond_arp_rcv+0x2c0/0x2e0 [bonding]
[<ffffffff814a0857>] ? packet_rcv+0x377/0x440
[<ffffffff8140f21b>] netif_receive_skb+0x2db/0x670
[<ffffffff8140f788>] napi_skb_finish+0x58/0x70
[<ffffffff8140fc89>] napi_gro_receive+0x39/0x50
[<ffffffffa01286eb>] ixgbe_clean_rx_irq+0x35b/0x900 [ixgbe]
[<ffffffffa01290f6>] ixgbe_clean_rxtx_many+0x136/0x240 [ixgbe]
[<ffffffff8140fe53>] net_rx_action+0x103/0x210
[<ffffffff81073bd7>] __do_softirq+0xb7/0x1e0
[<ffffffff810d8740>] ? handle_IRQ_event+0x60/0x170
[<ffffffff810142cc>] call_softirq+0x1c/0x30
[<ffffffff81015f35>] do_softirq+0x65/0xa0
[<ffffffff810739d5>] irq_exit+0x85/0x90
[<ffffffff814cf915>] do_IRQ+0x75/0xf0
[<ffffffff81013ad3>] ret_from_intr+0x0/0x11
<EOI>
[<ffffffff8101bc01>] ? mwait_idle+0x71/0xd0
[<ffffffff814cd80a>] ? atomic_notifier_call_chain+0x1a/0x20
[<ffffffff81011e96>] cpu_idle+0xb6/0x110
[<ffffffff814c17c8>] start_secondary+0x1fc/0x23f

Resulted from bonding driver registering packet handlers via dev_add_pack and
then trying to call pskb_may_pull. If another packet handler (like for AF_PACKET
sockets) gets called first, the delivered skb will have a user count > 1, which
causes pskb_may_pull to BUG halt when it does its skb_shared check. Fix this by
calling skb_share_check prior to the may_pull call sites in the bonding driver
to clone the skb when needed. Tested successfully by myself and the reporter.
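
A user-space model of the "unshare before modifying" rule, assuming a buffer with a simple user count. The struct and function here are hypothetical stand-ins and do not reproduce the kernel's sk_buff API; the point is only that a buffer with more than one user is copied before the caller is allowed to reshape it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with a user count, standing in for an skb. */
struct model_buf {
	int users;
	size_t len;
	unsigned char data[64];
};

/*
 * Return a buffer the caller may modify. If other users hold the
 * original, hand back a private copy and drop the caller's reference
 * on the original. In this model the reference is dropped even when
 * the copy cannot be allocated, and NULL is returned.
 */
static struct model_buf *model_share_check(struct model_buf *buf)
{
	struct model_buf *copy;

	if (buf->users == 1)
		return buf; /* sole owner: safe to modify in place */

	copy = malloc(sizeof(*copy));
	if (copy) {
		memcpy(copy, buf, sizeof(*copy));
		copy->users = 1;
	}
	buf->users--;
	return copy;
}
```

A receive handler would call this first and only then pull or otherwise rewrite the buffer it got back.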

Signed-off-by: Neil Horman
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
411204a5a1ec1a35363d8ef450c77e2b8235da4d 12-Dec-2010 Taku Izumi <izumi.taku@jp.fujitsu.com> bonding: migrate some macros from bond_alb.c to bond_alb.h

This patch simply migrates some macros from bond_alb.c to bond_alb.h.

Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ab12811c89e88f2e66746790b1fe4469ccb7bdd9 10-Sep-2010 Andy Gospodarek <andy@greyhouse.net> bonding: correctly process non-linear skbs

It was recently brought to my attention that 802.3ad mode bonds would no
longer form when using some network hardware after a driver update.
After snooping around I realized that the particular hardware was using
page-based skbs and found that skb->data did not contain a valid LACPDU
as it was not stored there. That explained the inability to form an
802.3ad-based bond. For balance-alb mode bonds this was also an issue
as ARPs would not be properly processed.

This patch fixes the issue in my tests and should be applied to 2.6.36
and as far back as anyone cares to add it to stable.

Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse
Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: stable@kerne.org
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d8190dff018ffe932d17cae047c6b3d1c5fc7574 23-Jul-2010 Greg Edwards <greg.edwards@hp.com> bonding: set device in RLB ARP packet handler

After:

commit 6146b1a4da98377e4abddc91ba5856bef8f23f1e
Author: Jay Vosburgh <fubar@us.ibm.com>
Date: Tue Nov 4 17:51:15 2008 -0800

bonding: Fix ALB mode to balance traffic on VLANs

the dev field in the RLB ARP packet handler was set to NULL to wildcard
and accommodate balancing VLANs on top of bonds.

This has the side-effect of the packet handler being called against
other, non RLB-enabled bonds, and a kernel oops results when it tries to
dereference rx_hashtbl in rlb_update_entry_from_arp(), which won't be
set for those bonds, e.g. active-backup.

With the __netif_receive_skb() changes from:

commit 1f3c8804acba841b5573b953f5560d2683d2db0d
Author: Andy Gospodarek <andy@greyhouse.net>
Date: Mon Dec 14 10:48:58 2009 +0000

bonding: allow arp_ip_targets on separate vlans to use arp validation

frames received on VLANs correctly make their way to the bond's handler,
so we no longer need to wildcard the device.

The oops can be reproduced by:

modprobe bonding

echo active-backup > /sys/class/net/bond0/bonding/mode
echo 100 > /sys/class/net/bond0/bonding/miimon
ifconfig bond0 xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx
echo +eth0 > /sys/class/net/bond0/bonding/slaves
echo +eth1 > /sys/class/net/bond0/bonding/slaves

echo +bond1 > /sys/class/net/bonding_masters
echo balance-alb > /sys/class/net/bond1/bonding/mode
echo 100 > /sys/class/net/bond1/bonding/miimon
ifconfig bond1 xxx.xxx.xxx.xxx netmask xxx.xxx.xxx.xxx
echo +eth2 > /sys/class/net/bond1/bonding/slaves
echo +eth3 > /sys/class/net/bond1/bonding/slaves

Pass some traffic on bond0. Boom.

[ Tested, behaves as advertised. I do not believe a test of the bonding
mode is necessary, as there is no race between the packet handler and
the bonding mode changing (the mode can only change when the device is
closed). Also updated the log message to include the reproduction and
full commit ids. -J ]

Signed-off-by: Greg Edwards <greg.edwards@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
f35188faa0fbabefac476536994f4b6f3677380f 21-Jul-2010 Jay Vosburgh <fubar@us.ibm.com> bonding: change test for presence of VLANs

After commit ad1afb00393915a51c21b1ae8704562bf036855f
("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)")
it is now regular practice for a VLAN "add vid" for VLAN 0 to
arrive prior to any VLAN registration or creation of a vlan_group.

This patch updates the bonding code that tests for the presence
of VLANs configured above bonding. The new logic tests for bond->vlgrp
to determine if a registration has occurred, instead of testing that
bonding's internal vlan_list is empty.

The old code would panic when vlan_list was not empty, but
vlgrp was still NULL (because only an "add vid" for VLAN 0 had occurred).

Bonding still adds VLAN 0 to its internal list so that 802.1p
frames are handled correctly on transmit when non-VLAN accelerated
slaves are members of the bond. The test against bond->vlan_list
remains in bond_dev_queue_xmit for this reason.

Modification to the bond->vlgrp now occurs under lock (in
addition to RTNL), because not all inspections of it occur under RTNL.

Additionally, because 8021q will never issue a "kill vid" for
VLAN 0, there is now logic in bond_uninit to release any remaining
entries from vlan_list.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Cc: Pedro Garcia <pedro.netdev@dondevamos.com>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
42d782ac1bef7cbcdf05b857731345c6e8149f90 29-Jun-2010 Flavio Leitner <fleitner@redhat.com> bonding: check if clients MAC addr has changed

When two systems using bonding devices in adaptive load
balancing (ALB) mode communicate with each other, an endless
ping-pong of ARP replies starts between these two systems.

What happens? In ALB mode, the bonding driver keeps track
of each connected client in a hash table, so it can do
receive load balancing (RLB). When an ARP reply is received,
the driver scans this hash table for the client entry,
updates its MAC address and flags it to be announced
later. Two seconds later, the alb monitor runs and sends,
for each updated client entry, two ARP replies
updating that specific client. The same process happens on
the receiving system, causing the endless ping-pong of ARP
replies.

See more information including the relevant functions below:

System 1 System 2
bond0 bond0

ping <system2>
ARP request --------->
<--------- ARP reply

+->rlb_arp_recv <---------------------+ <--- loop begins
| rlb_update_entry_from_arp |
| client_info->ntt = 1; |
| bond_info->rx_ntt = 1; |
| |
| <communication succeed> |
| |
| bond_alb_monitor |
| rlb_update_rx_clients |
| rlb_update_client |
| arp_create(ARPOP_REPLY) |
| send ARP reply --------------> V
| send ARP reply -------------->
| rlb_arp_recv
| rlb_update_entry_from_arp
| client_info->ntt = 1;
| bond_info->rx_ntt = 1;
| < snipped, same as in system 1>
+------- <-------------- send ARP reply
<-------------- send ARP reply

Besides the unneeded networking traffic, this loop breaks
a cluster because a backup system can't take over the IP
address. There is always one system sending an ARP reply
poisoning the network.

This patch fixes the problem by adding a check for the MAC
address before updating it. Thus, if the MAC address didn't
change, there is no need to update it or to announce it later.
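
The check can be sketched as below. This is an illustration only: the entry layout and function name are assumptions, with ntt standing for the "needs to transmit" flag shown in the diagram above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAC_LEN 6

/* Hypothetical client entry; ntt marks it for a later announcement. */
struct model_client {
	uint8_t mac[MODEL_MAC_LEN];
	bool ntt;
};

/*
 * Record the MAC learned from an ARP reply. The entry is flagged for
 * announcement only when the address actually changed; an unchanged
 * address leaves it alone, which is what breaks the reply loop.
 */
static bool model_update_client_mac(struct model_client *c,
				    const uint8_t new_mac[MODEL_MAC_LEN])
{
	if (memcmp(c->mac, new_mac, MODEL_MAC_LEN) == 0)
		return false;

	memcpy(c->mac, new_mac, MODEL_MAC_LEN);
	c->ntt = true;
	return true;
}
```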

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
097811bb48c7837db94d7fe5d94f0f4b5e19e78c 19-May-2010 Jiri Pirko <jpirko@redhat.com> bonding: optimize tlb_get_least_loaded_slave

In the worst case, when the first loop breaks at the end of the slave list,
the slave list is iterated through twice. This patch reduces this
function only to one loop. Also makes it simpler.
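
A single-pass selection can be sketched like this. The slave layout and the gap formula (capacity minus load) are simplifying assumptions for illustration, not the driver's exact arithmetic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slave state. */
struct model_slave {
	bool ok;          /* link up and usable */
	int64_t capacity; /* e.g. derived from link speed */
	int64_t load;     /* recent transmit load */
};

/*
 * Pick the usable slave with the largest spare capacity in one pass.
 * Starting max_gap at INT64_MIN means the first usable slave always
 * wins the first comparison, so no separate "find first" loop is needed.
 */
static struct model_slave *model_least_loaded(struct model_slave *slaves,
					      size_t n)
{
	struct model_slave *best = NULL;
	int64_t max_gap = INT64_MIN;
	size_t i;

	for (i = 0; i < n; i++) {
		int64_t gap;

		if (!slaves[i].ok)
			continue;
		gap = slaves[i].capacity - slaves[i].load;
		if (gap > max_gap) {
			max_gap = gap;
			best = &slaves[i];
		}
	}
	return best; /* NULL when no slave is usable */
}
```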

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a4aee5c808fc5bf6889c9012217841eb3fd91a6a 14-Dec-2009 Joe Perches <joe@perches.com> drivers/net/bonding/: : use pr_fmt

Add #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
Remove DRV_NAME from pr_<level>s
Consolidate long format strings
Remove some extra tab indents
Remove some unnecessary ()s from pr_<level>s arguments
Align pr_<level> arguments

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
94e2bd688820aed72b4f8092f88c2ccf64e003de 16-Oct-2009 Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> tree-wide: fix some typos and punctuation in comments

fix some typos and punctuation in comments

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ec87fd3b4e111e8bc367d247a963e27e5b86df26 29-Oct-2009 Eric W. Biederman <ebiederm@aristanetworks.com> bond: Add support for multiple network namespaces

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
885a136c52a8871175477baf3903e1c38751b35a 01-Sep-2009 Eric Dumazet <eric.dumazet@gmail.com> bonding: use compare_ether_addr_64bits() in ALB

We can speed up Ethernet address comparisons using compare_ether_addr_64bits()
instead of memcmp(). We make sure all operands are at least 8 bytes long and
16-bit aligned (or better, long word aligned if possible)
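
The reason the operands must be at least 8 bytes long can be seen in a portable sketch of a 64-bit comparison. This is not the kernel helper: the function name is hypothetical, and memcpy plus a byte mask is used here so the example does not depend on alignment or endianness.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare two 6-byte Ethernet addresses that live in buffers of at
 * least 8 bytes. Both are loaded as one 64-bit word, XORed, and the
 * two trailing padding bytes are masked off. Returns true when the
 * addresses differ.
 */
static bool model_ether_addr_differs_64(const uint8_t a[8], const uint8_t b[8])
{
	/* Mask built from bytes so it is correct on any byte order. */
	static const uint8_t mask_bytes[8] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00
	};
	uint64_t wa, wb, mask;

	memcpy(&wa, a, sizeof(wa));
	memcpy(&wb, b, sizeof(wb));
	memcpy(&mask, mask_bytes, sizeof(mask));

	return ((wa ^ wb) & mask) != 0;
}
```

Reading 8 bytes from a 6-byte field would run past its end, which is why the structures holding the addresses need two bytes of padding behind them.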

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e5e2a8fd8358d1b3a2c51c3248edee72e4194703 13-Aug-2009 Jiri Pirko <jpirko@redhat.com> bonding: wipe out printk's

I did not introduce new lines over 80 chars. I even eliminated some of
them.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ec634fe328182a1a098585bfc7b69e5042bdb08d 06-Jul-2009 Patrick McHardy <kaber@trash.net> net: convert remaining non-symbolic return values in ndo_start_xmit() functions

This patch converts the remaining occurrences of raw return values to their
symbolic counterparts in ndo_start_xmit() functions that were missed by the
previous automatic conversion.

Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero
is changed to explicitly use NETDEV_TX_OK.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
815bcc2719c12b6f5b511706e2d19728e07f0b02 04-May-2009 Jay Vosburgh <fubar@us.ibm.com> bonding: fix alb mode locking regression

Fix locking issue in alb MAC address management; removed
incorrect locking and replaced with correct locking. This bug was
introduced in commit 059fe7a578fba5bbb0fdc0365bfcf6218fa25eb0
("bonding: Convert locks to _bh, rework alb locking for new locking")

Bug reported by Paul Smith <paul@mad-scientist.net>, who also
tested the fix.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2690f8d62e98779c71625dba9a0fd525d8b2263d 15-Apr-2009 Jay Vosburgh <fubar@us.ibm.com> bonding: Remove debug printk

Remove debug printk I accidentally left in as part of commit:

commit 6146b1a4da98377e4abddc91ba5856bef8f23f1e
Author: Jay Vosburgh <fubar@us.ibm.com>
Date: Tue Nov 4 17:51:15 2008 -0800

bonding: Fix ALB mode to balance traffic on VLANs

Reported by Duncan Gibb <duncan.gibb@siriusit.co.uk>

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1f78d9f94539b8806b81057e75025f2bac7d7ccc 14-Feb-2009 Hannes Eder <hannes@hanneseder.net> drivers/net/bonding: fix sparse warnings: context imbalance

Impact: Attribute functions with __acquires(...) and/or __releases(...).

Fix these sparse warnings:
drivers/net/bonding/bond_alb.c:1675:9: warning: context imbalance in 'bond_alb_handle_active_change' - unexpected unlock
drivers/net/bonding/bond_alb.c:1742:9: warning: context imbalance in 'bond_alb_set_mac_address' - unexpected unlock
drivers/net/bonding/bond_main.c:1025:17: warning: context imbalance in 'bond_do_fail_over_mac' - unexpected unlock
drivers/net/bonding/bond_main.c:3195:13: warning: context imbalance in 'bond_info_seq_start' - wrong count at exit
drivers/net/bonding/bond_main.c:3234:13: warning: context imbalance in 'bond_info_seq_stop' - unexpected unlock

Signed-off-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
09640e6365c679b5642b1c41b6d7078f51689ddf 01-Feb-2009 Harvey Harrison <harvey.harrison@gmail.com> net: replace uses of __constant_{endian}

Base versions handle constant folding now.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5a03cdb7f2d7ff88e50153d8c3b90a1d52dca435 10-Dec-2008 Holger Eitzenberger <holger@eitzenberger.org> bonding: use pr_debug instead of own macros

Use pr_debug() instead of own macros.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
eb7cc59a038b4e1914ae991d313f35904924759f 20-Nov-2008 Stephen Hemminger <shemminger@vyatta.com> bonding: convert to net_device_ops

Convert to net_device_ops table.
Note: for some operations move error checking into generic networking
layer (rather than looking at pointers in bonding).

A couple of gratuitous style cleanups to get rid of extra {}

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
454d7c9b14e20fd1949e2686e9de4a2926e01476 13-Nov-2008 Wang Chen <wangchen@cn.fujitsu.com> netdevice: safe convert to netdev_priv() #part-1

We have some reasons to kill netdev->priv:
1. netdev->priv is equal to netdev_priv().
2. netdev_priv() wraps the calculation of netdev->priv's offset, obviously
netdev_priv() is more flexible than netdev->priv.
But we can't kill netdev->priv yet, because so many drivers reference it
directly.

This patch is a safe conversion from netdev->priv to netdev_priv(netdev),
since every use of netdev->priv touched here is read-only.
But it is too big to be sent in one mail.
I split it into 4 parts and made every part smaller than 100,000 bytes,
which is the max size allowed by vger.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6146b1a4da98377e4abddc91ba5856bef8f23f1e 05-Nov-2008 Jay Vosburgh <fubar@us.ibm.com> bonding: Fix ALB mode to balance traffic on VLANs

The current ALB function that processes incoming ARPs
does not handle traffic for VLANs configured above bonding. This causes
traffic on those VLANs to all be assigned the same slave. This patch
corrects that misbehavior by locating the bonding interface nested below
the VLAN interface.

Bug reported by Sven Anders <anders@anduras.de>, who also
tested an earlier version of this patch and confirmed that it resolved
the problem.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
ce39a800ea87c655de49af021c8b20ee323cb40d 31-Oct-2008 Andy Gospodarek <andy@greyhouse.net> bonding: fix panic when taking bond interface down before removing module

A panic was discovered with bonding when using mode 5 or 6 and trying to
remove the slaves from the bond after the interface was taken down.
When calling 'ifconfig bond0 down' the following happens:

bond_close()
bond_alb_deinitialize()
tlb_deinitialize()
kfree(bond_info->tx_hashtbl)
bond_info->tx_hashtbl = NULL

Unfortunately if there are still slaves in the bond, when removing the
module the following happens:

bonding_exit()
bond_free_all()
bond_release_all()
bond_alb_deinit_slave()
tlb_clear_slave()
tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl
u32 next_index = tx_hash_table[index].next

As you might guess we panic when trying to access a few entries into the
table that no longer exists.

I experimented with several options (like moving the calls to
tlb_deinitialize somewhere else), but it really makes the most sense to
be part of the bond_close routine. It also didn't seem logical to move
tlb_clear_slave around too much, so the simplest option seems to add a
check in tlb_clear_slave to make sure we haven't already wiped the
tx_hashtbl away before searching for all the non-existent hash-table
entries that used to point to the slave as the output interface.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
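
The guard described in ce39a800 above can be sketched in userspace C. This is only an illustration of the idea, not the kernel code: the struct layouts, the function name clear_slave_entries(), and the sentinel TLB_NULL_INDEX as defined here are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define TLB_NULL_INDEX 0xffffffffu /* assumed end-of-chain sentinel */

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct tlb_entry {
    uint32_t next; /* index of the next entry in this slave's chain */
    int owner;     /* id of the slave used as output interface, -1 if none */
};

struct bond_info {
    struct tlb_entry *tx_hashtbl; /* NULL once the table has been freed */
};

/* Walk the chain starting at head and release entries owned by slave_id.
 * Returns the number of entries released. */
static int clear_slave_entries(struct bond_info *info, uint32_t head,
                               int slave_id)
{
    int cleared = 0;
    uint32_t i = head;

    /* The fix: bail out if the table was already torn down. */
    if (info->tx_hashtbl == NULL)
        return 0;

    while (i != TLB_NULL_INDEX) {
        uint32_t next = info->tx_hashtbl[i].next;

        if (info->tx_hashtbl[i].owner == slave_id) {
            info->tx_hashtbl[i].owner = -1;
            cleared++;
        }
        i = next;
    }
    return cleared;
}
```

Without the NULL check, the first read of tx_hashtbl[i].next after the table has been freed dereferences a NULL pointer, which corresponds to the panic in the second call trace.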
2d1ea19da0e84117d3ebbad981e4664bef03152e 28-Aug-2008 Vlad Yasevich <vladislav.yasevich@hp.com> bonding: Do not tx-balance some IPv6 packets on ALB/TLB bonds

IPv6 all-node-multicasts and DAD probes should not be tx-balanced
on ALB/TLB bonds. The all-node-multicast is the IPv6 equivalent of an
IPv4 broadcast. DAD probes have to be sent only on the primary so that
we don't get false-positive detections.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
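
A minimal sketch of the classification described in 2d1ea19d above, written as standalone C rather than kernel code. It assumes that the all-node multicast is recognized by the destination ff02::1 and that a DAD probe is recognized by an unspecified (all-zero) source address; the type ipv6_addr_bytes and the function should_tx_balance_ipv6() are hypothetical names for this example.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical plain-byte IPv6 address, network byte order. */
struct ipv6_addr_bytes {
    uint8_t b[16];
};

/* Returns false for packets that should bypass tx balancing. */
static bool should_tx_balance_ipv6(const struct ipv6_addr_bytes *src,
                                   const struct ipv6_addr_bytes *dst)
{
    /* ff02::1, the link-local all-nodes multicast address. */
    static const struct ipv6_addr_bytes all_nodes = {
        {0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01}
    };
    /* ::, the unspecified address used as source by DAD probes. */
    static const struct ipv6_addr_bytes unspecified = {{0}};

    if (memcmp(dst, &all_nodes, sizeof(*dst)) == 0)
        return false; /* treat like an IPv4 broadcast */
    if (memcmp(src, &unspecified, sizeof(*src)) == 0)
        return false; /* DAD probe: keep it on the primary */
    return true;
}
```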
f14c4e4e3651b76ae09082fa66cda37e10ac2b43 02-Sep-2008 Brian Haley <brian.haley@hp.com> bonding: change some __constant_htons() to htons()

Resending since I didn't see any responses from the first try.

Change __constant_htons() to htons() in the bonding driver;
__constant_htons() should only be used for initializers.

-Brian

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
7e1a1ac1fbaa88fe254400b7f30b775502932ad3 15-Jul-2008 Wang Chen <wangchen@cn.fujitsu.com> bonding: Check return of dev_set_promiscuity/allmulti

dev_set_promiscuity/allmulti might overflow.
Commit "netdevice: Fix promiscuity and allmulti overflow" in net-next makes
dev_set_promiscuity/allmulti return an error number if an overflow happens.

In bond_alb and bond_main, we now check every positive increment of
promiscuity and allmulti for an error return.
But two problems remain:
1. Some code paths have no mechanism to signal errors upstream.
2. If there are multiple slaves, it's hard to tell which slaves incremented
promisc/allmulti successfully and which failed.
So I left these problems marked as FIXME.
Fortunately, the overflow is a very rare case.

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
966bc6f434df4a02108d01dda8cd52951fe853da 22-Mar-2008 Jay Vosburgh <fubar@us.ibm.com> bonding: fix two compiler warnings

Fix two compiler warnings that are new with recent versions of gcc
(apparently 4.2 and up). One is fixed by refactoring; this change was
supplied by Stephen Hemminger. The other was fixed by labelling the
variable as uninitialized_var() after confirming via inspection that it
cannot actually be used uninitialized.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
c346dca10840a874240c78efe3f39acf4312a1f2 25-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.

Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2543331d367c9fe54f4ba73300894bc21e0a08f4 18-Jan-2008 Jay Vosburgh <fubar@us.ibm.com> bonding: fix locking during alb failover and slave removal

alb_fasten_mac_swap (actually rlb_teach_disabled_mac_on_primary)
requires RTNL and no other locks. This could cause dev_set_promiscuity
and/or dev_set_mac_address to be called with improper locking.

Changed callers to hold only RTNL during calls to alb_fasten_mac_swap
or functions calling it. Updated header comments in affected functions to
reflect proper reality of locking requirements.

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
e0138a66e18c6755ee29ce13b3f1142af775dc5f 18-Jan-2008 Jay Vosburgh <fubar@us.ibm.com> bonding: fix ASSERT_RTNL that produces spurious warnings

Move an ASSERT_RTNL down to where we should hold only RTNL;
the existing check produces spurious warnings because we hold additional
locks at _bh, tripping a debug warning in spin_lock_mutex().

Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
d0e81b7e2246a41d068ecaf15aac9de570816d63 18-Oct-2007 Jay Vosburgh <fubar@us.ibm.com> bonding: Acquire correct locks in alb for promisc change

Update ALB mode monitor to hold correct locks (RTNL and nothing
else) when calling dev_set_promiscuity.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
6603a6f25e4bca922a7dfbf0bf03072d98850176 18-Oct-2007 Jay Vosburgh <fubar@us.ibm.com> bonding: Convert more locks to _bh, acquire rtnl, for new locking

Convert more lock acquisitions to _bh flavor to avoid deadlock
with workqueue activity and add acquisition of RTNL in appropriate places.
Affects ALB mode, as well as core bonding functions and sysfs.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
059fe7a578fba5bbb0fdc0365bfcf6218fa25eb0 18-Oct-2007 Jay Vosburgh <fubar@us.ibm.com> bonding: Convert locks to _bh, rework alb locking for new locking

Convert locking-related activity to new & improved system.
Convert some lock acquisitions to _bh and rework parts of ALB mode, both
to avoid deadlocks with workqueue activity.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
1b76b31693d4a6088dec104ff6a6ead54081a3c2 18-Oct-2007 Jay Vosburgh <fubar@us.ibm.com> Convert bonding timers to workqueues

Convert bonding timers to workqueues. This converts the various
monitor functions to run in periodic work queues instead of timers. This
patch introduces the framework and converts the calls, but does not resolve
various locking issues, and does not stand alone.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
d3bb52b0948cf118131c951c5a34a2d4d0246171 23-Aug-2007 Al Viro <viro@zeniv.linux.org.uk> endianness annotations drivers/net/bonding/

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
e730c15519d09ea528b4d2f1103681fa5937c0e6 17-Sep-2007 Eric W. Biederman <ebiederm@xmission.com> [NET]: Make packet reception network namespace safe

This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.

This should ensure that the various network stacks do
not receive packets in anything but the initial network
namespace until the code has been converted and is ready
for them.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b0e380b1d8a8e0aca215df97702f99815f05c094 11-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: unions of just one member don't get anything done, kill them

Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0660e03f6b18f19b6bbafe7583265a51b90daf36 26-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h

Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
eddc9ec53be2ecdbf4efe0efd4a83052594f0ac0 21-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d56f90a7c96da5187f0cdf07ee7434fe6aa78bbc 11-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_network_header()

For the places where we need a pointer to the network header, it is still legal
to touch skb->nh.raw directly if just adding to, subtracting from or setting it
to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e7dd65dafda5737a983c04d652a69ab8da78ee3f 11-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF] bonding: Set skb->nh.raw relative to skb->mac.raw

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a16aeb36239ce612699ed64a75a03c88cbc657e8 10-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [BONDING]: Introduce arp_pkt()

For consistency with all the other skb->nh.raw accessors.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
459a98ed881802dee55897441bc7f77af614368e 19-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_reset_mac_header(skb)

For the common, open coded 'skb->mac.raw = skb->data' operation, so that we can
later turn skb->mac.raw into an offset, reducing the size of struct sk_buff in
64bit land while possibly keeping it as a pointer on 32bit.

This one touches just the most simple case, next will handle the slightly more
"complex" cases.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
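
The motivation given in 459a98ed above (hide the open-coded assignment behind a helper so the field can later become an offset) can be illustrated with a small userspace sketch. The struct pkt_buf and both helper names are hypothetical stand-ins, not the kernel's struct sk_buff or its accessors; the 16-bit offset width is likewise an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for a packet buffer; only the fields needed here. */
struct pkt_buf {
    uint8_t *head;       /* start of the allocated buffer */
    uint8_t *data;       /* start of the current packet data */
    uint16_t mac_header; /* stored as an offset from head, not a pointer */
};

/* Replaces the open-coded "mac header = data" assignment. */
static void pkt_reset_mac_header(struct pkt_buf *p)
{
    p->mac_header = (uint16_t)(p->data - p->head);
}

/* Callers that need a pointer go through an accessor. */
static uint8_t *pkt_mac_header(const struct pkt_buf *p)
{
    return p->head + p->mac_header;
}
```

Because callers only use the two helpers, the representation of the field can change without touching them again, and a 2-byte offset takes less room than an 8-byte pointer on a 64-bit build.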
243cb4e56061c3f4cb76312c5527840344d57c3b 06-Feb-2007 Joe Jin <lkmaillist@gmail.com> [BONDING]: Replace kmalloc() + memset() pairs with the appropriate kzalloc() calls

Replace kmalloc() + memset() pairs with the appropriate kzalloc() calls in
the bonding driver.

Signed-off-by: Joe Jin <lkmaillist@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
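
The shape of the cleanup in 243cb4e5 above, shown with userspace analogues: malloc() plus memset() stands in for kmalloc() plus memset(), and calloc() stands in for kzalloc(). The struct hash_entry and both function names are hypothetical, and the kernel allocators additionally take GFP flags that are not modeled here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry used only for this example. */
struct hash_entry {
    unsigned int next;
    unsigned int load;
};

/* Before: allocate, then zero the memory in a second step. */
static struct hash_entry *table_alloc_two_step(size_t n)
{
    struct hash_entry *t = malloc(n * sizeof(*t));

    if (t)
        memset(t, 0, n * sizeof(*t));
    return t;
}

/* After: a single call returns zeroed memory. */
static struct hash_entry *table_alloc_zeroed(size_t n)
{
    return calloc(n, sizeof(struct hash_entry));
}
```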
39984a9fad0c642182f426d7771332d46f222103 30-Sep-2006 Karsten Keil <kkeil@suse.de> [PATCH] bonding: fix deadlock on high loads in bond_alb_monitor()

In bond_alb_monitor the bond->curr_slave_lock write lock is taken
and then dev_set_promiscuity may be called, which can take some time,
depending on the network HW. If a network IRQ for this card comes in,
the softirq handler may try to deliver more packets, which ends up in
a request for the read lock of bond->curr_slave_lock -> deadlock.
This issue was found by a test lab during network stress tests; this patch
disables the softirq handler for this case, which solved the issue.

Signed-off-by: Karsten Keil <kkeil@suse.de>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
f71e130966ba429dbd24be08ddbcdf263df9a5ad 04-Mar-2006 Arjan van de Ven <arjan@infradead.org> Massive net driver const-ification.
5af47b2ff124fdad9ba84baeb9f7eeebeb227b43 09-Jan-2006 Jay Vosburgh <fubar@us.ibm.com> [PATCH] bonding: UPDATED hash-table corruption in bond_alb.c

I believe I see the race Michael refers to (tlb_choose_channel
may set head, which tlb_init_slave clears), although I was not able to
reproduce it. I have updated his patch for the current netdev-2.6.git
tree and added a version update. His original comment follows:

Our systems have been crashing during testing of PCI HotPlug
support in the various networking components. We've faulted in
the bonding driver due to a bug in bond_alb.c:tlb_clear_slave()

In that routine, the last modification to the TLB hash table is
made without protection of the lock, allowing a race that can lead
tlb_choose_channel() to select an invalid table element.

-J

Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
2e06cb5859fdaeba0529806eb1bf161ffd0db201 28-Nov-2005 Jeff Garzik <jgarzik@pobox.com> [bonding] Remove superfluous changelog.

No need to record this information in source code; it's all in the git
repository and kernel archives.
e944ef79184ff7f283e7bf79496d2873a0b0410b 09-Nov-2005 Mitch Williams <mitch.a.williams@intel.com> [PATCH] bonding: spelling and whitespace corrections

Minor spelling and whitespace corrections.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
b76850ab577bb4b929e60894d2025bbfcc043984 09-Nov-2005 Mitch Williams <mitch.a.williams@intel.com> [PATCH] bonding: explicitly clear RLB flag during ALB init

Explicitly clear RLB flag during ALB init. This is needed for sysfs
support, since the bond mode can be changed at runtime via sysfs.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
0d206a3af4329bd833cfa5fe1cc7fe146e49c131 09-Nov-2005 Mitch Williams <mitch.a.williams@intel.com> [PATCH] bonding: move kmalloc out of spinlock in ALB init

Move memory allocations out of the spinlock during ALB init. This gets
rid of a sleeping-inside-spinlock warning and accompanying stack dump.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
4e0952c74ee450ded86e8946ce58ea8dfd05b007 09-Nov-2005 Mitch Williams <mitch.a.williams@intel.com> [PATCH] bonding: add bond name to all error messages

Add the bond name to all error messages so we can tell which one is
complaining. Also reformats some error messages to be more consistent.

Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Acked-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
f2ccd8fa06c8e302116e71df372f5c1f83432e03 10-Aug-2005 David S. Miller <davem@davemloft.net> [NET]: Kill skb->real_dev

Bonding just wants the device before the skb_bond()
decapsulation occurs, so simply pass that original
device into packet_type->func() as an argument.

It remains to be seen whether we can use this same
exact thing to get rid of skb->input_dev as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
6b38aefe924daf2e4fdd73b384f21c913f31b668 28-Jul-2005 John W. Linville <linville@tuxdriver.com> [PATCH] bonding: ALB -- allow slave to use bond's MAC address if its own MAC address conflicts

In ALB mode, allow new slave to use bond's MAC address if the new
slave's MAC address is being used within the bond and no other slave
is using the bond's MAC address.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
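
The rule stated in 6b38aefe above reduces to two membership tests. The sketch below is a standalone illustration under that reading; the type mac_addr and the functions mac_in_use() and new_slave_may_take_bond_mac() are hypothetical names, not the driver's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical plain-byte MAC address. */
struct mac_addr {
    uint8_t b[MAC_LEN];
};

/* True if any existing slave currently uses the given address. */
static bool mac_in_use(const struct mac_addr *slaves, size_t n,
                       const struct mac_addr *mac)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(slaves[i].b, mac->b, MAC_LEN) == 0)
            return true;
    return false;
}

/* The new slave may take the bond's address only if its own address
 * conflicts with an existing slave and the bond's address is free. */
static bool new_slave_may_take_bond_mac(const struct mac_addr *slaves,
                                        size_t n,
                                        const struct mac_addr *new_mac,
                                        const struct mac_addr *bond_mac)
{
    return mac_in_use(slaves, n, new_mac) &&
           !mac_in_use(slaves, n, bond_mac);
}
```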
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!