History log of /drivers/net/ifb.c
Revision Date Author Comments
0287587884b15041203b3a362d485e1ab1f24445 06-Oct-2014 Eric Dumazet <edumazet@google.com> net: better IFF_XMIT_DST_RELEASE support

Testing xmit_more support with netperf and connected UDP sockets,
I found strange dst refcount false sharing.

Current handling of IFF_XMIT_DST_RELEASE is not optimal.

Dropping dst in validate_xmit_skb() is certainly too late in case
packet was queued by cpu X but dequeued by cpu Y

The logical point to take care of drop/force is in __dev_queue_xmit()
before even taking qdisc lock.

As Julian Anastasov pointed out, need for skb_dst() might come from some
packet schedulers or classifiers.

This patch adds new helper to cleanly express needs of various drivers
or qdiscs/classifiers.

Drivers that need skb_dst() in their ndo_start_xmit() should call
following helper in their setup instead of the prior :

dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
->
netif_keep_dst(dev);

Instead of using a single bit, we use two bits, one being
eventually rebuilt in bonding/team drivers.

The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being
rebuilt in bonding/team. Eventually, we could add something
smarter later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
c835a677331495cf137a7f8a023463afd9f032f8 14-Jul-2014 Tom Gundersen <teg@jklm.no> net: set name_assign_type in alloc_netdev()

Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit

Signed-off-by: Tom Gundersen <teg@jklm.no>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8dd6e147b0c29723ec10d0e836c7f3466d61a19b 28-Mar-2014 Vlad Yasevich <vyasevic@redhat.com> ifb: Remove vlan acceleration from vlan_features

Do not include vlan acceleration features in vlan_features as that
precludes correct Q-in-Q operation.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
57a7744e09867ebcfa0ccf1d6d529caa7728d552 14-Mar-2014 Eric W. Biederman <ebiederm@xmission.com> net: Replace u64_stats_fetch_begin_bh to u64_stats_fetch_begin_irq

Replace the bh safe variant with the hard irq safe variant.

We need a hard irq safe variant to deal with netpoll transmitting
packets from hard irq context, and we need it in most if not all of
the places using the bh safe variant.

Except on 32bit uni-processor the code is exactly the same so don't
bother with a bh variant, just have a hard irq safe variant that
everyone can use.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
827da44c61419f29ae3be198c342e2147f1a10cb 08-Oct-2013 John Stultz <john.stultz@linaro.org> net: Explicitly initialize u64_stats_sync structures for lockdep

In order to enable lockdep on seqcount/seqlock structures, we
must explicitly initialize any locks.

The u64_stats_sync structure, uses a seqcount, and thus we need
to introduce a u64_stats_init() function and use it to initialize
the structure.

This unfortunately adds a lot of fairly trivial initialization code
to a number of drivers. But the benefit of ensuring correctness makes
this worth while.

Because these changes are required for lockdep to be enabled, and the
changes are quite trivial, I've not yet split this patch out into 30-some
separate patches, as I figured it would be better to get the various
maintainers thoughts on how to best merge this change along with
the seqcount lockdep enablement.

Feedback would be appreciated!

Signed-off-by: John Stultz <john.stultz@linaro.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Mirko Lindner <mlindner@marvell.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Roger Luethi <rl@hellgate.ch>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Simon Horman <horms@verge.net.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
f2966cd5691058b8674a20766525bedeaea9cbcf 11-Jul-2013 dingtianhong <dingtianhong@huawei.com> ifb: fix oops when loading the ifb failed

If __rtnl_link_register() return faild when loading the ifb, it will
take the wrong path and get oops, so fix it just like dummy.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
440d57bc5ff55ec1efb3efc9cbe9420b4bbdfefa 09-Jul-2013 dingtianhong <dingtianhong@huawei.com> ifb: fix rcu_sched self-detected stalls

According to the commit 16b0dc29c1af9df341428f4c49ada4f626258082
(dummy: fix rcu_sched self-detected stalls)

Eric Dumazet fix the problem in dummy, but the ifb will occur the
same problem like the dummy modules.

Trying to "modprobe ifb numifbs=30000" triggers :

INFO: rcu_sched self-detected stall on CPU

After this splat, RTNL is locked and reboot is needed.

We must call cond_resched() to avoid this, even holding RTNL.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
28d2b136ca6c7bf7173a43a90f747ecda5b0520d 19-Apr-2013 Patrick McHardy <kaber@trash.net> net: vlan: announce STAG offload capability in some drivers

- macvlan: propagate STAG filtering capabilities from underlying device
- ifb: announce STAG tagging support in addition to CTAG tagging support
- veth: announce STAG tagging/stripping support in addition to CTAG support

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
f646968f8f7c624587de729115d802372b9063dd 19-Apr-2013 Patrick McHardy <kaber@trash.net> net: vlan: rename NETIF_F_HW_VLAN_* feature flags to NETIF_F_HW_VLAN_CTAG_*

Rename the hardware VLAN acceleration features to include "CTAG" to indicate
that they only support CTAGs. Follow up patches will introduce 802.1ad
server provider tagging (STAGs) and require the distinction for hardware not
supporting acclerating both.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
73bf0d0eecba15e2a2f96b1092554b01fc07044b 13-Jan-2013 Eric Dumazet <edumazet@google.com> ifb: dont hard code inet_net use

ifb should lookup devices in the appropriate namespace.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f2cedb63df14342ad40a8b5b324fc5d94a60b665 15-Feb-2012 Danny Kukawka <danny.kukawka@bisect.de> net: replace random_ether_addr() with eth_hw_addr_random()

Replace usage of random_ether_addr() with eth_hw_addr_random()
to set addr_assign_type correctly to NET_ADDR_RANDOM.

Change the trivial cases.

v2: adapt to renamed eth_hw_addr_random()

Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
34324dc2bf27c1773045fea63cb11f7e2a6ad2b9 15-Nov-2011 Michał Mirosław <mirq-linux@rere.qmqm.pl> net: remove NETIF_F_NO_CSUM feature bit

Only distinct use is checking if NETIF_F_NOCACHE_COPY should be
enabled by default. The check heuristics is altered a bit here,
so it hits other people than before. The default shouldn't be
trusted for performance-critical cases anyway.

For all other uses NETIF_F_NO_CSUM is equivalent to NETIF_F_HW_CSUM.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
550fd08c2cebad61c548def135f67aba284c6162 26-Jul-2011 Neil Horman <nhorman@tuxdriver.com> net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared

After the last patch, We are left in a state in which only drivers calling
ether_setup have IFF_TX_SKB_SHARING set (we assume that drivers touching real
hardware call ether_setup for their net_devices and don't hold any state in
their skbs. There are a handful of drivers that violate this assumption of
course, and need to be fixed up. This patch identifies those drivers, and marks
them as not being able to support the safe transmission of skbs by clearning the
IFF_TX_SKB_SHARING flag in priv_flags

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Karsten Keil <isdn@linux-pingi.de>
CC: "David S. Miller" <davem@davemloft.net>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Patrick McHardy <kaber@trash.net>
CC: Krzysztof Halasa <khc@pm.waw.pl>
CC: "John W. Linville" <linville@tuxdriver.com>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Marcel Holtmann <marcel@holtmann.org>
CC: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3b0c9cbb6e5fe25a4162be68ed1459b6f7432da9 20-Jun-2011 stephen hemminger <shemminger@vyatta.com> ifb: convert to 64 bit stats

Convert input functional block device to use 64 bit stats.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
a6b7a407865aab9f849dd99a71072b7cd1175116 06-Jun-2011 Alexey Dobriyan <adobriyan@gmail.com> net: remove interrupt.h inclusion from netdevice.h

* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1c5cae815d19ffe02bdfda1260949ef2b1806171 30-Apr-2011 Jiri Pirko <jpirko@redhat.com> net: call dev_alloc_name from register_netdevice

Force dev_alloc_name() to be called from register_netdevice() by
dev_get_valid_name(). That allows to remove multiple explicit
dev_alloc_name() calls.

The possibility to call dev_alloc_name in advance remains.

This also fixes veth creation regresion caused by
84c49d8c3e4abefb0a41a77b25aa37ebe8d6b743

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
39980292fda20b38baf95bfa577db8b678eecc86 03-Jan-2011 Eric Dumazet <eric.dumazet@gmail.com> ifb: add performance flags

Le lundi 03 janvier 2011 à 11:40 -0800, David Miller a écrit :
> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Mon, 3 Jan 2011 20:37:03 +0100
>
> > On Sun, Jan 02, 2011 at 09:24:36PM +0100, Eric Dumazet wrote:
> >> Le mercredi 29 décembre 2010 ?? 00:07 +0100, Jarek Poplawski a écrit :
> >>
> >> > Ingress is before vlans handler so these features and the
> >> > NETIF_F_HW_VLAN_TX flag seem useful for ifb considering
> >> > dev_hard_start_xmit() checks.
> >>
> >> OK, here is v2 of the patch then, thanks everybody.
> >>
> >>
> >> [PATCH v2 net-next-2.6] ifb: add performance flags
> >>
> >> IFB can use the full set of features flags (NETIF_F_SG |
> >> NETIF_F_FRAGLIST | NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA) to
> >> avoid unnecessary split of some packets (GRO for example)
> >>
> >> Changli suggested to also set vlan_features,
> >
> > He also suggested more GSO flags of which especially NETIF_F_TSO6
> > seems interesting (wrt GRO)?
>
> I think at least TSO6 would very much be appropriate here.

Yes, why not, I am only wondering why loopback / dummy (and others ?)
only set NETIF_F_TSO :)

Since I want to play with ECN, I might also add NETIF_F_TSO_ECN ;)

For other flags, I really doubt it can matter on ifb ?

[PATCH v3 net-next-2.6] ifb: add performance flags

IFB can use the full set of features flags (NETIF_F_SG |
NETIF_F_FRAGLIST | NETIF_F_TSO | NETIF_F_NO_CSUM | NETIF_F_HIGHDMA) to
avoid unnecessary split of some packets (GRO for example)

Changli suggested to also set vlan_features, NETIF_F_TSO6,
NETIF_F_TSO_ECN.

Jarek suggested to add NETIF_F_HW_VLAN_TX as well.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Changli Gao <xiaosuo@gmail.com>
Cc: Jarek Poplawski <jarkao2@gmail.com>
Cc: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
1a75972c61f2c224eb5283c183f9f6b17fb09b6b 14-Dec-2010 Eric Dumazet <eric.dumazet@gmail.com> ifb: use netif_receive_skb() instead of netif_rx()

In ri_tasklet(), we run from softirq, so can directly handle packet
through netif_receive_skb() instead of netif_rx().
There is no risk of recursion.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7edc3453e54432a9f1c636b6481f1107c9db19bd 16-Dec-2010 Eric Dumazet <eric.dumazet@gmail.com> ifb: fix a lockdep splat

After recent ifb changes, we must use lockless __skb_dequeue() since
lock is not anymore initialized.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jamal Hadi Salim <hadi@cyberus.ca>
Cc: Changli Gao <xiaosuo@gmail.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
957fca95e3521e471aac4c2e4cfbc21f399bdd84 04-Dec-2010 Changli Gao <xiaosuo@gmail.com> ifb: use the lockless variants of skb_queue

rq and tq are both protected by tx queue lock, so we can simply use
the lockless variants of skb_queue.

skb_queue_splice_tail_init() is used instead of the open coded and slow
one.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
c6350362cbb19882ba0eb3578cc1abc07e6ea204 03-Dec-2010 Changli Gao <xiaosuo@gmail.com> ifb: remove unused macro TX_TIMEOUT

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
e1f91505025db74c261962dc16d58f79b9b0c83c 03-Dec-2010 Changli Gao <xiaosuo@gmail.com> ifb: remove the useless debug stats

These debug stats are not exported, and become useless.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
75c1c82566f23dd539fb7ccbf57a1caa7ba82628 04-Dec-2010 Changli Gao <xiaosuo@gmail.com> ifb: goto resched directly if error happens and dp->tq isn't empty

If we break the loop when there are still skbs in tq and no skb in
rq, the skbs will be left in txq until new skbs are enqueued into rq.
In rare cases, no new skb is queued, then these skbs will stay in rq
forever.

After this patch, if tq isn't empty when we break the loop, we goto
resched directly.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
1ae5dc342ac78d7a42965fd1f323815f6f5ef2c1 10-May-2010 Eric Dumazet <eric.dumazet@gmail.com> net: trans_start cleanups

Now that core network takes care of trans_start updates, dont do it
in drivers themselves, if possible. Drivers can avoid one cache miss
(on dev->trans_start) in their start_xmit() handler.

Exceptions are NETIF_F_LLTX drivers

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8964be4a9a5ca8cab1219bb046db2f6d1936227c 21-Nov-2009 Eric Dumazet <eric.dumazet@gmail.com> net: rename skb->iif to skb->skb_iif

To help grep games, rename iif to skb_iif

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
05e8689c9a3a208bf75b60662778d81e23eac460 01-Nov-2009 Eric Dumazet <eric.dumazet@gmail.com> ifb: RCU locking avoids touching dev refcount

Avoids touching dev refcount in hotpath

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
db519144243de6b17ff0c56c26f06059743110a7 20-Oct-2009 Eric Dumazet <eric.dumazet@gmail.com> ifb: should not use __dev_get_by_index() without locks

At this point (ri_tasklet()), RTNL or dev_base_lock are not held,
we must use dev_get_by_index() instead of __dev_get_by_index()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
424efe9caf6047ffbcd6b383ff4d2347254aabf1 31-Aug-2009 Stephen Hemminger <shemminger@vyatta.com> netdev: convert pseudo drivers to netdev_tx_t

These are all drivers that don't touch real hardware.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ec634fe328182a1a098585bfc7b69e5042bdb08d 06-Jul-2009 Patrick McHardy <kaber@trash.net> net: convert remaining non-symbolic return values in ndo_start_xmit() functions

This patch converts the remaining occurences of raw return values to their
symbolic counterparts in ndo_start_xmit() functions that were missed by the
previous automatic conversion.

Additionally code that assumed the symbolic value of NETDEV_TX_OK to be zero
is changed to explicitly use NETDEV_TX_OK.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
93f154b594fe47e4a7e5358b309add449a046cd3 19-May-2009 Eric Dumazet <dada1@cosmosbay.com> net: release dst entry in dev_hard_start_xmit()

One point of contention in high network loads is the dst_release() performed
when a transmited skb is freed. This is because NIC tx completion calls
dev_kree_skb() long after original call to dev_queue_xmit(skb).

CPU cache is cold and the atomic op in dst_release() stalls. On SMP, this is
quite visible if one CPU is 100% handling softirqs for a network device,
since dst_clone() is done by other cpus, involving cache line ping pongs.

It seems right place to release dst is in dev_hard_start_xmit(), for most
devices but ones that are virtual, and some exceptions.

David Miller suggested to define a new device flag, set in alloc_netdev_mq()
(so that most devices set it at init time), and carefuly unset in devices
which dont want a NULL skb->dst in their ndo_start_xmit().

List of devices that must clear this flag is :

- loopback device, because it calls netif_rx() and quoting Patrick :
"ip_route_input() doesn't accept loopback addresses, so loopback packets
already need to have a dst_entry attached."
- appletalk/ipddp.c : needs skb->dst in its xmit function

- And all devices that call again dev_queue_xmit() from their xmit function
(as some classifiers need skb->dst) : bonding, vlan, macvlan, eql, ifb, hdlc_fr

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
008298231abbeb91bc7be9e8b078607b816d1a4a 21-Nov-2008 Stephen Hemminger <shemminger@vyatta.com> netdev: add more functions to netdevice ops

This patch moves neigh_setup and hard_start_xmit into the network device ops
structure. For bisection, fix all the previously converted drivers as well.
Bonding driver took the biggest hit on this.

Added a prefetch of the hard_start_xmit in the fast path to try and reduce
any impact this would have.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8dfcdf342d9e8294a3292005f9158022289dfd67 20-Nov-2008 Stephen Hemminger <shemminger@vyatta.com> ifb: convert to net_device_ops

Convert to new network device ops interface.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c3f26a269c2421f97f10cf8ed05d5099b573af4d 01-Aug-2008 David S. Miller <davem@davemloft.net> netdev: Fix lockdep warnings in multiqueue configurations.

When support for multiple TX queues were added, the
netif_tx_lock() routines we converted to iterate over
all TX queues and grab each queue's spinlock.

This causes heartburn for lockdep and it's not a healthy
thing to do with lots of TX queues anyways.

So modify this to use a top-level lock and a "frozen"
state for the individual TX queues.

Signed-off-by: David S. Miller <davem@davemloft.net>
83874000929ed63aef30b44083a9f713135ff040 17-Jul-2008 David S. Miller <davem@davemloft.net> pkt_sched: Kill netdev_queue lock.

We can simply use the qdisc->q.lock for all of the
qdisc tree synchronization.

Signed-off-by: David S. Miller <davem@davemloft.net>
e8a0464cc950972824e2e128028ae3db666ec1ed 17-Jul-2008 David S. Miller <davem@davemloft.net> netdev: Allocate multiple queues for TX.

alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.

Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces. This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.

Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.

Signed-off-by: David S. Miller <davem@davemloft.net>
555353cfa1aee293de445bfa6de43276138ddd82 09-Jul-2008 David S. Miller <davem@davemloft.net> netdev: The ingress_lock member is no longer needed.

Every qdisc is assosciated with a queue, and in the case of ingress
qdiscs that will now be netdev->rx_queue so using that queue's lock is
the thing to do.

Signed-off-by: David S. Miller <davem@davemloft.net>
dc2b48475a0a36f8b3bbb2da60d3a006dc5c2c84 09-Jul-2008 David S. Miller <davem@davemloft.net> netdev: Move queue_lock into struct netdev_queue.

The lock is now an attribute of the device queue.

One thing to notice is that "suspicious" places
emerge which will need specific training about
multiple queue handling. They are so marked with
explicit "netdev->rx_queue" and "netdev->tx_queue"
references.

Signed-off-by: David S. Miller <davem@davemloft.net>
94833dfb8c98ed4ca1944dd2c1339d88a2d1c758 21-Mar-2008 Jarek Poplawski <jarkao2@gmail.com> [NET] ifb: set separate lockdep classes for queue locks

[ 10.536424] =======================================================
[ 10.536424] [ INFO: possible circular locking dependency detected ]
[ 10.536424] 2.6.25-rc3-devel #3
[ 10.536424] -------------------------------------------------------
[ 10.536424] swapper/0 is trying to acquire lock:
[ 10.536424] (&dev->queue_lock){-+..}, at: [<c0299b4a>]
dev_queue_xmit+0x175/0x2f3
[ 10.536424]
[ 10.536424] but task is already holding lock:
[ 10.536424] (&p->tcfc_lock){-+..}, at: [<f8a67154>] tcf_mirred+0x20/0x178
[act_mirred]
[ 10.536424]
[ 10.536424] which lock already depends on the new lock.

lockdep warns of locking order while using ifb with sch_ingress and
act_mirred: ingress_lock, tcfc_lock, queue_lock (usually queue_lock
is at the beginning). This patch is only to tell lockdep that ifb is
a different device (e.g. from eth) and has its own pair of queue
locks. (This warning is a false-positive in common scenario of using
ifb; yet there are possible situations, when this order could be
dangerous; lockdep should warn in such a case.) (With suggestions by
David S. Miller)

Reported-and-tested-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
09f75cd7bf13720738e6a196cc0107ce9a5bd5a0 04-Oct-2007 Jeff Garzik <jeff@garzik.org> [NET] drivers/net: statistics cleanup #1 -- save memory and shrink code

We now have struct net_device_stats embedded in struct net_device,
and the default ->get_stats() hook does the obvious thing for us.

Run through drivers/net/* and remove the driver-local storage of
statistics, and driver-local ->get_stats() hook where applicable.

This was just the low-hanging fruit in drivers/net; plenty more drivers
remain to be updated.

[ Resolved conflicts with napi_struct changes and fix sunqe build
regression... -DaveM ]

Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10d024c1b2fd58af8362670d7d6e5ae52fc33353 17-Sep-2007 Ralf Baechle <ralf@linux-mips.org> [NET]: Nuke SET_MODULE_OWNER macro.

It's been a useless no-op for long enough in 2.6 so I figured it's time to
remove it. The number of people that could object because they're
maintaining unified 2.4 and 2.6 drivers is probably rather small.

[ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
881d966b48b035ab3f3aeaae0f3d3f9b584f45b2 17-Sep-2007 Eric W. Biederman <ebiederm@xmission.com> [NET]: Make the device list and device lookups per namespace.

This patch makes most of the generic device layer network
namespace safe. This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables. The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anthing in the core of the network stack that was
affected to by the change of dev_base was modified to handle
multiple network namespaces. The rest of the network stack was
simply modified to explicitly use &init_net the initial network
namespace. This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundametally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change. Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0e06877c6fdbc67b1132be895f995acd1ff30135 12-Jul-2007 Patrick McHardy <kaber@trash.net> [RTNETLINK]: rtnl_link: allow specifying initial device address

Drivers need to validate the initial addresses in their netlink attribute
validation function or manually reject them if they can't support this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2d85cba2b272a5201a60966a65a4f8c0bcc0bb71 12-Jul-2007 Patrick McHardy <kaber@trash.net> [RTNETLINK]: rtnl_link API simplification

All drivers need to unregister their devices in the module unload function.
While doing so they must hold the rtnl and atomically unregister the
rtnl_link ops as well. This makes the rtnl_link_unregister function that
takes the rtnl itself completely useless.

Provide default newlink/dellink functions, make __rtnl_link_unregister and
rtnl_link_unregister unregister all devices with matching rtnl_link_ops and
change the existing users to take advantage of that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9ba2cd656021e7f70038ba9d551224e04d0bfcef 13-Jun-2007 Patrick McHardy <kaber@trash.net> [IFB]: Use rtnl_link API

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
62b7ffcaaa4e91ed547fc55758076ac536bd5571 13-Jun-2007 Patrick McHardy <kaber@trash.net> [IFB]: Keep ifb devices on list

Use a list instead of an array to allow creating new devices.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c01003c20563d1e75ec9828d21743919d2b43977 29-Mar-2007 Patrick McHardy <kaber@trash.net> [IFB]: Fix crash on input device removal

The input_device pointer is not refcounted, which means the device may
disappear while packets are queued, causing a crash when ifb passes packets
with a stale skb->dev pointer to netif_rx().

Fix by storing the interface index instead and do a lookup where neccessary.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcdddfb66cc998252d34758ce4109cedc0d24a5c 30-Jan-2007 Linus Torvalds <torvalds@woody.linux-foundation.org> Revert "net: ifb error path loop fix"

This reverts commit 0c0b3ae68ec93b1db5c637d294647d1cca0df763.

Quoth David:

"Jeff, please revert

It's wrong. We had a lengthy analysis of this piece of code
several months ago, and it is correct.

Consider, if we run the loop and we get an error
the following happens:

1) attempt of ifb_init_one(i) fails, therefore we should
not try to "ifb_free_one()" on "i" since it failed
2) the loop iteration first increments "i", then it
check for error

Therefore we must decrement "i" twice before the first
free during the cleanup. One to "undo" the for() loop
increment, and one to "skip" the ifb_init_one() case which
failed."

Reported-by: David Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Cc: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0c0b3ae68ec93b1db5c637d294647d1cca0df763 27-Jan-2007 Mariusz Kozlowski <m.kozlowski@tuxland.pl> net: ifb error path loop fix

On error we should start freeing resources at [i-1] not [i-2].

Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
3136dcb3cd6e5b4ed4bd34d422f8cdeec4da6836 02-Jan-2007 dean gaudet <dean@arctic.org> [NET]: ifb double-counts packets

Signed-off-by: dean gaudet <dean@arctic.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8057de64fd4734ae3e70cf76deb77f1c19958494 03-Oct-2006 Zach Brown <zach.brown@oracle.com> [PATCH] pr_debug: ifb: replace missing comma to separate pr_debug arguments

ifb: replace missing comma to separate pr_debug arguments

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
6aa20a2235535605db6d6d2bd850298b2fe7f31e 13-Sep-2006 Jeff Garzik <jeff@garzik.org> drivers/net: Trim trailing whitespace

Signed-off-by: Jeff Garzik <jeff@garzik.org>
4a9c74e5830444c1c3235848e06402c1d2ece1ea 21-Jul-2006 Nicolas Dichtel <nicolas.dichtel@6wind.com> [IFB] After ifb_init_one() failed, i is increased. Decrease

It before entering in the loop for freeing the other ifb devices.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
932ff279a43ab7257942cddff2595acd541cc49b 09-Jun-2006 Herbert Xu <herbert@gondor.apana.org.au> [NET]: Add netif_tx_lock

Various drivers use xmit_lock internally to synchronise with their
transmission routines. They do so without setting xmit_lock_owner.
This is fine as long as netpoll is not in use.

With netpoll it is possible for deadlocks to occur if xmit_lock_owner
isn't set. This is because if a printk occurs while xmit_lock is held
and xmit_lock_owner is not set can cause netpoll to attempt to take
xmit_lock recursively.

While it is possible to resolve this by getting netpoll to use
trylock, it is suboptimal because netpoll's sole objective is to
maximise the chance of getting the printk out on the wire. So
delaying or dropping the message is to be avoided as much as possible.

So the only alternative is to always set xmit_lock_owner. The
following patch does this by introducing the netif_tx_lock family of
functions that take care of setting/unsetting xmit_lock_owner.

I renamed xmit_lock to _xmit_lock to indicate that it should not be
used directly. I didn't provide irq versions of the netif_tx_lock
functions since xmit_lock is meant to be a BH-disabling lock.

This is pretty much a straight text substitution except for a small
bug fix in winbond. It currently uses
netif_stop_queue/spin_unlock_wait to stop transmission. This is
unsafe as an IRQ can potentially wake up the queue. So it is safer to
use netif_tx_disable.

The hamradio bits used spin_lock_irq but it is unnecessary as
xmit_lock must never be taken in an IRQ handler.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
35eaa31e5d6b0653c11b5661572152295b45b7a7 24-Feb-2006 Richard Lucassen <spamtrap@lucassen.org> [NET]: Increase default IFB device count.

The most usable number of ifb devices is 2. Change the default to 2.

Signed-off-by: Richard Lucassen <spamtrap@lucassen.org>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
253af4235d24ddfcd9f5403485e9273b33d8fa5e 09-Jan-2006 Jamal Hadi Salim <hadi@cyberus.ca> [NET]: Add IFB (Intermediate Functional Block) network device.

A new device to do intermidiate functional block in a system shared
manner. To use the new functionality, you need to turn on
qos/classifier actions.

The new functionality can be grouped as:

1) qdiscs/policies that are per device as opposed to system wide. ifb
allows for a device which can be redirected to thus providing an
impression of sharing.

2) Allows for queueing incoming traffic for shaping instead of
dropping.

Packets are redirected to this device using tc/action mirred redirect
construct. If they are sent to it by plain routing instead then they
will merely be dropped and the stats would indicate that.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>