History log of /net/ipv6/mcast.c
Revision Date Author Comments
feb91a02ccb09661507f170b2a444aec94f307f9 05-Nov-2014 Daniel Borkmann <dborkman@redhat.com> ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs

It has been reported that generating an MLD listener report on
devices with large MTUs (e.g. 9000) and a high number of IPv6
addresses can trigger a skb_over_panic():

skbuff: skb_over_panic: text:ffffffff80612a5d len:3776 put:20
head:ffff88046d751000 data:ffff88046d751010 tail:0xed0 end:0xec0
dev:port1
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:100!
invalid opcode: 0000 [#1] SMP
Modules linked in: ixgbe(O)
CPU: 3 PID: 0 Comm: swapper/3 Tainted: G O 3.14.23+ #4
[...]
Call Trace:
<IRQ>
[<ffffffff80578226>] ? skb_put+0x3a/0x3b
[<ffffffff80612a5d>] ? add_grhead+0x45/0x8e
[<ffffffff80612e3a>] ? add_grec+0x394/0x3d4
[<ffffffff80613222>] ? mld_ifc_timer_expire+0x195/0x20d
[<ffffffff8061308d>] ? mld_dad_timer_expire+0x45/0x45
[<ffffffff80255b5d>] ? call_timer_fn.isra.29+0x12/0x68
[<ffffffff80255d16>] ? run_timer_softirq+0x163/0x182
[<ffffffff80250e6f>] ? __do_softirq+0xe0/0x21d
[<ffffffff8025112b>] ? irq_exit+0x4e/0xd3
[<ffffffff802214bb>] ? smp_apic_timer_interrupt+0x3b/0x46
[<ffffffff8063f10a>] ? apic_timer_interrupt+0x6a/0x70

mld_newpack() skb allocations are usually requested with dev->mtu
in size, since commit 72e09ad107e7 ("ipv6: avoid high order allocations")
we have changed the limit in order to be less likely to fail.

However, in MLD/IGMP code, we have some rather ugly AVAILABLE(skb)
macros, which determine if we may end up doing an skb_put() for
adding another record. To avoid possible fragmentation, we check
the skb's tailroom as skb->dev->mtu - skb->len, which is a wrong
assumption as the actual max allocation size can be much smaller.

The IGMP case doesn't have this issue as commit 57e1ab6eaddc
("igmp: refine skb allocations") stores the allocation size in
the cb[].

Set a reserved_tailroom to make it fit into the MTU and use
skb_availroom() helper instead. This also allows to get rid of
igmp_skb_size().

Reported-by: Wei Liu <lw1a2.jing@gmail.com>
Fixes: 72e09ad107e7 ("ipv6: avoid high order allocations")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: David L Stevens <david.stevens@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
35f7aa5309c048bb70e58571942795fa9411ce6a 20-Sep-2014 Daniel Borkmann <dborkman@redhat.com> ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallback

RFC2710 (MLDv1), section 3.7. says:

The length of a received MLD message is computed by taking the
IPv6 Payload Length value and subtracting the length of any IPv6
extension headers present between the IPv6 header and the MLD
message. If that length is greater than 24 octets, that indicates
that there are other fields present *beyond* the fields described
above, perhaps belonging to a *future backwards-compatible* version
of MLD. An implementation of the version of MLD specified in this
document *MUST NOT* send an MLD message longer than 24 octets and
MUST ignore anything past the first 24 octets of a received MLD
message.

RFC3810 (MLDv2), section 8.2.1. states for *listeners* regarding
presence of MLDv1 routers:

In order to be compatible with MLDv1 routers, MLDv2 hosts MUST
operate in version 1 compatibility mode. [...] When Host
Compatibility Mode is MLDv2, a host acts using the MLDv2 protocol
on that interface. When Host Compatibility Mode is MLDv1, a host
acts in MLDv1 compatibility mode, using *only* the MLDv1 protocol,
on that interface. [...]

While section 8.3.1. specifies *router* behaviour regarding presence
of MLDv1 routers:

MLDv2 routers may be placed on a network where there is at least
one MLDv1 router. The following requirements apply:

If an MLDv1 router is present on the link, the Querier MUST use
the *lowest* version of MLD present on the network. This must be
administratively assured. Routers that desire to be compatible
with MLDv1 MUST have a configuration option to act in MLDv1 mode;
if an MLDv1 router is present on the link, the system administrator
must explicitly configure all MLDv2 routers to act in MLDv1 mode.
When in MLDv1 mode, the Querier MUST send periodic General Queries
truncated at the Multicast Address field (i.e., 24 bytes long),
and SHOULD also warn about receiving an MLDv2 Query (such warnings
must be rate-limited). The Querier MUST also fill in the Maximum
Response Delay in the Maximum Response Code field, i.e., the
exponential algorithm described in section 5.1.3. is not used. [...]

That means that we should not get queries from different versions of
MLD. When there's a MLDv1 router present, MLDv2 enforces truncation
and MRC == MRD (both fields are overlapping within the 24 octet range).

Section 8.3.2. specifies behaviour in the presence of MLDv1 multicast
address *listeners*:

MLDv2 routers may be placed on a network where there are hosts
that have not yet been upgraded to MLDv2. In order to be compatible
with MLDv1 hosts, MLDv2 routers MUST operate in version 1 compatibility
mode. MLDv2 routers keep a compatibility mode per multicast address
record. The compatibility mode of a multicast address is determined
from the Multicast Address Compatibility Mode variable, which can be
in one of the two following states: MLDv1 or MLDv2.

The Multicast Address Compatibility Mode of a multicast address
record is set to MLDv1 whenever an MLDv1 Multicast Listener Report is
*received* for that multicast address. At the same time, the Older
Version Host Present timer for the multicast address is set to Older
Version Host Present Timeout seconds. The timer is re-set whenever a
new MLDv1 Report is received for that multicast address. If the Older
Version Host Present timer expires, the router switches back to
Multicast Address Compatibility Mode of MLDv2 for that multicast
address. [...]

That means, what can happen is the following scenario, that hosts can
act in MLDv1 compatibility mode when they previously have received an
MLDv1 query (or, simply operate in MLDv1 mode-only); and at the same
time, an MLDv2 router could start up and transmits MLDv2 startup query
messages while being unaware of the current operational mode.

Given RFC2710, section 3.7 we would need to answer to that with an MLDv1
listener report, so that the router according to RFC3810, section 8.3.2.
would receive that and internally switch to MLDv1 compatibility as well.

Right now, I believe since the initial implementation of MLDv2, Linux
hosts would just silently drop such MLDv2 queries instead of replying
with an MLDv1 listener report, which would prevent a MLDv2 router going
into fallback mode (until it receives other MLDv1 queries).

Since the mapping of MRC to MRD in exactly such cases can make use of
the exponential algorithm from 5.1.3, we cannot [strictly speaking] be
aware in MLDv1 of the encoding in MRC, it seems also not mentioned by
the RFC. Since encodings are the same up to 32767, assume in such a
situation this value as a hard upper limit we would clamp. We have asked
one of the RFC authors on that regard, and he mentioned that there seem
not to be any implementations that make use of that exponential algorithm
on startup messages. In any case, this patch fixes this MLD
interoperability issue.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1691c63ea42d6f57ba769df401b9773664edb936 12-Sep-2014 WANG Cong <xiyou.wangcong@gmail.com> ipv6: refactor ipv6_dev_mc_inc()

Refactor out allocation and initialization and make
the refcount code more readable.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f7ed925c1b4d62e82b72d8e99fa4be52fb0c73b6 12-Sep-2014 WANG Cong <xiyou.wangcong@gmail.com> ipv6: update the comment in mcast.c

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
414b6c943fe25c5c576b6f0ce9077c29a150d826 12-Sep-2014 WANG Cong <xiyou.wangcong@gmail.com> ipv6: drop some rcu_read_lock in mcast

Similarly the code is already protected by rtnl lock.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b5350916bfd49f737d73c4c512fbea1b3537c703 12-Sep-2014 WANG Cong <xiyou.wangcong@gmail.com> ipv6: drop ipv6_sk_mc_lock in mcast

Similarly the code is already protected by rtnl lock.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cbeddd5d163ba7a6cd96a96509f7043cce4f68d2 09-Sep-2014 Daniel Borkmann <dborkman@redhat.com> ipv6: mcast: remove dead debugging defines

It's not used anywhere, so just remove these.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a9ed4a2986e13011fcf4ed2d1a1647c53112f55b 02-Sep-2014 Sabrina Dubroca <sd@queasysnail.net> ipv6: fix rtnl locking in setsockopt for anycast and multicast

Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()

ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.

This patch moves ASSERT_RTNL() up a level in the call stack.

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2f711939d2ea9dfaecebecd1324d2ec7a7a21f65 02-Sep-2014 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: add sysctl_mld_qrv to configure query robustness variable

This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
robustness variable. It specifies how many retransmit of unsolicited mld
retransmit should happen. Admins might want to tune this on lossy links.

Also reset mld state on interface down/up, so we pick up new sysctl
settings during interface up event.

IPv6 certification requests this knob to be available.

I didn't make this knob netns specific, as it is mostly a setting in a
physical environment and should be per host.

Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
67ba4152e8b77eada6a9c64e3c2c84d6112794fc 24-Aug-2014 Ian Morris <ipm@chirality.org.uk> ipv6: White-space cleansing : Line Layouts

This patch makes no changes to the logic of the code but simply addresses
coding style issues as detected by checkpatch.

Both objdump and diff -w show no differences.

A number of items are addressed in this patch:
* Multiple spaces converted to tabs
* Spaces before tabs removed.
* Spaces in pointer typing cleansed (char *)foo etc.
* Remove space after sizeof
* Ensure spacing around comparators such as if statements.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
e940f5d6ba6a01f8dbb870854d5205d322452730 27-Jun-2014 Hangbin Liu <liuhangbin@gmail.com> ipv6: Fix MLD Query message check

Based on RFC3810 6.2, we also need to check the hop limit and router alert
option besides source address.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
43a43b6040165f7b40b5b489fe61a4cb7f8c4980 31-Mar-2014 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: some ipv6 statistic counters failed to disable bh

After commit c15b1ccadb323ea ("ipv6: move DAD and addrconf_verify
processing to workqueue") some counters are now updated in process context
and thus need to disable bh before doing so, otherwise deadlocks can
happen on 32-bit archs. Fabio Estevam noticed this while while mounting
a NFS volume on an ARM board.

As a compensation for missing this I looked after the other *_STATS_BH
and found three other calls which need updating:

1) icmp6_send: ip6_fragment -> icmpv6_send -> icmp6_send (error handling)
2) ip6_push_pending_frames: rawv6_sendmsg -> rawv6_push_pending_frames -> ...
(only in case of icmp protocol with raw sockets in error handling)
3) ping6_v6_sendmsg (error handling)

Fixes: c15b1ccadb323ea ("ipv6: move DAD and addrconf_verify processing to workqueue")
Reported-by: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6a7cc41872dd46719fde09756e55e870cd453da8 16-Jan-2014 Flavio Leitner <fbl@redhat.com> ipv6: send Change Status Report after DAD is completed

The RFC 3810 defines two type of messages for multicast
listeners. The "Current State Report" message, as the name
implies, refreshes the *current* state to the querier.
Since the querier sends Query messages periodically, there
is no need to retransmit the report.

On the other hand, any change should be reported immediately
using "State Change Report" messages. Since it's an event
triggered by a change and that it can be affected by packet
loss, the rfc states it should be retransmitted [RobVar] times
to make sure routers will receive timely.

Currently, we are sending "Current State Reports" after
DAD is completed. Before that, we send messages using
unspecified address (::) which should be silently discarded
by routers.

This patch changes to send "State Change Report" messages
after DAD is completed fixing the behavior to be RFC compliant
and also to pass TAHI IPv6 testsuite.

Signed-off-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
63862b5bef7349dd1137e4c70702c67d77565785 11-Jan-2014 Aruna-Hewapathirane <aruna.hewapathirane@gmail.com> net: replace macros net_random and net_srandom with direct calls to prandom

This patch removes the net_random and net_srandom macros and replaces
them with direct calls to the prandom ones. As new commits only seem to
use prandom_u32 there is no use to keep them around.
This change makes it easier to grep for users of prandom_u32.

Signed-off-by: Aruna-Hewapathirane <aruna.hewapathirane@gmail.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9260d3e1013701aa814d10c8fc6a9f92bd17d643 29-Sep-2013 Salam Noureddine <noureddine@aristanetworks.com> ipv6 mcast: use in6_dev_put in timer handlers instead of __in6_dev_put

It is possible for the timer handlers to run after the call to
ipv6_mc_down so use in6_dev_put instead of __in6_dev_put in the
handler function in order to do proper cleanup when the refcnt
reaches 0. Otherwise, the refcnt can reach zero without the
inet6_dev being destroyed and we end up leaking a reference to
the net_device and see messages like the following,

unregister_netdevice: waiting for eth0 to become free. Usage count = 1

Tested on linux-3.4.43.

Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b4af8def5c083e5424c66051fe400444a01c4644 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: introduce mld_{gq, ifc, dad}_stop_timer functions

We already have mld_{gq,ifc,dad}_start_timer() functions, so introduce
mld_{gq,ifc,dad}_stop_timer() functions to reduce code size and make it
more readable.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2b7c121f82b47ba5efac1ef47355376d1f8d6980 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: refactor query processing into v1/v2 functions

Make igmp6_event_query() a bit easier to read by refactoring code
parts into mld_process_v1() and mld_process_v2().

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc7f7ab758f66a9110c8e737c3de3e9f5fc209b5 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: similarly to MLDv2 have min max_delay of 1

Similarly as we do in MLDv2 queries, set a forged MLDv1 query with
0 ms mld_maxdelay to minimum timer shot time of 1 jiffies. This is
eventually done in igmp6_group_queried() anyway, so we can simplify
a check there.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
58c0ecfd8d9871cfa35bcdbf3e7b3ee9ca62ea67 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: implement RFC3810 MLDv2 mode only

RFC3810, 10. Security Considerations says under subsection 10.1.
Query Message:

A forged Version 1 Query message will put MLDv2 listeners on that
link in MLDv1 Host Compatibility Mode. This scenario can be avoided
by providing MLDv2 hosts with a configuration option to ignore
Version 1 messages completely.

Hence, implement a MLDv2-only mode that will ignore MLDv1 traffic:

echo 2 > /proc/sys/net/ipv6/conf/ethX/force_mld_version or
echo 2 > /proc/sys/net/ipv6/conf/all/force_mld_version

Note that <all> device has a higher precedence as it was previously
also the case in the macro MLD_V1_SEEN() that would "short-circuit"
if condition on <all> case.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
e3f5b17047dec4acd8957dad053e70d87f18d97e 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: get rid of MLDV2_MRC and simplify calculation

Get rid of MLDV2_MRC and use our new macros for mantisse and
exponent to calculate Maximum Response Delay out of the Maximum
Response Code.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6c567b78c8a7da26e5e6f5bd458882ad9967233a 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: clean up MLD_V1_SEEN macro

Replace the macro with a function to make it more readable. GCC will
eventually decide whether to inline this or not (also, that's not
fast-path anyway).

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
89225d1ce6af3916bf32aecbe9d83f571a098588 04-Sep-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mld: fix v1/v2 switchback timeout to rfc3810, 9.12.

i) RFC3810, 9.2. Query Interval [QI] says:

The Query Interval variable denotes the interval between General
Queries sent by the Querier. Default value: 125 seconds. [...]

ii) RFC3810, 9.3. Query Response Interval [QRI] says:

The Maximum Response Delay used to calculate the Maximum Response
Code inserted into the periodic General Queries. Default value:
10000 (10 seconds) [...] The number of seconds represented by the
[Query Response Interval] must be less than the [Query Interval].

iii) RFC3810, 9.12. Older Version Querier Present Timeout [OVQPT] says:

The Older Version Querier Present Timeout is the time-out for
transitioning a host back to MLDv2 Host Compatibility Mode. When an
MLDv1 query is received, MLDv2 hosts set their Older Version Querier
Present Timer to [Older Version Querier Present Timeout].

This value MUST be ([Robustness Variable] times (the [Query Interval]
in the last Query received)) plus ([Query Response Interval]).

Hence, on *default* the timeout results in:

[RV] = 2, [QI] = 125sec, [QRI] = 10sec
[OVQPT] = [RV] * [QI] + [QRI] = 260sec

Having that said, we currently calculate [OVQPT] (here given as 'switchback'
variable) as ...

switchback = (idev->mc_qrv + 1) * max_delay

RFC3810, 9.12. says "the [Query Interval] in the last Query received". In
section "9.14. Configuring timers", it is said:

This section is meant to provide advice to network administrators on
how to tune these settings to their network. Ambitious router
implementations might tune these settings dynamically based upon
changing characteristics of the network. [...]

iv) RFC38010, 9.14.2. Query Interval:

The overall level of periodic MLD traffic is inversely proportional
to the Query Interval. A longer Query Interval results in a lower
overall level of MLD traffic. The value of the Query Interval MUST
be equal to or greater than the Maximum Response Delay used to
calculate the Maximum Response Code inserted in General Query
messages.

I assume that was why switchback is calculated as is (3 * max_delay), although
this setting seems to be meant for routers only to configure their [QI]
interval for non-default intervals. So usage here like this is clearly wrong.

Concluding, the current behaviour in IPv6's multicast code is not conform
to the RFC as switch back is calculated wrongly. That is, it has a too small
value, so MLDv2 hosts switch back again to MLDv2 way too early, i.e. ~30secs
instead of ~260secs on default.

Hence, introduce necessary helper functions and fix this up properly as it
should be.

Introduced in 06da92283 ("[IPV6]: Add MLDv2 support."). Credits to Hannes
Frederic Sowa who also had a hand in this as well. Also thanks to Hangbin Liu
who did initial testing.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: David Stevens <dlstevens@us.ibm.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9fd0784164047583fa856c9a5aeda1d6c6fb0399 20-Aug-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: mcast: minor: use defines for rfc3810/8.1 lengths

Instead of hard-coding length values, use a define to make it clear
where those lengths come from.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c2cef4e88808cc566956605a39a96080912c0566 20-Aug-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: minor: *_start_timer: rather use unsigned long

For the functions mld_gq_start_timer(), mld_ifc_start_timer(),
and mld_dad_start_timer(), rather use unsigned long than int
as we operate only on unsigned values anyway. This seems more
appropriate as there is no good reason to do type conversions
to int, that could lead to future errors.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
846989635b368c41efd9079ec6fe0e8d2845ec11 20-Aug-2013 Daniel Borkmann <dborkman@redhat.com> net: ipv6: igmp6_event_query: use msecs_to_jiffies

Use proper API functions to calculate jiffies from milliseconds and
not the crude method of dividing HZ by a value. This ensures more
accurate values even in the case of strange HZ values. While at it,
also simplify code in the mlh2 case by using max().

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fc4eba58b4c1462ff3d6247b66fb47d6928db6d2 14-Aug-2013 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: make unsolicited report intervals configurable for mld

Commit cab70040dfd95ee32144f02fade64f0cb94f31a0 ("net: igmp:
Reduce Unsolicited report interval to 1s when using IGMPv3") and
2690048c01f32bf45d1c1e1ab3079bc10ad2aea7 ("net: igmp: Allow user-space
configuration of igmp unsolicited report interval") by William Manley made
igmp unsolicited report intervals configurable per interface and corrected
the interval of unsolicited igmpv3 report messages resendings to 1s.

Same needs to be done for IPv6:

MLDv1 (RFC2710 7.10.): 10 seconds
MLDv2 (RFC3810 9.11.): 1 second

Both intervals are configurable via new procfs knobs
mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.

(also added .force_mld_version to ipv6_devconf_dflt to bring structs in
line without semantic changes)

v2:
a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
unsolicited_report_interval procfs knobs.
b) incorporate stylistic feedback from William Manley

v3:
a) add new DEVCONF_* values to the end of the enum (thanks to David
Miller)

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: William Manley <william.manley@youview.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9d4a0314642918cbda9ed4012df51e8df608fce6 26-Jul-2013 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv4, ipv6: send igmpv3/mld packets with TC_PRIO_CONTROL

v2:
a) Also send ipv4 igmp messages with TC_PRIO_CONTROL

Cc: William Manley <william.manley@youview.com>
Cc: Lukas Tribus <luky-37@hotmail.com>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
8965779d2c0e6ab246c82a405236b1fb2adae6b2 29-Jun-2013 Amerigo Wang <amwang@redhat.com> ipv6,mcast: always hold idev->lock before mca_lock

dingtianhong reported the following deadlock detected by lockdep:

======================================================
[ INFO: possible circular locking dependency detected ]
3.4.24.05-0.1-default #1 Not tainted
-------------------------------------------------------
ksoftirqd/0/3 is trying to acquire lock:
(&ndev->lock){+.+...}, at: [<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120

but task is already holding lock:
(&mc->mca_lock){+.+...}, at: [<ffffffff8149d130>] mld_send_report+0x40/0x150

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&mc->mca_lock){+.+...}:
[<ffffffff810a8027>] validate_chain+0x637/0x730
[<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
[<ffffffff810a8734>] lock_acquire+0x114/0x150
[<ffffffff814f691a>] rt_spin_lock+0x4a/0x60
[<ffffffff8149e4bb>] igmp6_group_added+0x3b/0x120
[<ffffffff8149e5d8>] ipv6_mc_up+0x38/0x60
[<ffffffff81480a4d>] ipv6_find_idev+0x3d/0x80
[<ffffffff81483175>] addrconf_notify+0x3d5/0x4b0
[<ffffffff814fae3f>] notifier_call_chain+0x3f/0x80
[<ffffffff81073471>] raw_notifier_call_chain+0x11/0x20
[<ffffffff813d8722>] call_netdevice_notifiers+0x32/0x60
[<ffffffff813d92d4>] __dev_notify_flags+0x34/0x80
[<ffffffff813d9360>] dev_change_flags+0x40/0x70
[<ffffffff813ea627>] do_setlink+0x237/0x8a0
[<ffffffff813ebb6c>] rtnl_newlink+0x3ec/0x600
[<ffffffff813eb4d0>] rtnetlink_rcv_msg+0x160/0x310
[<ffffffff814040b9>] netlink_rcv_skb+0x89/0xb0
[<ffffffff813eb357>] rtnetlink_rcv+0x27/0x40
[<ffffffff81403e20>] netlink_unicast+0x140/0x180
[<ffffffff81404a9e>] netlink_sendmsg+0x33e/0x380
[<ffffffff813c4252>] sock_sendmsg+0x112/0x130
[<ffffffff813c537e>] __sys_sendmsg+0x44e/0x460
[<ffffffff813c5544>] sys_sendmsg+0x44/0x70
[<ffffffff814feab9>] system_call_fastpath+0x16/0x1b

-> #0 (&ndev->lock){+.+...}:
[<ffffffff810a798e>] check_prev_add+0x3de/0x440
[<ffffffff810a8027>] validate_chain+0x637/0x730
[<ffffffff810a8417>] __lock_acquire+0x2f7/0x500
[<ffffffff810a8734>] lock_acquire+0x114/0x150
[<ffffffff814f6c82>] rt_read_lock+0x42/0x60
[<ffffffff8147f804>] ipv6_get_lladdr+0x74/0x120
[<ffffffff8149b036>] mld_newpack+0xb6/0x160
[<ffffffff8149b18b>] add_grhead+0xab/0xc0
[<ffffffff8149d03b>] add_grec+0x3ab/0x460
[<ffffffff8149d14a>] mld_send_report+0x5a/0x150
[<ffffffff8149f99e>] igmp6_timer_handler+0x4e/0xb0
[<ffffffff8105705a>] call_timer_fn+0xca/0x1d0
[<ffffffff81057b9f>] run_timer_softirq+0x1df/0x2e0
[<ffffffff8104e8c7>] handle_pending_softirqs+0xf7/0x1f0
[<ffffffff8104ea3b>] __do_softirq_common+0x7b/0xf0
[<ffffffff8104f07f>] __thread_do_softirq+0x1af/0x210
[<ffffffff8104f1c1>] run_ksoftirqd+0xe1/0x1f0
[<ffffffff8106c7de>] kthread+0xae/0xc0
[<ffffffff814fff74>] kernel_thread_helper+0x4/0x10

actually we can just hold idev->lock before taking pmc->mca_lock,
and avoid taking idev->lock again when iterating idev->addr_list,
since the upper callers of mld_newpack() already take
read_lock_bh(&idev->lock).

Reported-by: dingtianhong <dingtianhong@huawei.com>
Cc: dingtianhong <dingtianhong@huawei.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Ding Tianhong <dingtianhong@huawei.com>
Tested-by: Chen Weilong <chenweilong@huawei.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
b173ee488dcc545e77ed482158a2f0d06d7a5860 27-Jun-2013 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: resend MLD report if a link-local address completes DAD

RFC3590/RFC3810 specifies we should resend MLD reports as soon as a
valid link-local address is available.

We now use the valid_ll_addr_cnt to check if it is necessary to resend
a new report.

Changes since Flavio Leitner's version:
a) adapt for valid_ll_addr_cnt
b) resend first reports directly in the path and just arm the timer for
mc_qrv-1 resends.

Reported-by: Flavio Leitner <fleitner@redhat.com>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
29a3cad5c6ae9e7fbf1509d01d39c3c3c38f11f9 28-May-2013 Simon Horman <horms@verge.net.au> ipv6: Correct comparisons and calculations using skb->tail and skb-transport_header

This corrects an regression introduced by "net: Use 16bits for *_headers
fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In
that case skb->tail will be a pointer whereas skb->transport_header
will be an offset from head. This is corrected by using wrappers that
ensure that comparisons and calculations are always made using pointers.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
ece31ffd539e8e2b586b1ca5f50bc4f4591e3893 18-Feb-2013 Gao feng <gaofeng@cn.fujitsu.com> net: proc: change proc_net_remove to remove_proc_entry

proc_net_remove is only used to remove proc entries
that under /proc/net,it's not a general function for
removing proc entries of netns. if we want to remove
some proc entries which under /proc/net/stat/, we still
need to call remove_proc_entry.

this patch use remove_proc_entry to replace proc_net_remove.
we can remove proc_net_remove after this patch.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d4beaa66add8aebf83ab16d2fde4e4de8dac36df 18-Feb-2013 Gao feng <gaofeng@cn.fujitsu.com> net: proc: change proc_net_fops_create to proc_create

Right now, some modules such as bonding use proc_create
to create proc entries under /proc/net/, and other modules
such as ipv4 use proc_net_fops_create.

It looks a little chaos.this patch changes all of
proc_net_fops_create to proc_create. we can remove
proc_net_fops_create after this patch.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ec16ef2228829698fdbd2f6b160ff52551a6bf8a 09-Feb-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6 mcast: Do not join device multicast for interface-local multicasts.

RFC4291 (IPv6 addressing architecture) says that interface-Local scope
spans only a single interface on a node. We should not join L2 device
multicast list for addresses in interface-local (or smaller) scope.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
56db1c5f41c2ed0300ca570fea279a7c189c18fb 03-Feb-2013 Jean Sacren <sakiwit@gmail.com> mcast: do not check 'rv' twice in a row

With the loop, don't check 'rv' twice in a row. Without the loop, 'rv'
doesn't even need to be checked.

Make the comment more grammar-friendly.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
07c2fecc363c8f94a0bd2fdf9b7209ec6f5a0ab5 29-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6 mcast: Use ipv6_addr_equal() in ip6_mc_source().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2576f17dfad402e2446244238ed22dddf35c2e53 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6: Unshare ip6_nd_hdr() and change return type to void.

- move ip6_nd_hdr() to its users' source files.
In net/ipv6/mcast.c, it will be called ip6_mc_hdr().
- make return type to void since this function never fails.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12fd84f4383b15b0a12cfd50b7c527cd55d6f101 18-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6: Remove unused neigh argument for icmp6_dst_alloc() and its callers.

Because of rt->n removal, we do not need neigh argument any more.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
daad151263cf334d57fcc0270e2483d4b4639650 13-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().

Move generalized version of ipv6_is_mld() to header,
and use it from ip6_mc_input().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0e1efe9d5e10921f1e2152b108e013605fca3c9f 05-Dec-2012 Eric Dumazet <edumazet@google.com> ipv6: avoid taking locks at socket dismantle

ipv6_sock_mc_close() is called for ipv6 sockets at close time, and most
of them don't use multicast.

Add a test to avoid contention on a shared spinlock.

Same heuristic applies for ipv6_sock_ac_close(), to avoid contention
on a shared rwlock.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
94e187c01512c9cf29e2ff54bf1a1b045f38293d 29-Oct-2012 Amerigo Wang <amwang@redhat.com> ipv6: introduce ip6_rt_put()

As suggested by Eric, we could introduce a helper function
for ipv6 too, to avoid checking if rt is NULL before
dst_release().

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a858d64b7709ca7bd2ee71d66ef3b7190cdcbb7d 17-Jul-2012 Li Wei <lw@cn.fujitsu.com> ipv6: fix unappropriate errno returned for non-multicast address

We need to check the passed in multicast address and return
appropriate errno(EINVAL) if it is not valid. And it's no need
to walk through the ipv6_mc_list in this situation.

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a50feda546ac03415707a9bbcac8d6b20714db21 18-May-2012 Eric Dumazet <edumazet@google.com> ipv6: bool/const conversions phase2

Mostly bool conversions, some inline removals and const additions.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f32138319ca6541e65f95f8e17c9cc88ac1baf94 15-May-2012 Joe Perches <joe@perches.com> net: ipv6: Standardize prefixes for message logging

Add #define pr_fmt(fmt) as appropriate.

Add "IPv6: " to appropriate files.

Convert printk(KERN_<LEVEL> to pr_<level> (but not KERN_DEBUG).
Standardize on "%s: " not "%s(): " when emitting __func__.
Use "%s: ", __func__ instead of embedding function name.
Coalesce formats, align arguments.

ADDRCONF output is now prefixed with "IPv6: "

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ce713ee5a10f3a171df94b0d501034aab2388c16 05-Apr-2012 RongQing.Li <roy.qing.li@gmail.com> net: replace continue with break to reduce unnecessary loop in xxx_xmarksources

The conditional which decides to skip inactive filters does not
change with the change of loop index, so it is unnecessary to
check them many times.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
78d50217baf36093ab320f95bae0d6452daec85c 04-Apr-2012 RongQing.Li <roy.qing.li@gmail.com> ipv6: fix array index in ip6_mc_add_src()

Convert array index from the loop bound to the loop index.

And remove the void type conversion to ip6_mc_del1_src() return
code, seem it is unnecessary, since ip6_mc_del1_src() does not
use __must_check similar attribute, no compiler will report the
warning when it is removed.

v2: enrich the commit header

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c577923756b7fe9071f28a76b66b83b306d1d001 15-Mar-2012 RongQing.Li <roy.qing.li@gmail.com> ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.

ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't
need to dev_hold().
With dev_hold(), not corresponding dev_put(), will lead to leak.

[ bug introduced in 96b52e61be1 (ipv6: mcast: RCU conversions) ]

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d191854282fd831da785a5a34bc6fd16049b8578 29-Dec-2011 David S. Miller <davem@davemloft.net> ipv6: Kill rt6i_dev and rt6i_expires defines.

It just obscures that the netdevice pointer and the expires value are
implemented in the dst_entry sub-object of the ipv6 route.

And it makes grepping for dst_entry member uses much harder too.

Signed-off-by: David S. Miller <davem@davemloft.net>
87a115783eca7a424eef599d6f10a499f85f59c8 06-Dec-2011 David S. Miller <davem@davemloft.net> ipv6: Move xfrm_lookup() call down into icmp6_dst_alloc().

And return error pointers.

Signed-off-by: David S. Miller <davem@davemloft.net>
99d2f47aa9d3ad40daa6ee0770e91b95b71082f0 30-Nov-2011 Jun Zhao <mypopydev@gmail.com> ipv6 : mcast : Delete useless parameter in ip6_mc_add1_src()

Need not to used 'delta' flag when add single-source to interface
filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
4e3fd7a06dc20b2d8ec6892233ad2012968fe7b6 21-Nov-2011 Alexey Dobriyan <adobriyan@gmail.com> net: remove ipv6_addr_copy()

C assignment can handle struct in6_addr copying.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a7ae1992248e5cf9dc5bd35695ab846d27efe15f 18-Nov-2011 Herbert Xu <herbert@gondor.apana.org.au> ipv6: Remove all uses of LL_ALLOCATED_SPACE

ipv6: Remove all uses of LL_ALLOCATED_SPACE

The macro LL_ALLOCATED_SPACE was ill-conceived. It applies the
alignment to the sum of needed_headroom and needed_tailroom. As
the amount that is then reserved for head room is needed_headroom
with alignment, this means that the tail room left may be too small.

This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv6
with the macro LL_RESERVED_SPACE and direct reference to
needed_tailroom.

This also fixes the problem with needed_headroom changing between
allocating the skb and reserving the head room.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
e05c4ad3ed874ee4f5e2c969e55d318ec654332c 24-Aug-2011 Yan, Zheng <zheng.z.yan@intel.com> mcast: Fix source address selection for multicast listener report

Should check use count of include mode filter instead of total number
of include mode filters.

Signed-off-by: Zheng Yan <zheng.z.yan@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e3cbf28fa61456bd2f0ba39846ffd905257eb4b0 17-Mar-2011 Lai Jiangshan <laijs@cn.fujitsu.com> net,rcu: convert call_rcu(ipv6_mc_socklist_reclaim) to kfree_rcu()

The rcu callback ipv6_mc_socklist_reclaim() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(ipv6_mc_socklist_reclaim).

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
b71d1d426d263b0b6cb5760322efebbfc89d4463 22-Apr-2011 Eric Dumazet <eric.dumazet@gmail.com> inet: constify ip headers and in6_addr

Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4c9483b2fb5d2548c3cc1fe03cdd4484ceeb5d1c 12-Mar-2011 David S. Miller <davem@davemloft.net> ipv6: Convert to use flowi6 where applicable.

Signed-off-by: David S. Miller <davem@davemloft.net>
4b66fef9b591b95f447aea12242a1133deb0bd22 04-Mar-2011 Hagen Paul Pfeifer <hagen@jauu.net> mcast: net_device dev not used

ip6_mc_source(), ip6_mc_msfilter() as well as ip6_mc_msfget() declare
and assign dev but do not use the variable afterwards.

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
452edd598f60522c11f7f88fdbab27eb36509d1a 02-Mar-2011 David S. Miller <davem@davemloft.net> xfrm: Return dst directly from xfrm_lookup()

Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
456b61bca8ee324ab6c18b065e632c9a8c88aa39 23-Nov-2010 Eric Dumazet <eric.dumazet@gmail.com> ipv6: mcast: RCU conversion

ipv6_sk_mc_lock rwlock becomes a spinlock.

readers (inet6_mc_check()) now takes rcu_read_lock() instead of read
lock. Writers dont need to disable BH anymore.

struct ipv6_mc_socklist objects are reclaimed after one RCU grace
period.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8a22c99a80b0926585cfcbcc423ee2c49c1fd820 14-Nov-2010 Joe Perches <joe@perches.com> net/ipv6/mcast.c: Remove unnecessary semicolons

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d8d1f30b95a635dbd610dcc5eb641aca8f4768cf 11-Jun-2010 Changli Gao <xiaosuo@gmail.com> net-next: remove useless union keyword

remove useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
96b52e61be1ad4d4f8de39b9deaf253da804ea3b 07-Jun-2010 Eric Dumazet <eric.dumazet@gmail.com> ipv6: mcast: RCU conversions

- ipv6_sock_mc_join() : doesnt touch dev refcount

- ipv6_sock_mc_drop() : doesnt touch dev/idev refcounts

- ip6_mc_find_dev() becomes ip6_mc_find_dev_rcu() (called from rcu),
and doesnt touch dev/idev refcounts

- ipv6_sock_mc_close() : doesnt touch dev/idev refcounts

- ip6_mc_source() uses ip6_mc_find_dev_rcu()

- ip6_mc_msfilter() uses ip6_mc_find_dev_rcu()

- ip6_mc_msfget() uses ip6_mc_find_dev_rcu()

- ipv6_dev_mc_dec(), ipv6_chk_mcast_addr(),
igmp6_event_query(), igmp6_event_report(),
mld_sendpack(), igmp6_send() dont touch idev refcount

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
72e09ad107e78d69ff4d3b97a69f0aad2b77280f 05-Jun-2010 Eric Dumazet <eric.dumazet@gmail.com> ipv6: avoid high order allocations

With mtu=9000, mld_newpack() use order-2 GFP_ATOMIC allocations, that
are very unreliable, on machines where PAGE_SIZE=4K

Limit allocated skbs to be at most one page. (order-0 allocations)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5d55354f147bcf82b7a330bf9617b82a692b7d49 31-May-2010 Joe Perches <joe@perches.com> net/ipv6/mcast.c: Remove unnecessary kmalloc casts

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6e7cb8370760ec17e10098399822292def8d84f3 17-Apr-2010 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> ipv6 mcast: Introduce include/net/mld.h for MLD definitions.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
22bedad3ce112d5ca1eaf043d4990fa2ed698c87 01-Apr-2010 Jiri Pirko <jpirko@redhat.com> net: convert multicast list to list_head

Converts the list and the core manipulating with it to be the same as uc_list.

+uses two functions for adding/removing mc address (normal and "global"
variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
manipulation with lists on a sandbox (used in bonding and 80211 drivers)

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
b2e0b385d77069031edb957839aaaa8441b47287 23-Mar-2010 Jan Engelhardt <jengelh@medozas.de> netfilter: ipv6: use NFPROTO values for NF_HOOK invocation

The semantic patch that was used:
// <smpl>
@@
@@
(NF_HOOK
|NF_HOOK_THRESH
|nf_hook
)(
-PF_INET6,
+NFPROTO_IPV6,
...)
// </smpl>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
6457d26bd40077238799e31df2b800bcf4ef9177 18-Feb-2010 Stephen Hemminger <shemminger@vyatta.com> IPv6: convert mc_lock to spinlock

Only used for writing, so convert to spinlock

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2c8c1e7297e19bdef3c178c3ea41d898a7716e3e 17-Jan-2010 Alexey Dobriyan <adobriyan@gmail.com> net: spread __net_init, __net_exit

__net_init/__net_exit are apparently not going away, so use them
to full extent.

In some cases __net_init was removed, because it was called from
__net_exit code.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ce81b76a39835a721cd168e0c0bcfe7132f1f66b 11-Nov-2009 Eric Dumazet <eric.dumazet@gmail.com> ipv6: use RCU to walk list of network devices

No longer need read_lock(&dev_base_lock), use RCU instead.
We also can avoid taking references on inet6_dev structs.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
75c78500ddad74b229cd0691496b8549490496a2 15-Sep-2009 Moni Shoua <monis@voltaire.com> bonding: remap muticast addresses without using dev_close() and dev_open()

This patch fixes commit e36b9d16c6a6d0f59803b3ef04ff3c22c3844c10. The approach
there is to call dev_close()/dev_open() whenever the device type is changed in
order to remap the device IP multicast addresses to HW multicast addresses.
This approach suffers from 2 drawbacks:

*. It assumes tha the device is UP when calling dev_close(), or otherwise
dev_close() has no affect. It is worth to mention that initscripts (Redhat)
and sysconfig (Suse) doesn't act the same in this matter.
*. dev_close() has other side affects, like deleting entries from the routing
table, which might be unnecessary.

The fix here is to directly remap the IP multicast addresses to HW multicast
addresses for a bonding device that changes its type, and nothing else.

Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Moni Shoua <monis@voltaire.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3c2b8d180a47d041e9101ead27e49d237df0f1d1 21-Jul-2009 Gerrit Renker <gerrit@erg.abdn.ac.uk> mcastv6: Local variable shadows function argument

The local variable 'idev' shadows the function argument 'idev' to
ip6_mc_add_src(). Fixed by removing the local declaration, as pmc->idev
should be identical with 'idev' passed as argument.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
adf30907d63893e4208dfe3f5c88ae12bc2f25d5 02-Jun-2009 Eric Dumazet <eric.dumazet@gmail.com> net: skb->dst accessors

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
edf391ff17232f097d72441c9ad467bcb3b5db18 27-Apr-2009 Neil Horman <nhorman@tuxdriver.com> snmp: add missing counters for RFC 4293

The IP MIB (RFC 4293) defines stats for InOctets, OutOctets, InMcastOctets and
OutMcastOctets:
http://tools.ietf.org/html/rfc4293
But it seems we don't track those in any way that easy to separate from other
protocols. This patch adds those missing counters to the stats file. Tested
successfully by me

With help from Eric Dumazet.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
448eb71f40120a8fd11ebd58153c271c63e6f862 15-Dec-2008 Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> ipv6/mcast: join error paths using goto

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
52479b623d3d41df84c499325b6a8c7915413032 26-Nov-2008 Alexey Dobriyan <adobriyan@gmail.com> netns xfrm: lookup in netns

Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns
to flow_cache_lookup() and resolver callback.

Take it from socket or netdevice. Stub DECnet to init_net.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
07f0757a6808f2f36a0e58c3a54867ccffdb8dc9 20-Nov-2008 Joe Perches <joe@perches.com> include/net net/ - csum_partial - remove unnecessary casts

The first argument to csum_partial is const void *
casts to char/u8 * are not necessary

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4b7a4274ca63dadd9c4f17fc953f3a5d19855c4c 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> net: replace %#p6 format specifier with %pi6

gcc warns when using the # modifier with the %p format specifier,
so we can't use this to omit the colons when needed, introduces
%pi6 instead.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b071195deba14b37ce896c26f20349b46e5f9fd2 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> net: replace all current users of NIP6_SEQFMT with %#p6

The define in kernel.h can be done away with at a later time.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5a57d4c7fdac0e227efe8c5739fcbb263d9ae993 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to ICMP6MSGOUT_INC_STATS_BH

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5c5d244bd388fe498dd7f5f57cb7770aae40b9ab 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to ICMP6MSGOUT_INC_STATS

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
e41b5368e029e79d11acb5952bc73284e5026c62 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to ICMP6_INC_STATS_BH

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
a862f6a6dc89c57dd3a959a1636b59f0c27169c2 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to ICMP6_INC_STATS

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
483a47d2fe794328d29950fe00ce26dd405d9437 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to IP6_INC_STATS_BH

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3bd653c8455bc7991bae77968702b31c8f5df883 08-Oct-2008 Denis V. Lunev <den@openvz.org> netns: add net parameter to IP6_INC_STATS

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
a6ffb404dc03f806a257faaab831a6cb55c0b790 20-Jul-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> ipv6 mcast: Omit redundant address family checks in ip6_mc_source().

The caller has alredy checked for them.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
53b7997fd5c62408d10b9aafb38974ce90fd2356 20-Jul-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> ipv6 netns: Make several "global" sysctl variables namespace aware.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0b040829952d84bf2a62526f0e24b624e0699447 11-Jun-2008 Adrian Bunk <bunk@kernel.org> net: remove CVS keywords

This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9cba632e24ef18e2905c18997a8f24e8d7a29e71 23-Apr-2008 Rami Rosen <ramirose@gmail.com> ipv6 mcast: Remove unused macro (MLDV2_QQIC) from mcast.c.

This patch removes MLDV2_QQIC macro from mcast.c
as it is unused.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
f5184d267c1aedb9b7a8cc44e08ff6b8d382c3b5 13-May-2008 Johannes Berg <johannes@sipsolutions.net> net: Allow netdevices to specify needed head/tailroom

This patch adds needed_headroom/needed_tailroom members to struct
net_device and updates many places that allocate sbks to use them. Not
all of them can be converted though, and I'm sure I missed some (I
mostly grepped for LL_RESERVED_SPACE)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
d7aabf22efb50e6d52ed953ed2a43996152a7fb0 10-Apr-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Use in6addr_any where appropriate.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
f3ee4010e84452aa133e5163e6cfabc52b194e94 10-Apr-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Define constants for link-local multicast addresses.

- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
9acd9f3ae92d0dc0ca7504fb48c1040e8bbc39fe 10-Apr-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Make address arguments const.

- net/ipv6/addrconf.c:
ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
inet6_mc_check(),
ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
ipv6_chk_mcast_addr()
- net/ipv6/route.c:
rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
ip6_nd_hdr()
- net/ipv6/ndisc.c:
ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
ndisc_get_neigh(), __ndisc_send()

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
1ed8516f09e510e4595bc900ad9266c15aacfdd2 03-Apr-2008 Denis V. Lunev <den@openvz.org> [IPV6]: Simplify IPv6 control sockets creation.

Do this by replacing sock_create_kern with inet_ctl_sock_create.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1218854afa6f659be90b748cf1bc7badee954a35 25-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS.

Without CONFIG_NET_NS, no namespace other than &init_net exists,
no need to store net in seq_net_private.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
3b1e0a655f8eba44ab1ee2a1068d169ccfb853b9 25-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.

Introduce per-sock inlines: sock_net(), sock_net_set()
and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
c346dca10840a874240c78efe3f39acf4312a1f2 25-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.

Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
ea82edf704f6bf3c3a51d0dbee816c5bbc6d3974 21-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] mcast - fix compilation warning when procfs is not compiled in

When CONFIG_PROC_FS=no, the out_sock_create label is not used because
the code using it is disabled and that leads to a warning at compile
time.

This patch fix that by making a specific function to initialize proc
for igmp6, and remove the annoying CONFIG_PROC_FS sections in
init/exit function.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b8ad0cbc58f703972e9e37c4e2a8081dd7e6a551 07-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] mcast - handle several network namespace

This patch make use of the network namespace information at the right
places to handle the multicast for several network namespaces. It
makes the socket control to be per namespace too.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
606a2b4862d4be31fa55cad89871fe52a422d511 04-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] route6 - Pass the network namespace parameter to rt6_lookup

Add a network namespace parameter to rt6_lookup().

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
419271780712091ff2738aee513fa4b5b051fbdc 07-Dec-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] MCAST: Use standard path for sending MLD/MLDv2 messages.

This is changing the paths for sending MLD/MLDv2 messages
from dev_queue_xmit() to standard dst_output().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
9b0f976f27f00a81cf47643d90854659626795b4 29-Feb-2008 Denis V. Lunev <den@openvz.org> [INET]: Remove struct net_proto_family* from _init calls.

struct net_proto_family* is not used in icmp[v6]_init, ndisc_init,
igmp_init and tcp_v4_init. Remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9a429c4983deae020f1e757ecc8f547b6d4e2f2b 02-Jan-2008 Eric Dumazet <dada1@cosmosbay.com> [NET]: Add some acquires/releases sparse annotations.

Add __acquires() and __releases() annotations to suppress some sparse
warnings.

example of warnings :

net/ipv4/udp.c:1555:14: warning: context imbalance in 'udp_seq_start' - wrong
count at exit
net/ipv4/udp.c:1571:13: warning: context imbalance in 'udp_seq_stop' -
unexpected unlock

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6e23ae2a48750bda407a4a58f52a4865d7308bf5 20-Nov-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: Introduce NF_INET_ hook values

The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
b24b8a247ff65c01b252025926fe564209fae4fc 24-Jan-2008 Pavel Emelyanov <xemul@openvz.org> [NET]: Convert init_timer into setup_timer

Many-many code in the kernel initialized the timer->function
and timer->data together with calling init_timer(timer). There
is already a helper for this. Use it for networking code.

The patch is HUGE, but makes the code 130 lines shorter
(98 insertions(+), 228 deletions(-)).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cf7732e4cc14b56d593ff53352673e1fd5e3ba52 10-Oct-2007 Pavel Emelyanov <xemul@openvz.org> [NET]: Make core networking code use seq_open_private

This concerns the ipv4 and ipv6 code mostly, but also the netlink
and unix sockets.

The netlink code is an example of how to use the __seq_open_private()
call - it saves the net namespace on this private.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
cfcabdcc2d5a810208e5bb3974121b7ed60119aa 09-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: sparse warning fixes

Fix a bunch of sparse warnings. Mostly about 0 used as
NULL pointer, and shadowed variable declarations.
One notable case was that hash size should have been unsigned.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0c4e85813d0a94eeb8bf813397a4907bdd7bb610 09-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Wrap netdevice hardware header creation.

Add inline for common usage of hardware header creation, and
fix bug in IPV6 mcast where the assumption about negative return is
an errno. Negative return from hard_header means not enough space
was available,(ie -N bytes).

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14878f75abd5bf1d38becb405801cd491ee215dc 17-Sep-2007 David L Stevens <dlstevens@us.ibm.com> [IPV6]: Add ICMPMsgStats MIB (RFC 4293) [rev 2]

Background: RFC 4293 deprecates existing individual, named ICMP
type counters to be replaced with the ICMPMsgStatsTable. This table
includes entries for both IPv4 and IPv6, and requires counting of all
ICMP types, whether or not the machine implements the type.

These patches "remove" (but not really) the existing counters, and
replace them with the ICMPMsgStats tables for v4 and v6.
It includes the named counters in the /proc places they were, but gets the
values for them from the new tables. It also counts packets generated
from raw socket output (e.g., OutEchoes, MLD queries, RA's from
radvd, etc).

Changes:
1) create icmpmsg_statistics mib
2) create icmpv6msg_statistics mib
3) modify existing counters to use these
4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types
listed by number for easy SNMP parsing
5) modify /proc/net/snmp printing for "Icmp" to get the named data
from new counters.
[new to 2nd revision]
6) support per-interface ICMP stats
7) use common macro for per-device stat macros

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
881d966b48b035ab3f3aeaae0f3d3f9b584f45b2 17-Sep-2007 Eric W. Biederman <ebiederm@xmission.com> [NET]: Make the device list and device lookups per namespace.

This patch makes most of the generic device layer network
namespace safe. This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables. The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anthing in the core of the network stack that was
affected to by the change of dev_base was modified to handle
multiple network namespaces. The rest of the network stack was
simply modified to explicitly use &init_net the initial network
namespace. This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundametally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change. Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
457c4cbc5a3dde259d2a1f15d5f9785290397267 12-Sep-2007 Eric W. Biederman <ebiederm@xmission.com> [NET]: Make /proc/net per network namespace

This patch makes /proc/net per network namespace. It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to be handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
56b3d975bbce65f655c5612b4822da671f9fd9b2 11-Jul-2007 Philippe De Muyter <phdm@macqel.be> [NET]: Make all initialized struct seq_operations const.

Make all initialized struct seq_operations in net/ const

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
7562f876cd93800f2f8c89445f2a563590b24e09 04-May-2007 Pavel Emelianov <xemul@openvz.org> [NET]: Rework dev_base via list_head (v3)

Cleanup of dev_base list use, with the aim to simplify making device
list per-namespace. In almost every occasion, use of dev_base variable
and dev->next pointer could be easily replaced by for_each_netdev
loop. A few most complicated places were converted to using
first_netdev()/next_netdev().

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
27a884dc3cb63b93c2b3b643f5b31eed5f8a4d26 20-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Convert skb->tail to sk_buff_data_t

So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cfe1fc7759fdacb0c650b575daed1692bf3eaece 16-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_network_header_len

For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len,
that is precalculated tho, don't think we need to bloat skb with one more
member, so just use this new helper, reducing the number of non-skbuff.h
references to the layer headers even more.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d10ba34b001944a8d1c8adb5646140ef089c432b 15-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: More skb_put related skb_reset_transport_header

This time we have to set it to skb->tail that is not anymore equal to
skb->data, so we either add a new helper or just add the skb->tail - skb->data
offset, for now do the later.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9c70220b73908f64792422a2c39c593c4792f2c5 26-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_transport_header(skb)

For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc70ab261c9f997589546100ddec5da6bfd89c4e 13-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [ICMP6]: Introduce icmp6_hdr()

For consistency with all the other skb->h.raw accessors.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0660e03f6b18f19b6bbafe7583265a51b90daf36 26-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h

Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
95c385b4d5a71b8ad552aecaa968ea46d7da2f6a 26-Apr-2007 Neil Horman <nhorman@tuxdriver.com> [IPV6] ADDRCONF: Optimistic Duplicate Address Detection (RFC 4429) Support.

Nominally an autoconfigured IPv6 address is added to an interface in the
Tentative state (as per RFC 2462). Addresses in this state remain in this
state while the Duplicate Address Detection process operates on them to
determine their uniqueness on the network. During this period, these
tentative addresses may not be used for communication, increasing the time
before a node may be able to communicate on a network. Using Optimistic
Duplicate Address Detection, autoconfigured addresses may be used
immediately for communication on the network, as long as certain rules are
followed to avoid conflicts with other nodes during the Duplicate Address
Detection process.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9a32144e9d7b4e21341174b1a83b82a82353be86 12-Feb-2007 Arjan van de Ven <arjan@linux.intel.com> [PATCH] mark struct file_operations const 7

Many struct file_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1ab1457c42bc078e5a9becd82a7f9f940b55c53a 09-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] IPV6: Fix whitespace errors.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc63f70b8b410eb653449151821f6b8b9af6ca42 06-Feb-2007 Alexey Dobriyan <adobriyan@openvz.org> [IPV4/IPV6] multicast: Check add_grhead() return value

add_grhead() allocates memory with GFP_ATOMIC and in at least two places skb
from it passed to skb_put() without checking.

Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
d88ae4cc97b24783ee4480697fbdcc02ab4133a6 15-Jan-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] MCAST: Fix joining all-node multicast group on device initialization.

Join all-node multicast group after assignment of dev->ip6_ptr
because it must be assigned when ipv6_dev_mc_inc() is called.
This fixes Bug#7817, reported by <gernoth@informatik.uni-erlangen.de>.

Closes: 7817
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
868c86bcb5bdea7ed8d45979b17bb919af9254db 15-Nov-2006 Al Viro <viro@zeniv.linux.org.uk> [NET]: annotate csum_ipv6_magic() callers in net/*

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
a11d206d0f88e092419877c7f706cafb5e1c2e57 04-Nov-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Per-interface statistics support.

For IP MIB (RFC4293).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
8a74ff7770f4795d7580392890fdf06821c78769 08-Nov-2006 Al Viro <viro@zeniv.linux.org.uk> [IPV6]: annotate ipv6 mcast

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
ab32ea5d8a760e7dd4339634e95d7be24ee5b842 22-Sep-2006 Brian Haley <brian.haley@hp.com> [NET/IPV4/IPV6]: Change some sysctl variables to __read_mostly

Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly.

Couldn't actually measure any performance increase while testing (.3%
I consider noise), but seems like the right thing to do.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
acd6e00b8e4db542cb6bc9ddfbb4e18bbe29ce4d 18-Aug-2006 David L Stevens <dlstevens@us.ibm.com> [MCAST]: Fix filter leak on device removal.

This fixes source filter leakage when a device is removed and a
process leaves the group thereafter.

This also includes corresponding fixes for IPv6 multicast source
filters on device removal.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
0c600eda4b1c9f8f6d1bd643d494db5a29842fbe 21-Mar-2006 Ingo Oeser <ioe-lkml@rameria.de> [IPV6]: Nearly complete kzalloc cleanup for net/ipv6

Stupidly use kzalloc() instead of kmalloc()/memset()
everywhere where this is possible in net/ipv6/*.c .

Signed-off-by: Ingo Oeser <ioe-lkml@rameria.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
e80e28b6b67ecc25fa89c9129a5f70de6389b2a6 04-Feb-2006 Al Viro <viro@zeniv.linux.org.uk> [PATCH] net/ipv6/mcast.c NULL noise removal

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
7add2a439868d636910fb6a216b12c7392778956 24-Jan-2006 David L Stevens <dlstevens@us.ibm.com> [IPV6] MLDv2: fix change records when transitioning to/from inactive

The following patch fixes these problems in MLDv2:

1) Add/remove "delete" records for sending change reports when
addition of a filter results in that filter transitioning to/from
inactive. [same as recent IPv4 IGMPv3 fix]
2) Remove 2 redundant "group_type" checks (can't be IPV6_ADDR_ANY
within that loop, so checks are always true)
3) change an is_in() "return 0" to "return type == MLD2_MODE_IS_INCLUDE".
It should always be "0" to get here, but it improves code locality
to not assume it, and if some race allowed otherwise, doing
the check would return the correct result.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9343e79a7bb2d3268d68997163608b87d58d8098 17-Jan-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Preserve procfs IPV6 address output format

Procfs always output IPV6 addresses without the colon
characters, and we cannot change that.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
46b86a2da0fd14bd49765330df63a62279833acb 13-Jan-2006 Joe Perches <joe@perches.com> [NET]: Use NIP6_FMT in kernel.h

There are errors and inconsistency in the display of NIP6 strings.
ie: net/ipv6/ip6_flowlabel.c

There are errors and inconsistency in the display of NIPQUAD strings too.
ie: net/netfilter/nf_conntrack_ftp.c

This patch:
adds NIP6_FMT to kernel.h
changes all code to use NIP6_FMT
fixes net/ipv6/ip6_flowlabel.c
adds NIPQUAD_FMT to kernel.h
fixes net/netfilter/nf_conntrack_ftp.c
changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8b3a70058bfe711b2d05ba2134178bae623183ce 12-Jan-2006 Kris Katterjohn <kjak@users.sourceforge.net> [NET]: Remove more unneeded typecasts on *malloc()

This removes more unneeded casts on the return value for kmalloc(),
sock_kmalloc(), and vmalloc().

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
196433c5b788eb732fdcf92449274e302f089ce4 04-Jan-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Use macro for rwlock_t initialization.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5ab4a6c81eb3dbe32361791d1535f9153f79b0ed 27-Dec-2005 David L Stevens <dlstevens@us.ibm.com> [IPV6] mcast: Fix multiple issues in MLDv2 reports.

The below "jumbo" patch fixes the following problems in MLDv2.

1) Add necessary "ntohs" to recent "pskb_may_pull" check [breaks
all nonzero source queries on little-endian (!)]

2) Add locking to source filter list [resend of prior patch]

3) fix "mld_marksources()" to
a) send nothing when all queried sources are excluded
b) send full exclude report when source queried sources are
not excluded
c) don't schedule a timer when there's nothing to report

NOTE: RFC 3810 specifies the source list should be saved and each
source reported individually as an IS_IN. This is an obvious DOS
path, requiring the host to store and then multicast as many sources
as are queried (e.g., millions...). This alternative sends a full,
relevant report that's limited to number of sources present on the
machine.

4) fix "add_grec()" to send empty-source records when it should
The original check doesn't account for a non-empty source
list with all sources inactive; the new code keeps that
short-circuit case, and also generates the group header
with an empty list if needed.

5) fix mca_crcount decrement to be after add_grec(), which needs
its original value

These issues (other than item #1 ;-) ) were all found by Yan Zheng,
much thanks!

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6f4353d891b5e477528cd0b996e0263fecdf5d5f 27-Dec-2005 David L Stevens <dlstevens@us.ibm.com> [IPV6]: Increase default MLD_MAX_MSF to 64.

The existing default of 10 is just way too low.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
24c6927505ca77ee4ac25fb31dcd56f6506979ed 03-Dec-2005 David Stevens <dlstevens@us.ibm.com> [IGMP]: workaround for IGMP v1/v2 bug

From: David Stevens <dlstevens@us.ibm.com>

As explained at:

http://www.cs.ucsb.edu/~krishna/igmp_dos/

With IGMP version 1 and 2 it is possible to inject a unicast
report to a client which will make it ignore multicast
reports sent later by the router.

The fix is to only accept the report if is was sent to a
multicast or unicast address.

Signed-off-by: David S. Miller <davem@davemloft.net>
8713dbf05754aa777f31bf491cb60a111f7ad828 28-Oct-2005 Yan Zheng <yanzheng@21cn.com> [MCAST]: ip[6]_mc_add_src should be called when number of sources is zero

And filter mode is exclude.

Further explanation by David Stevens:

Multicast source filters aren't widely used yet, and that's really the only
feature that's affected if an application actually exercises this bug, as far
as I can tell. An ordinary filter-less multicast join should still work, and
only forwarded multicast traffic making use of filters and doing empty-source
filters with the MSFILTER ioctl would be at risk of not getting multicast
traffic forwarded to them because the reports generated would not be based on
the correct counts.

Signed-off-by: Yan Zheng <yanzheng@21cn.com
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
97300b5fdfe28c6edae926926f9467a27cf5889c 31-Oct-2005 Yan Zheng <yanzheng@21cn.com> [MCAST] IPv6: Check packet size when process Multicast

Signed-off-by: Yan Zheng <yanzheng@21cn.com
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
f12baeab9d65e2fe1b43b09b666f5efcb81b9369 28-Oct-2005 Yan Zheng <yanzheng@21cn.com> [MCAST] IPv6: Fix algorithm to compute Querier's Query Interval

5.1.3. Maximum Response Code

The Maximum Response Code field specifies the maximum time allowed
before sending a responding Report. The actual time allowed, called
the Maximum Response Delay, is represented in units of milliseconds,
and is derived from the Maximum Response Code as follows:

If Maximum Response Code < 32768,
Maximum Response Delay = Maximum Response Code

If Maximum Response Code >=32768, Maximum Response Code represents a
floating-point value as follows:

0 1 2 3 4 5 6 7 8 9 A B C D E F
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1| exp | mant |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Maximum Response Delay = (mant | 0x1000) << (exp+3)


5.1.9. QQIC (Querier's Query Interval Code)

The Querier's Query Interval Code field specifies the [Query
Interval] used by the Querier. The actual interval, called the
Querier's Query Interval (QQI), is represented in units of seconds,
and is derived from the Querier's Query Interval Code as follows:

If QQIC < 128, QQI = QQIC

If QQIC >= 128, QQIC represents a floating-point value as follows:

0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|1| exp | mant |
+-+-+-+-+-+-+-+-+

QQI = (mant | 0x10) << (exp + 3)

-- rfc3810

#define MLDV2_QQIC(value) MLDV2_EXP(0x80, 4, 3, value)
#define MLDV2_MRC(value) MLDV2_EXP(0x8000, 12, 3, value)

Above macro are defined in mcast.c. but 1 << 4 == 0x10 and 1 << 12 == 0x1000.
So the result computed by original Macro is larger.

Signed-off-by: Yan Zheng <yanzheng@21cn.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
fab10fe37ad8dc4388fc444c89ef5aefe906354f 05-Oct-2005 Yan Zheng <yanzheng@21cn.com> [MCAST] ipv6: Fix address size in grec_size

Signed-Off-By: Yan Zheng <yanzheng@21cn.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
de9daad90ecb54f3c37c3f8967d581e20d927539 15-Sep-2005 Denis Lukianov <denis@voxelsoft.com> [MCAST]: Fix MCAST_EXCLUDE line dupes

This patch fixes line dupes at /ipv4/igmp.c and /ipv6/mcast.c in the
2.6 kernel, where MCAST_EXCLUDE is mistakenly used instead of
MCAST_INCLUDE.

Signed-off-by: Denis Lukianov <denis@voxelsoft.com>
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9c05989bb2264f0fa4fc95f81d2c4e6aa2eaa24d 09-Jul-2005 David S. Miller <davem@davemloft.net> [IPV6]: Fix warning in ip6_mc_msfilter.

Signed-off-by: David S. Miller <davem@davemloft.net>
9951f036fe8a4e6b21962559c64ff13b290ff01a 09-Jul-2005 David L Stevens <dlstevens@us.ibm.com> [IPV4]: (INCLUDE,empty)/leave-group equivalence for full-state MSF APIs & errno fix

1) Adds (INCLUDE, empty)/leave-group equivalence to the full-state
multicast source filter APIs (IPv4 and IPv6)

2) Fixes an incorrect errno in the IPv6 leave-group (ENOENT should be
EADDRNOTAVAIL)

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
917f2f105ea4bbba8604e3ed55233eebda7afe6a 09-Jul-2005 David L Stevens <dlstevens@us.ibm.com> [IPV4]: multicast API "join" issues

1) In the full-state API when imsf_numsrc == 0
errno should be "0", but returns EADDRNOTAVAIL

2) An illegal filter mode change
errno should be EINVAL, but returns EADDRNOTAVAIL

3) Trying to do an any-source option without IP_ADD_MEMBERSHIP
errno should be EINVAL, but returns EADDRNOTAVAIL

4) Adds comments for the less obvious error return values

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e98231858bbfd2aca42f93d55133c2fca6df00f9 21-Jun-2005 Patrick McHardy <kaber@trash.net> [NETFILTER]: Restore netfilter assumptions in IPv6 multicast

Netfilter assumes that skb->data == skb->nh.ipv6h

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c9e3e8b6958e02230079e6817862ea2968509866 21-Jun-2005 David L Stevens <dlstevens@us.ibm.com> [IPV6]: multicast join and misc

Here is a simplified version of the patch to fix a bug in IPv6
multicasting. It:

1) adds existence check & EADDRINUSE error for regular joins
2) adds an exception for EADDRINUSE in the source-specific multicast
join (where a prior join is ok)
3) adds a missing/needed read_lock on sock_mc_list; would've raced
with destroying the socket on interface down without
4) adds a "leave group" in the (INCLUDE, empty) source filter case.
This frees unneeded socket buffer memory, but also prevents
an inappropriate interaction among the 8 socket options that
mess with this. Some would fail as if in the group when you
aren't really.

Item #4 had a locking bug in the last version of this patch; rather than
removing the idev->lock read lock only, I've simplified it to remove
all lock state in the path and treat it as a direct "leave group" call for
the (INCLUDE,empty) case it covers. Tested on an MP machine. :-)

Much thanks to HoerdtMickael <hoerdt@clarinet.u-strasbg.fr> who
reported the original bug.

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!