History log of /net/ipv6/ndisc.c
Revision Date Author Comments
8ceb9f4421983a8036855fc2510e3109a799e4c1 04-Feb-2015 Erik Kline <ek@google.com> net: ipv6: allow explicitly choosing optimistic addresses

RFC 4429 ("Optimistic DAD") states that optimistic addresses
should be treated as deprecated addresses. From section 2.1:

Unless noted otherwise, components of the IPv6 protocol stack
should treat addresses in the Optimistic state equivalently to
those in the Deprecated state, indicating that the address is
available for use but should not be used if another suitable
address is available.

Optimistic addresses are indeed avoided when other addresses are
available (i.e. at source address selection time), but they have
not heretofore been available for things like explicit bind() and
sendmsg() with struct in6_pktinfo, etc.

This change makes optimistic addresses treated more like
deprecated addresses than tentative ones.

Signed-off-by: Erik Kline <ek@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4c83acbc565d53296f1731034c5041a0fbabcaeb 24-Aug-2014 Ian Morris <ipm@chirality.org.uk> ipv6: White-space cleansing : gaps between function and symbol export

This patch makes no changes to the logic of the code but simply addresses
coding style issues as detected by checkpatch.

Both objdump and diff -w show no differences.

This patch removes some blank lines between the end of a function
definition and the EXPORT_SYMBOL_GPL macro in order to prevent
checkpatch warning that EXPORT_SYMBOL must immediately follow
a function.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
67ba4152e8b77eada6a9c64e3c2c84d6112794fc 24-Aug-2014 Ian Morris <ipm@chirality.org.uk> ipv6: White-space cleansing : Line Layouts

This patch makes no changes to the logic of the code but simply addresses
coding style issues as detected by checkpatch.

Both objdump and diff -w show no differences.

A number of items are addressed in this patch:
* Multiple spaces converted to tabs
* Spaces before tabs removed.
* Spaces in pointer typing cleansed (char *)foo etc.
* Remove space after sizeof
* Ensure spacing around comparators such as if statements.

Signed-off-by: Ian Morris <ipm@chirality.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
56ec0fb10c9f8b35a2991871ea879840d1a0cbde 24-Jul-2014 Himangi Saraogi <himangi774@gmail.com> neigh: remove exceptional & on function name

In this file, function names are otherwise used as pointers without &.

A simplified version of the Coccinelle semantic patch that makes this
change is as follows:

// <smpl>
@r@
identifier f;
@@

f(...) { ... }

@@
identifier r.f;
@@

- &f
+ f
// </smpl>

Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
b6428817190c5444294e0cc45bd571bfafbbb537 10-Jul-2014 Li RongQing <roy.qing.li@gmail.com> ipv6: fix the check when handle RA

d9333196572(ipv6: Allow accepting RA from local IP addresses.) made the wrong
check, whether or not to accept RA with source-addr found on local machine, when
accept_ra_from_local is 0.

Fixes: d9333196572(ipv6: Allow accepting RA from local IP addresses.)
Cc: Ben Greear <greearb@candelatech.com>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
d93331965729850303f6111381c1a4a9e9b8ae5a 25-Jun-2014 Ben Greear <greearb@candelatech.com> ipv6: Allow accepting RA from local IP addresses.

This can be used in virtual networking applications, and
may have other uses as well. The option is disabled by
default.

A specific use case is setting up virtual routers, bridges, and
hosts on a single OS without the use of network namespaces or
virtual machines. With proper use of ip rules, routing tables,
veth interface pairs and/or other virtual interfaces,
and applications that can bind to interfaces and/or IP addresses,
it is possibly to create one or more virtual routers with multiple
hosts attached. The host interfaces can act as IPv6 systems,
with radvd running on the ports in the virtual routers. With the
option provided in this patch enabled, those hosts can now properly
obtain IPv6 addresses from the radvd.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f2a762d8a97032e58c09c5798832e25268e1c111 25-Jun-2014 Ben Greear <greearb@candelatech.com> ipv6: Add more debugging around accept-ra logic.

This is disabled by default, just like similar debug info
already in this module. But, makes it easier to find out
why RA is not being accepted when debugging strange behaviour.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
be7a010d6fa33dca9327ad8e91844278dfd1e712 15-May-2014 Duan Jiong <duanj.fnst@cn.fujitsu.com> ipv6: update Destination Cache entries when gateway turn into host

RFC 4861 states in 7.2.5:

The IsRouter flag in the cache entry MUST be set based on the
Router flag in the received advertisement. In those cases
where the IsRouter flag changes from TRUE to FALSE as a result
of this update, the node MUST remove that router from the
Default Router List and update the Destination Cache entries
for all destinations using that neighbor as a router as
specified in Section 7.3.3. This is needed to detect when a
node that is used as a router stops forwarding packets due to
being configured as a host.

Currently, when dealing with NA Message which IsRouter flag changes from
TRUE to FALSE, the kernel only removes router from the Default Router List,
and don't update the Destination Cache entries.

Now in order to update those Destination Cache entries, i introduce
function rt6_clean_tohost().

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
73af614aedd221df8495fc8c9993c50e87f899f2 07-Dec-2013 Jiri Pirko <jiri@resnulli.us> neigh: use tbl->family to distinguish ipv4 from ipv6

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
cb5b09c17fe60056bc8f127ffc987d361c40ed4b 07-Dec-2013 Jiri Pirko <jiri@resnulli.us> neigh: wrap proc dointvec functions

This will be needed later on to provide better management of default values.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
1f9248e5606afc6485255e38ad57bdac08fa7711 07-Dec-2013 Jiri Pirko <jiri@resnulli.us> neigh: convert parms to an array

This patch converts the neigh param members to an array. This allows easier
manipulation which will be needed later on to provide better management of
default values.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
30e56918dd1e6d64350661f186657f6a6f2646e6 26-Nov-2013 Duan Jiong <duanj.fnst@cn.fujitsu.com> ipv6: judge the accept_ra_defrtr before calling rt6_route_rcv

when dealing with a RA message, if accept_ra_defrtr is false,
the kernel will not add the default route, and then deal with
the following route information options. Unfortunately, those
options maybe contain default route, so let's judge the
accept_ra_defrtr before calling rt6_route_rcv.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcd081a3aef1f7f3786067ae8dd26aaa1cf85153 16-Nov-2013 Fabio Estevam <fabio.estevam@freescale.com> net: ipv6: ndisc: Fix warning when CONFIG_SYSCTL=n

When CONFIG_SYSCTL=n the following build warning happens:

net/ipv6/ndisc.c:1730:1: warning: label 'out' defined but not used [-Wunused-label]

The 'out' label is only used when CONFIG_SYSCTL=y, so move it inside the
'ifdef CONFIG_SYSCTL' block.

Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2c861cc65ef4604011a0082e4dcdba2819aa191a 09-Sep-2013 Michal Kubeček <mkubecek@suse.cz> ipv6: don't call fib6_run_gc() until routing is ready

When loading the ipv6 module, ndisc_init() is called before
ip6_route_init(). As the former registers a handler calling
fib6_run_gc(), this opens a window to run the garbage collector
before necessary data structures are initialized. If a network
device is initialized in this window, adding MAC address to it
triggers a NETDEV_CHANGEADDR event, leading to a crash in
fib6_clean_all().

Take the event handler registration out of ndisc_init() into a
separate function ndisc_late_init() and move it after
ip6_route_init().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
b55b76b22144ab97cefcb3862bab61f088adf411 04-Sep-2013 Duan Jiong <duanj.fnst@cn.fujitsu.com> ipv6:introduce function to find route for redirect

RFC 4861 says that the IP source address of the Redirect is the
same as the current first-hop router for the specified ICMP
Destination Address, so the gateway should be taken into
consideration when we find the route for redirect.

There was once a check in commit
a6279458c534d01ccc39498aba61c93083ee0372 ("NDISC: Search over
all possible rules on receipt of redirect.") and the check
went away in commit b94f1c0904da9b8bf031667afc48080ba7c3e8c9
("ipv6: Use icmpv6_notify() to propagate redirect, instead of
rt6_redirect()").

The bug is only "exploitable" on layer-2 because the source
address of the redirect is checked to be a valid link-local
address but it makes spoofing a lot easier in the same L2
domain nonetheless.

Thanks very much for Hannes's help.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
25a6e6b84fba601eff7c28d30da8ad7cfbef0d43 03-Sep-2013 Thomas Graf <tgraf@suug.ch> ipv6: Don't depend on per socket memory for neighbour discovery messages

Allocating skbs when sending out neighbour discovery messages
currently uses sock_alloc_send_skb() based on a per net namespace
socket and thus share a socket wmem buffer space.

If a netdevice is temporarily unable to transmit due to carrier
loss or for other reasons, the queued up ndisc messages will cosnume
all of the wmem space and will thus prevent from any more skbs to
be allocated even for netdevices that are able to transmit packets.

The number of neighbour discovery messages sent is very limited,
use of alloc_skb() bypasses the socket wmem buffer size enforcement
while the manual call to skb_set_owner_w() maintains the socket
reference needed for the IPv6 output path.

This patch has orginally been posted by Eric Dumazet in a modified
form.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Stephen Warren <swarren@wwwdotorg.org>
Cc: Fabio Estevam <festevam@gmail.com>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3e25c65ed085b361cc91a8f02e028f1158c9f255 29-Aug-2013 Tim Gardner <tim.gardner@canonical.com> net: neighbour: Remove CONFIG_ARPD

This config option is superfluous in that it only guards a call
to neigh_app_ns(). Enabling CONFIG_ARPD by default has no
change in behavior. There will now be call to __neigh_notify()
for each ARP resolution, which has no impact unless there is a
user space daemon waiting to receive the notification, i.e.,
the case for which CONFIG_ARPD was designed anyways.

Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Cc: Joe Perches <joe@perches.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f564f45c451809aa3b74f577754528520d315ac1 31-Aug-2013 Cong Wang <amwang@redhat.com> vxlan: add ipv6 proxy support

This patch adds the IPv6 version of "arp_reduce", ndisc_send_na()
will be needed.

Cc: David S. Miller <davem@davemloft.net>
Cc: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
25ad6117e73656071b38fd19fa67ae325471c758 30-Aug-2013 David S. Miller <davem@davemloft.net> Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages"

This reverts commit 1f324e38870cc09659cf23bc626f1b8869e201f2.

It seems to cause regressions, and in particular the output path
really depends upon there being a socket attached to skb->sk for
checks such as sk_mc_loop(skb->sk) for example. See ip6_output_finish2().

Reported-by: Stephen Warren <swarren@wwwdotorg.org>
Reported-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
816c5b5b016acde51f0547cf1657d87cf22a6d47 28-Aug-2013 Thomas Graf <tgraf@suug.ch> ipv6: Remove redundant sk variable

A sk variable initialized to ndisc_sk is already available outside
of the branch.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
1f324e38870cc09659cf23bc626f1b8869e201f2 28-Aug-2013 Thomas Graf <tgraf@suug.ch> ipv6: Don't depend on per socket memory for neighbour discovery messages

Allocating skbs when sending out neighbour discovery messages
currently uses sock_alloc_send_skb() based on a per net namespace
socket and thus share a socket wmem buffer space.

If a netdevice is temporarily unable to transmit due to carrier
loss or for other reasons, the queued up ndisc messages will cosnume
all of the wmem space and will thus prevent from any more skbs to
be allocated even for netdevices that are able to transmit packets.

The number of neighbour discovery messages sent is very limited,
simply use alloc_skb() and don't depend on any socket wmem space any
longer.

This patch has orginally been posted by Eric Dumazet in a modified
form.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b800c3b966bcf004bd8592293a49ed5cb7ea67a9 27-Aug-2013 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: drop fragmented ndisc packets by default (RFC 6980)

This patch implements RFC6980: Drop fragmented ndisc packets by
default. If a fragmented ndisc packet is received the user is informed
that it is possible to disable the check.

Cc: Fernando Gont <fernando@gont.com.ar>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c92a59eca86f5d13ae4d481c3bae6b54609fe006 21-Aug-2013 Duan Jiong <duanj.fnst@cn.fujitsu.com> ipv6: handle Redirect ICMP Message with no Redirected Header option

rfc 4861 says the Redirected Header option is optional, so
the kernel should not drop the Redirect Message that has no
Redirected Header option. In this patch, the function
ip6_redirect_no_header() is introduced to deal with that
condition.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
2ac3ac8f86f2fe065d746d9a9abaca867adec577 01-Aug-2013 Michal Kubeček <mkubecek@suse.cz> ipv6: prevent fib6_run_gc() contention

On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.

This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.

Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
f2f79cca13e36972978ffd1fb6483c8a1141b510 13-Jul-2013 Daniel Baluta <dbaluta@ixiacom.com> ndisc: bool initializations should use true and false

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
33be081a812098567898bbe23b581460331a986f 31-May-2013 Matthias Schiffer <mschiffer@universe-factory.net> ipv6: ndisc: fix ndisc_send_redirect writing to the wrong skb

Since some refactoring in 5f5a011, ndisc_send_redirect called
ndisc_fill_redirect_hdr_option on the wrong skb, leading to data corruption or
in the worst case a panic when the skb_put failed.

Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
29a3cad5c6ae9e7fbf1509d01d39c3c3c38f11f9 28-May-2013 Simon Horman <horms@verge.net.au> ipv6: Correct comparisons and calculations using skb->tail and skb-transport_header

This corrects an regression introduced by "net: Use 16bits for *_headers
fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In
that case skb->tail will be a pointer whereas skb->transport_header
will be an offset from head. This is corrected by using wrappers that
ensure that comparisons and calculations are always made using pointers.

Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
351638e7deeed2ec8ce451b53d33921b3da68f83 28-May-2013 Jiri Pirko <jiri@resnulli.us> net: pass info struct via netdevice notifier

So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>

v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller <davem@davemloft.net>
80580d4b20e13383009b4fb2043235a7d0a42589 08-Mar-2013 Thomas Graf <tgraf@suug.ch> ipv6: ndisc: remove redundant check for !dev->addr_len

send_sllao is already initialized with the value of dev->addr_len

Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4d5c152e8694b57094e320060be4621dee5779dd 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Use compound literals to build redirect message.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1cb3fe513f62ca7c5963d551e01d75aa4f6ca72a 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Break down ndisc_build_skb() and build message directly.

Construct NS/NA/RS message directly using C99 compound literals.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
b44b5f4ae96de87e458ef73fc01e7ec0dbac10c0 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Break down __ndisc_send().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7b3d9b06d8b6407fe25e00e627b48ec0d4a2a076 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Fill in ICMPv6 checksum and IPv6 header in ndisc_send_skb().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f4de84c64e3d801965ee4c6dadd19b76b9424bbf 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Use ndisc_send_skb() for redirect.

Reuse dst if one is attached with skb.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
aa4bdd4b3f43274252c1280f6464850b0f004fc8 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Remove icmp6h argument from ndisc_send_skb().

skb_transport_header() (thus icmp6_hdr()) is available here,
use it.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5f5a0115635bf678d267cbc72169a718e8724d03 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Make ndisc_fill_xxx_option() for sk_buff.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2ce13576144ade3e9340efce9620a6dd338aedc7 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Calculate message body length and option length separately.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5135e633f92ab4deb3600a30cbbec6e0929fc8a4 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Reset skb->trasport_headner inside ndisc_alloc_send_skb().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
527a150fb2292a59ca0545dace8d482581253532 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Defer building IPv6 header.

Build ICMPv6 message first and make buffer management easier;
we can use skb->len when filling checksum in ICMPv6 header,
and then build IP header with length field.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
af9a997629488ba5f92dc3d76e1ec51fcd759c5d 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Remove dev argument for ndisc_send_skb().

Since we have skb->dev, use it.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f382d03ad003815be6dc268711659d4475fb7d28 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Set skb->dev and skb->protocol inside ndisc_alloc_skb().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c8d6c380d9463e1405d821db7071e39f63acfb28 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Simplify arguments for ip6_nd_hdr().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2576f17dfad402e2446244238ed22dddf35c2e53 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6: Unshare ip6_nd_hdr() and change return type to void.

- move ip6_nd_hdr() to its users' source files.
In net/ipv6/mcast.c, it will be called ip6_mc_hdr().
- make return type to void since this function never fails.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
de09334b9326632bbf1a74bfd8b01866cbbf2f61 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Introduce ndisc_alloc_skb() helper.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9c86dafe94f03679b77d85915e65da1304005a7c 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Introduce ndisc_fill_redirect_hdr_option().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6bce6b4e16e46cc860175b9e10a283194ef9f004 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Use skb_linearize() instead of pskb_may_pull(skb, skb->len).

Suggested by Eric Dumazet <edumazet@google.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c558e9fca876d6d0463a7c337d2291774b9c6f96 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Move ndisc_opt_addr_space() to include/net/ndisc.h.

This also makes ndisc_opt_addr_data() and ndisc_fill_addr_option()
use ndisc_opt_addr_space().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
315ff09dbaebc6061948102d7ed8781d4ee46f36 21-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Reduce number of arguments for ndisc_fill_addr_option().

Add pointer to struct net_device (dev) and remove
data_len (= dev->addr_len) and addr_type (= dev->type).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fb568637e58a8459edd551af1160f27ea53e8367 20-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Make several arguments for ndisc_send_na() boolean.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ca97a644d752b46e5e08526e36705c3b0dd03f5f 20-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6: Introduce ipv6_addr_is_solict_mult() to check Solicited Node Multicast Addresses.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
115b0aa6b444e8dd89b7f67b77b8c472763fbc1a 18-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Check NS message length before access.

Check message length before accessing "target" field,
as we do for other types.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12fd84f4383b15b0a12cfd50b7c527cd55d6f101 18-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6: Remove unused neigh argument for icmp6_dst_alloc() and its callers.

Because of rt->n removal, we do not need neigh argument any more.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
71bcdba06db91ceaaffe019b6c958b5faf06012a 05-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Use struct rd_msg for redirect message.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
b7dc8c3959fd43bfa0dbcf65375628c86665cb94 04-Jan-2013 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Remove unused space at tail of skb for ndisc messages. (TAKE 3)

Currently, the size of skb allocated for NDISC is MAX_HEADER +
LL_RESERVED_SPACE(dev) + packet length + dev->needed_tailroom,
but only LL_RESERVED_SPACE(dev) bytes is "reserved" for headers.
As a result, the skb looks like this (after construction of the
message):

head data tail end
+--------------------------------------------------------------+
+ | | | |
+--------------------------------------------------------------+
|<-hlen---->|<---ipv6 packet------>|<--tlen-->|<--MAX_HEADER-->|
=LL_ = dev
RESERVED_ ->needed_
SPACE(dev) tailroom

As the name implies, "MAX_HEADER" is used for headers, and should
be "reserved" in prior to packet construction. Or, if some space
is really required at the tail of ther skb, it should be
explicitly documented.

We have several option after construction of NDISC message:

Option 1:

head data tail end
+---------------------------------------------+
+ | | |
+---------------------------------------------+
|<-hlen---->|<---ipv6 packet------>|<--tlen-->|
=LL_ = dev
RESERVED_ ->needed_
SPACE(dev) tailroom

Option 2:

head data tail end
+--------------------------------------------------+
+ | | |
+--------------------------------------------------+
|<--MAX_HEADER-->|<---ipv6 packet------>|<--tlen-->|
= dev
->needed_
tailroom

Option 3:

head data tail end
+--------------------------------------------------------------+
+ | | | |
+--------------------------------------------------------------+
|<--MAX_HEADER-->|<-hlen---->|<---ipv6 packet------>|<--tlen-->|
=LL_ = dev
RESERVED_ ->needed_
SPACE(dev) tailroom

Our tunnel drivers try expanding headroom and the space for tunnel
encapsulation was not a mandatory space -- so we are not seeing
bugs here --, but just for optimization for performance critial
situations.

Since NDISC messages are not performance critical unlike TCP,
and as we know outgoing device, LL_RESERVED_SPACE(dev) should be
just enough for the device in most (if not all) cases:
LL_RESERVED_SPACE(dev) <= LL_MAX_HEADER <= MAX_HEADER
Note that LL_RESERVED_SPACE(dev) is also enough for NDISC over
SIT (e.g., ISATAP).

So, I think Option 1 is just fine here.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
093d04d42fa094f6740bb188f0ad0c215ff61e2c 14-Dec-2012 Duan Jiong <djduanjiong@gmail.com> ipv6: Change skb->data before using icmpv6_notify() to propagate redirect

In function ndisc_redirect_rcv(), the skb->data points to the transport
header, but function icmpv6_notify() need the skb->data points to the
inner IP packet. So before using icmpv6_notify() to propagate redirect,
change skb->data to point the inner IP packet that triggered the sending
of the Redirect, and introduce struct rd_msg to make it easy.

Signed-off-by: Duan Jiong <djduanjiong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7bdc1b4abab3af0a803300f706c9814ef4e20a3e 13-Dec-2012 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ndisc: Fix padding error in link-layer address option.

If a natural number n exists where 2 + data_len <= 8n < 2 + data_len + pad,
post padding is not initialized correctly.

(Un)fortunately, the only type that requires pad is Infiniband,
whose pad is 2 and data_len is 20, and this logical error has not
become obvious, but it is better to fix.

Note that ndisc_opt_addr_space() handles the situation described
above correctly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fd0ea7dbfae16015e72c4bbc6b1b43fffc3b914f 12-Dec-2012 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> ndisc: Unexport ndisc_{build,send}_skb().

These symbols were exported for bonding device by commit 305d552a
("bonding: send IPv6 neighbor advertisement on failover").

It bacame obsolete by commit 7c899432 ("bonding, ipv4, ipv6, vlan: Handle
NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS") and removed by
commit 4f5762ec ("bonding: Remove obsolete source file 'bond_ipv6.c'").

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
aeaf6e9d2f49d793d3eb8c1af4095cf25e061b94 30-Nov-2012 Shmulik Ladkani <shmulik.ladkani@gmail.com> ipv6: unify logic evaluating inet6_dev's accept_ra property

As of 026359b [ipv6: Send ICMPv6 RSes only when RAs are accepted], the
logic determining whether to send Router Solicitations is identical
to the logic determining whether kernel accepts Router Advertisements.

However the condition itself is repeated in several code locations.

Unify it by introducing 'ipv6_accept_ra()' accessor.

Also, simplify the condition expression, making it more readable.
No semantic change.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5cb04436eef62aa8f5c482f8ec8deba391dea465 06-Nov-2012 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: add knob to send unsolicited ND on link-layer address change

This patch introduces a new knob ndisc_notify. If enabled, the kernel
will transmit an unsolicited neighbour advertisement on link-layer address
change to update the neighbour tables of the corresponding hosts more quickly.

This is the equivalent to arp_notify in ipv4 world.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9fafd65ad407d4e0c96919a325f568dd95d032af 12-Nov-2012 YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org> ipv6 ndisc: Use pre-defined in6addr_linklocal_allnodes.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
60713a0ca7fd6651b951cc1b4dbd528d1fc0281b 06-Nov-2012 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv6: send unsolicited neighbour advertisements to all-nodes

As documented in RFC4861 (Neighbor Discovery for IP version 6) 7.2.6.,
unsolicited neighbour advertisements should be sent to the all-nodes
multicast address.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
b20b6d972624ff024023012e38a067cb5086270e 07-Nov-2012 Nicolas Dichtel <nicolas.dichtel@6wind.com> ndisc: fix a typo in a comment in ndisc_recv_na()

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
94e187c01512c9cf29e2ff54bf1a1b045f38293d 29-Oct-2012 Amerigo Wang <amwang@redhat.com> ipv6: introduce ip6_rt_put()

As suggested by Eric, we could introduce a helper function
for ipv6 too, to avoid checking if rt is NULL before
dst_release().

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b94f1c0904da9b8bf031667afc48080ba7c3e8c9 12-Jul-2012 David S. Miller <davem@davemloft.net> ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().

And delete rt6_redirect(), since it is no longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
e8599ff4b1d6b0d61e1074ae4ba9fca8dd0c41d0 12-Jul-2012 David S. Miller <davem@davemloft.net> ipv6: Move bulk of redirect handling into rt6_redirect().

This sets things up so that we can have the protocol error handlers
call down into the ipv6 route code for redirects just as ipv4 already
does.

Signed-off-by: David S. Miller <davem@davemloft.net>
30f2a5f379d0b4b4e733df138a49e054ebf75ff8 12-Jul-2012 David S. Miller <davem@davemloft.net> ipv6: Export ndisc option parsing from ndisc.c

This is going to be used internally by the rt6 redirect code.

Signed-off-by: David S. Miller <davem@davemloft.net>
1d861aa4b3fb08822055345f480850205ffe6170 10-Jul-2012 David S. Miller <davem@davemloft.net> inet: Minimize use of cached route inetpeer.

Only use it in the absolutely required cases:

1) COW'ing metrics

2) ipv4 PMTU

3) ipv4 redirects

Signed-off-by: David S. Miller <davem@davemloft.net>
fbfe95a42e90b3dd079cc9019ba7d7700feee0f6 09-Jun-2012 David S. Miller <davem@davemloft.net> inet: Create and use rt{,6}_get_peer_create().

There's a lot of places that open-code rt{,6}_get_peer() only because
they want to set 'create' to one. So add an rt{,6}_get_peer_create()
for their sake.

There were also a few spots open-coding plain rt{,6}_get_peer() and
those are transformed here as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
a50feda546ac03415707a9bbcac8d6b20714db21 18-May-2012 Eric Dumazet <edumazet@google.com> ipv6: bool/const conversions phase2

Mostly bool conversions, some inline removals and const additions.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
675418d5187785d3d996ca15fd700f5e02901cbc 16-May-2012 Joe Perches <joe@perches.com> net: ipv6: ndisc: Neaten ND_PRINTx macros

Why use several macros when one will do?

Convert the multiple ND_PRINTKx macros to a single
ND_PRINTK macro. Use the new net_<level>_ratelimited
mechanism too.

Add pr_fmt with "ICMPv6: " as prefix.
Remove embedded ICMPv6 prefixes from messages.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f32138319ca6541e65f95f8e17c9cc88ac1baf94 15-May-2012 Joe Perches <joe@perches.com> net: ipv6: Standardize prefixes for message logging

Add #define pr_fmt(fmt) as appropriate.

Add "IPv6: " to appropriate files.

Convert printk(KERN_<LEVEL> to pr_<level> (but not KERN_DEBUG).
Standardize on "%s: " not "%s(): " when emitting __func__.
Use "%s: ", __func__ instead of embedding function name.
Coalesce formats, align arguments.

ADDRCONF output is now prefixed with "IPv6: "

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
211ed865108e24697b44bee5daac502ee6bdd4a4 10-May-2012 Paul Gortmaker <paul.gortmaker@windriver.com> net: delete all instances of special processing for token ring

We are going to delete the Token ring support. This removes any
special processing in the core networking for token ring, (aside
from net/tr.c itself), leaving the drivers and remaining tokenring
support present but inert.

The mass removal of the drivers and net/tr.c will be in a separate
commit, so that the history of these files that we still care
about won't have the giant deletion tied into their history.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
1716a96101c49186bb0b8491922fd3e69030235f 06-Apr-2012 Gao feng <gaofeng@cn.fujitsu.com> ipv6: fix problem with expired dst cache

If the ipv6 dst cache which copy from the dst generated by ICMPV6 RA packet.
this dst cache will not check expire because it has no RTF_EXPIRES flag.
So this dst cache will always be used until the dst gc run.

Change the struct dst_entry,add a union contains new pointer from and expires.
When rt6_info.rt6i_flags has no RTF_EXPIRES flag,the dst.expires has no use.
we can use this field to point to where the dst cache copy from.
The dst.from is only used in IPV6.

rt6_check_expired check if rt6_info.dst.from is expired.

ip6_rt_copy only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.

ip6_dst_destroy release the ort.

Add some functions to operate the RTF_EXPIRES flag and expires(from) together.
and change the code to use these new adding functions.

Changes from v5:
modify ip6_route_add and ndisc_router_discovery to use new adding functions.

Only set dst.from when the ort has flag RTF_ADDRCONF
and RTF_DEFAULT.then hold the ort.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e35f30c131a562bafd069820a6983fd4023e606e 06-Apr-2012 Alexey I. Froloff <raorn@raorn.name> Treat ND option 31 as userland (DNSSL support)

As specified in RFC6106, DNSSL option contains one or more domain names
of DNS suffixes. 8-bit identifier of the DNSSL option type as assigned
by the IANA is 31. This option should also be treated as userland.

Signed-off-by: Alexey I. Froloff <raorn@raorn.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
c78679e8f31b86c7a46e77a3096011f911854187 02-Apr-2012 David S. Miller <davem@davemloft.net> ipv6: Stop using NLA_PUT*().

These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.

Signed-off-by: David S. Miller <davem@davemloft.net>
5095d64db1b978bdb31d30fed9e47dbf04f729be 21-Feb-2012 RongQing.Li <roy.qing.li@gmail.com> ipv6: ip6_route_output() never returns NULL.

ip6_route_output() never returns NULL, so it is wrong to
check if the return value is NULL.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4991969a1027826c3db19dd3e600e145603e6928 28-Jan-2012 David S. Miller <davem@davemloft.net> ipv6: Remove neigh argument from ndisc_send_redirect()

Instead, compute it as-needed inside of that function using
dst_neigh_lookup().

Signed-off-by: David S. Miller <davem@davemloft.net>
eb857186eb771998fc9ab4bfd398a6fedb5a295c 28-Jan-2012 David S. Miller <davem@davemloft.net> ipv6: ndisc: Convert to dst_neigh_lookup()

Now all code paths grab a local reference to the neigh, so if neigh
is not NULL we unconditionally release it at the end. The old logic
would only release if we didn't have a non-NULL 'rt'.

Signed-off-by: David S. Miller <davem@davemloft.net>
e6bff995f8fe78f74cbe8f14bf6a31f3560b9ce4 04-Jan-2012 Neil Horman <nhorman@tuxdriver.com> ipv6: Check RA for sllao when configuring optimistic ipv6 address (v2)

Recently Dave noticed that a test we did in ipv6_add_addr to see if we next hop
route for the interface we're adding an addres to was wrong (see commit
7ffbcecbeed91e5874e9a1cfc4c0cbb07dac3069). for one, it never triggers, and two,
it was completely wrong to begin with. This test was meant to cover this
section of RFC 4429:

3.3 Modifications to RFC 2462 Stateless Address Autoconfiguration

* (modifies section 5.5) A host MAY choose to configure a new address
as an Optimistic Address. A host that does not know the SLLAO
of its router SHOULD NOT configure a new address as Optimistic.
A router SHOULD NOT configure an Optimistic Address.

This patch should bring us into proper compliance with the above clause. Since
we only add a SLAAC address after we've received a RA which may or may not
contain a source link layer address option, we can pass a pointer to that option
to addrconf_prefix_rcv (which may be null if the option is not present), and
only set the optimistic flag if the option was found in the RA.

Change notes:
(v2) modified the new parameter to addrconf_prefix_rcv to be a bool rather than
a pointer to make its use more clear as per request from davem.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
d191854282fd831da785a5a34bc6fd16049b8578 29-Dec-2011 David S. Miller <davem@davemloft.net> ipv6: Kill rt6i_dev and rt6i_expires defines.

It just obscures that the netdevice pointer and the expires value are
implemented in the dst_entry sub-object of the ipv6 route.

And it makes grepping for dst_entry member uses much harder too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2c2aba6c561ac425602f4a0be61422224cb87151 28-Dec-2011 David S. Miller <davem@davemloft.net> ipv6: Use universal hash for NDISC.

In order to perform a proper universal hash on a vector of integers,
we have to use different universal hashes on each vector element.

Which means we need 4 different hash randoms for ipv6.

Signed-off-by: David S. Miller <davem@davemloft.net>
87a115783eca7a424eef599d6f10a499f85f59c8 06-Dec-2011 David S. Miller <davem@davemloft.net> ipv6: Move xfrm_lookup() call down into icmp6_dst_alloc().

And return error pointers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2721745501a26d0dc3b88c0d2f3aa11471891388 02-Dec-2011 David Miller <davem@davemloft.net> net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.

To reflect the fact that a refrence is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Roland Dreier <roland@purestorage.com>
76cc714ed5fe6ed90aad5c52ff3030f1f4e22a48 25-Jul-2011 David Miller <davem@davemloft.net> neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables.

Let the core self-size the neigh entry based upon the key length.

Signed-off-by: David S. Miller <davem@davemloft.net>
4d65a2465f6f2694de67777a8aedb1272f473979 23-Nov-2011 Li Wei <lw@cn.fujitsu.com> ipv6: fix a bug in ndisc_send_redirect

Release skb when transmit rate limit _not_ allow

Signed-off-by: Li Wei <lw@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4e3fd7a06dc20b2d8ec6892233ad2012968fe7b6 21-Nov-2011 Alexey Dobriyan <adobriyan@gmail.com> net: remove ipv6_addr_copy()

C assignment can handle struct in6_addr copying.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a7ae1992248e5cf9dc5bd35695ab846d27efe15f 18-Nov-2011 Herbert Xu <herbert@gondor.apana.org.au> ipv6: Remove all uses of LL_ALLOCATED_SPACE

ipv6: Remove all uses of LL_ALLOCATED_SPACE

The macro LL_ALLOCATED_SPACE was ill-conceived. It applies the
alignment to the sum of needed_headroom and needed_tailroom. As
the amount that is then reserved for head room is needed_headroom
with alignment, this means that the tail room left may be too small.

This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv6
with the macro LL_RESERVED_SPACE and direct reference to
needed_tailroom.

This also fixes the problem with needed_headroom changing between
allocating the skb and reserving the head room.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
8b5c171bb3dc0686b2647a84e990199c5faa9ef8 09-Nov-2011 Eric Dumazet <eric.dumazet@gmail.com> neigh: new unresolved queue limits

Le mercredi 09 novembre 2011 à 16:21 -0500, David Miller a écrit :
> From: David Miller <davem@davemloft.net>
> Date: Wed, 09 Nov 2011 16:16:44 -0500 (EST)
>
> > From: Eric Dumazet <eric.dumazet@gmail.com>
> > Date: Wed, 09 Nov 2011 12:14:09 +0100
> >
> >> unres_qlen is the number of frames we are able to queue per unresolved
> >> neighbour. Its default value (3) was never changed and is responsible
> >> for strange drops, especially if IP fragments are used, or multiple
> >> sessions start in parallel. Even a single tcp flow can hit this limit.
> > ...
> >
> > Ok, I've applied this, let's see what happens :-)
>
> Early answer, build fails.
>
> Please test build this patch with DECNET enabled and resubmit. The
> decnet neigh layer still refers to the removed ->queue_len member.
>
> Thanks.

Ouch, this was fixed on one machine yesterday, but not the other one I
used this morning, sorry.

[PATCH V5 net-next] neigh: new unresolved queue limits

unres_qlen is the number of frames we are able to queue per unresolved
neighbour. Its default value (3) was never changed and is responsible
for strange drops, especially if IP fragments are used, or multiple
sessions start in parallel. Even a single tcp flow can hit this limit.

$ arp -d 192.168.20.108 ; ping -c 2 -s 8000 192.168.20.108
PING 192.168.20.108 (192.168.20.108) 8000(8028) bytes of data.
8008 bytes from 192.168.20.108: icmp_seq=2 ttl=64 time=0.322 ms

Signed-off-by: David S. Miller <davem@davemloft.net>
9f56220fad0d13f8b0ebe7592f41fdb49874d143 25-Oct-2011 Andreas Hofmeister <andi@collax.com> ipv6: Do not use routes from locally generated RAs

When hybrid mode is enabled (accept_ra == 2), the kernel also sees RAs
generated locally. This is useful since it allows the kernel to auto-configure
its own interface addresses.

However, if 'accept_ra_defrtr' and/or 'accept_ra_rtr_pref' are set and the
locally generated RAs announce the default route and/or other route information,
the kernel happily inserts bogus routes with its own address as gateway.

With this patch, adding routes from an RA will be skiped when the RAs source
address matches any local address, just as if 'accept_ra_defrtr' and
'accept_ra_rtr_pref' were set to 0.

Signed-off-by: Andreas Hofmeister <andi@collax.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
01b7806cdce3d3cf1626a1e79389f30512703069 03-Oct-2011 Roy.Li <rongqing.li@windriver.com> ipv6: remove a rcu_read_lock in ndisc_constructor

in6_dev_get(dev) takes a reference on struct inet6_dev, we dont need
rcu locking in ndisc_constructor()

Signed-off-by: Roy.Li <rongqing.li@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cfdf76474e1d8a56ac6cfae39f8559cfe9dfd7fd 27-Jul-2011 Eric Dumazet <eric.dumazet@gmail.com> ipv6: some RCU conversions

ICMP and ND are not fast path, but still we can avoid changing idev
refcount, using RCU.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
69cce1d1404968f78b177a0314f5822d5afdbbfb 18-Jul-2011 David S. Miller <davem@davemloft.net> net: Abstract dst->neighbour accesses behind helpers.

dst_{get,set}_neighbour()

Signed-off-by: David S. Miller <davem@davemloft.net>
9cbb7ecbcff85077bb12301aaf4c9b5a56c5993d 18-Jul-2011 David S. Miller <davem@davemloft.net> ipv6: Get rid of rt6i_nexthop macro.

It just makes it harder to see 1) what the code is doing
and 2) grep for all users of dst{->,.}neighbour

Signed-off-by: David S. Miller <davem@davemloft.net>
8f40b161de4f27402b4c0659ad2ae83fad5a0cdd 17-Jul-2011 David S. Miller <davem@davemloft.net> neigh: Pass neighbour entry to output ops.

This will get us closer to being able to do "neigh stuff"
completely independent of the underlying dst_entry for
protocols (ipv4/ipv6) that wish to do so.

We will also be able to make dst entries neigh-less.

Signed-off-by: David S. Miller <davem@davemloft.net>
542d4d685febf3110d1a08d0bcb9f6ef060b76f7 17-Jul-2011 David S. Miller <davem@davemloft.net> neigh: Kill ndisc_ops->queue_xmit

It is always dev_queue_xmit().

Signed-off-by: David S. Miller <davem@davemloft.net>
47ec132a40d788d45e2f088545dea68798034dab 17-Jul-2011 David S. Miller <davem@davemloft.net> neigh: Kill neigh_ops->hh_output

It's always dev_queue_xmit().

Signed-off-by: David S. Miller <davem@davemloft.net>
ad246c992bea6d33c6421ba1f03e2b405792adf9 26-Apr-2011 Ben Hutchings <bhutchings@solarflare.com> ipv4, ipv6, bonding: Restore control over number of peer notifications

For backward compatibility, we should retain the module parameters and
sysfs attributes to control the number of peer notifications
(gratuitous ARPs and unsolicited NAs) sent after bonding failover.
Also, it is possible for failover to take place even though the new
active slave does not have link up, and in that case the peer
notification should be deferred until it does.

Change ipv4 and ipv6 so they do not automatically send peer
notifications on bonding failover.

Change the bonding driver to send separate NETDEV_NOTIFY_PEERS
notifications when the link is up, as many times as requested. Since
it does not directly control which protocols send notifications, make
num_grat_arp and num_unsol_na aliases for a single parameter. Bump
the bonding version number and update its documentation.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Acked-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b71d1d426d263b0b6cb5760322efebbfc89d4463 22-Apr-2011 Eric Dumazet <eric.dumazet@gmail.com> inet: constify ip headers and in6_addr

Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers
where possible, to make code intention more obvious.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7c89943236750537d26421d9bbb6f6575e2d1e1b 15-Apr-2011 Ben Hutchings <bhutchings@solarflare.com> bonding, ipv4, ipv6, vlan: Handle NETDEV_BONDING_FAILOVER like NETDEV_NOTIFY_PEERS

It is undesirable for the bonding driver to be poking into higher
level protocols, and notifiers provide a way to avoid that. This does
mean removing the ability to configure reptitition of gratuitous ARPs
and unsolicited NAs.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f47b94646f30529624c82ab0f9cd5bd3f25ef9d2 15-Apr-2011 Ben Hutchings <bhutchings@solarflare.com> ipv6: Send unsolicited neighbour advertismements when notified

The NETDEV_NOTIFY_PEERS notifier is a request to send such
advertisements following migration to a different physical link,
e.g. virtual machine migration.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bd015928bb1713691068c4d0d159afccbaf0f8c0 13-Apr-2011 Daniel Walter <sahne@0x90.at> ipv6: ignore looped-back NA while dad is running

[ipv6] Ignore looped-back NAs while in Duplicate Address Detection

If we send an unsolicited NA shortly after bringing up an
IPv6 address, the duplicate address detection algorithm
fails and the ip stays in tentative mode forever.
This is due a missing check if the NA is looped-back to us.

Signed-off-by: Daniel Walter <dwalter@barracuda.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
93ca3bb5df9bc8b2c60485e1cc6507c3d7c8e1fa 29-Mar-2011 Timo Teräs <timo.teras@iki.fi> net: gre: provide multicast mappings for ipv4 and ipv6

My commit 6d55cb91a0020ac0 (gre: fix hard header destination
address checking) broke multicast.

The reason is that ip_gre used to get ipgre_header() calls with
zero destination if we have NOARP or multicast destination. Instead
the actual target was decided at ipgre_tunnel_xmit() time based on
per-protocol dissection.

Instead of allowing the "abuse" of ->header() calls with invalid
destination, this creates multicast mappings for ip_gre. This also
fixes "ip neigh show nud noarp" to display the proper multicast
mappings used by the gre device.

Reported-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Acked-by: Doug Kehn <rdkehn@yahoo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4c9483b2fb5d2548c3cc1fe03cdd4484ceeb5d1c 12-Mar-2011 David S. Miller <davem@davemloft.net> ipv6: Convert to use flowi6 where applicable.

Signed-off-by: David S. Miller <davem@davemloft.net>
452edd598f60522c11f7f88fdbab27eb36509d1a 02-Mar-2011 David S. Miller <davem@davemloft.net> xfrm: Return dst directly from xfrm_lookup()

Instead of on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
92d8682926342d2b6aa5b2ecc02221e00e1573a0 05-Feb-2011 David S. Miller <davem@davemloft.net> inetpeer: Move ICMP rate limiting state into inet_peer entries.

Like metrics, the ICMP rate limiting bits are cached state about
a destination. So move it into the inet_peer entries.

If an inet_peer cannot be bound (the reason is memory allocation
failure or similar), the policy is to allow.

Signed-off-by: David S. Miller <davem@davemloft.net>
defb3519a64141608725e2dac5a5aa9a3c644bae 09-Dec-2010 David S. Miller <davem@davemloft.net> net: Abstract away all dst_entry metrics accesses.

Use helper functions to hide all direct accesses, especially writes,
to dst_entry metrics values.

This will allow us to:

1) More easily change how the metrics are stored.

2) Implement COW for metrics.

In particular this will help us put metrics into the inetpeer
cache if that is what we end up doing. We can make the _metrics
member a pointer instead of an array, initially have it point
at the read-only metrics in the FIB, and then on the first set
grab an inetpeer entry and point the _metrics member there.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
b672083ed36a49c323737b7c7e1d5264a7c193af 01-Dec-2010 Shan Wei <shanwei@cn.fujitsu.com> ipv6: use ND_REACHABLE_TIME and ND_RETRANS_TIMER instead of magic number

ND_REACHABLE_TIME and ND_RETRANS_TIMER have defined
since v2.6.12-rc2, but never been used.
So use them instead of magic number.

This patch also changes original code style to read comfortably .

Thank YOSHIFUJI Hideaki for pointing it out.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d6bf781712a1d25cc8987036b3a48535b331eb91 04-Oct-2010 Eric Dumazet <eric.dumazet@gmail.com> net neigh: RCU conversion of neigh hash table

David

This is the first step for RCU conversion of neigh code.

Next patches will convert hash_buckets[] and "struct neighbour" to RCU
protected objects.

Thanks

[PATCH net-next] net neigh: RCU conversion of neigh hash table

Instead of storing hash_buckets, hash_mask and hash_rnd in "struct
neigh_table", a new structure is defined :

struct neigh_hash_table {
struct neighbour **hash_buckets;
unsigned int hash_mask;
__u32 hash_rnd;
struct rcu_head rcu;
};

And "struct neigh_table" has an RCU protected pointer to such a
neigh_hash_table.

This means the signature of (*hash)() function changed: We need to add a
third parameter with the actual hash_rnd value, since this is not
anymore a neigh_table field.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a02cec2155fbea457eca8881870fd2de1a4c4c76 22-Sep-2010 Eric Dumazet <eric.dumazet@gmail.com> net: return operator cleanup

Change "return (EXPR);" to "return EXPR;"

return is not a function, parentheses are not required.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
65e9b62d4503849b10bedfc29bff0473760cc597 03-Sep-2010 Thomas Graf <tgraf@infradead.org> ipv6: add special mode accept_ra=2 to accept RA while configured as router

The current IPv6 behavior is to not accept router advertisements while
forwarding, i.e. configured as router.

This does make sense, a router is typically not supposed to be auto
configured. However there are exceptions and we should allow the
current behavior to be overwritten.

Therefore this patch enables the user to overrule the "if forwarding
enabled then don't listen to RAs" rule by setting accept_ra to the
special value of 2.

An alternative would be to ignore the forwarding switch alltogether
and solely accept RAs based on the value of accept_ra. However, I
found that if not intended, accepting RAs as a router can lead to
strange unwanted behavior therefore we it seems wise to only do so
if the user explicitely asks for this behavior.

Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
9f888160bdcccf0565dd2774956b8d9456e610be 21-Jun-2010 stephen hemminger <shemminger@vyatta.com> ipv6: fix NULL reference in proxy neighbor discovery

The addition of TLLAO option created a kernel OOPS regression
for the case where neighbor advertisement is being sent via
proxy path. When using proxy, ipv6_get_ifaddr() returns NULL
causing the NULL dereference.

Change causing the bug was:
commit f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50
Author: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri Oct 2 11:39:15 2009 +0000

make TLLAO option for NA packets configurable

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
d8d1f30b95a635dbd610dcc5eb641aca8f4768cf 11-Jun-2010 Changli Gao <xiaosuo@gmail.com> net-next: remove useless union keyword

remove useless union keyword in rtable, rt6_info and dn_route.

Since there is only one member in a union, the union keyword isn't useful.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3fa21e07e6acefa31f974d57fba2b6920a7ebd1a 18-May-2010 Joe Perches <joe@perches.com> net: Remove unnecessary returns from void function()s

This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
b2e0b385d77069031edb957839aaaa8441b47287 23-Mar-2010 Jan Engelhardt <jengelh@medozas.de> netfilter: ipv6: use NFPROTO values for NF_HOOK invocation

The semantic patch that was used:
// <smpl>
@@
@@
(NF_HOOK
|NF_HOOK_THRESH
|nf_hook
)(
-PF_INET6,
+NFPROTO_IPV6,
...)
// </smpl>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
54716e3beb0ab20c49471348dfe399a71bfc8fd3 14-Feb-2010 Eric W. Biederman <ebiederm@xmission.com> net neigh: Decouple per interface neighbour table controls from binary sysctls

Stop computing the number of neighbour table settings we have by
counting the number of binary sysctls. This behaviour was silly
and meant that we could not add another neighbour table setting
without also adding another binary sysctl.

Don't pass the binary sysctl path for neighour table entries
into neigh_sysctl_register. These parameters are no longer
used and so are just dead code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2c8c1e7297e19bdef3c178c3ea41d898a7716e3e 17-Jan-2010 Alexey Dobriyan <adobriyan@gmail.com> net: spread __net_init, __net_exit

__net_init/__net_exit are apparently not going away, so use them
to full extent.

In some cases __net_init was removed, because it was called from
__net_exit code.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f8572d8f2a2ba75408b97dc24ef47c83671795d7 05-Nov-2009 Eric W. Biederman <ebiederm@xmission.com> sysctl net: Remove unused binary sysctl code

Now that sys_sysctl is a compatiblity wrapper around /proc/sys
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables are unused, and can be
revmoed.

In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and it's callers have been modified not
to pass one.

Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50 02-Oct-2009 Octavian Purdila <opurdila@ixiacom.com> make TLLAO option for NA packets configurable

On Friday 02 October 2009 20:53:51 you wrote:

> This is good although I would have shortened the name.

Ah, I knew I forgot something :) Here is v4.

tavi

>From 24d96d825b9fa832b22878cc6c990d5711968734 Mon Sep 17 00:00:00 2001
From: Octavian Purdila <opurdila@ixiacom.com>
Date: Fri, 2 Oct 2009 00:51:15 +0300
Subject: [PATCH] ipv6: new sysctl for sending TLLAO with unicast NAs

Neighbor advertisements responding to unicast neighbor solicitations
did not include the target link-layer address option. This patch adds
a new sysctl option (disabled by default) which controls whether this
option should be sent even with unicast NAs.

The need for this arose because certain routers expect the TLLAO in
some situations even as a response to unicast NS packets.

Moreover, RFC 2461 recommends sending this to avoid a race condition
(section 4.4, Target link-layer address)

Signed-off-by: Cosmin Ratiu <cratiu@ixiacom.com>
Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d1f8297a96b0d70f17704296a6666468f2087ce6 27-Sep-2009 Sascha Hlusiak <contact@saschahlusiak.de> Revert "sit: stateless autoconf for isatap"

This reverts commit 645069299a1c7358cf7330afe293f07552f11a5d.

While the code does not actually break anything, it does not completely follow
RFC5214 yet. After talking back with Fred L. Templin, I agree that completing the
ISATAP specific RS/RA code, would pollute the kernel a lot with code that is better
implemented in userspace.

The kernel should not send RS packages for ISATAP at all.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Acked-by: Fred L. Templin <Fred.L.Templin@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8d65af789f3e2cf4cfbdbf71a0f7a61ebcd41d38 24-Sep-2009 Alexey Dobriyan <adobriyan@gmail.com> sysctl: remove "struct file *" argument of ->proc_handler

It's unused.

It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.

It _was_ used in two places at arch/frv for some reason.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
89d69d2b75a8f7e258f4b634cd985374cfd3202e 01-Sep-2009 Stephen Hemminger <shemminger@vyatta.com> net: make neigh_ops constant

These tables are never modified at runtime. Move to read-only
section.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
31ce8c71a3bdab12debb5899b1f6dac13e54c71d 29-Aug-2009 David Ward <david.ward@ll.mit.edu> ipv6: Update Neighbor Cache when IPv6 RA is received on a router

When processing a received IPv6 Router Advertisement, the kernel
creates or updates an IPv6 Neighbor Cache entry for the sender --
but presently this does not occur if IPv6 forwarding is enabled
(net.ipv6.conf.*.forwarding = 1), or if IPv6 Router Advertisements
are not accepted (net.ipv6.conf.*.accept_ra = 0), because in these
cases processing of the Router Advertisement has already halted.

This patch allows the Neighbor Cache to be updated in these cases,
while still avoiding any modification to routes or link parameters.

This continues to satisfy RFC 4861, since any entry created in the
Neighbor Cache as the result of a received Router Advertisement is
still placed in the STALE state.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
a6fa32866567351503db8a5c3466a676ba08595f 13-Aug-2009 Jens Rosenboom <jens@mcbone.net> ipv6: Log the explicit address that triggered DAD failure

If an interface has multiple addresses, the current message for DAD
failure isn't really helpful, so this patch adds the address itself to
the printk.

Signed-off-by: Jens Rosenboom <jens@mcbone.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
adf30907d63893e4208dfe3f5c88ae12bc2f25d5 02-Jun-2009 Eric Dumazet <eric.dumazet@gmail.com> net: skb->dst accessors

Define three accessors to get/set dst attached to a skb

struct dst_entry *skb_dst(const struct sk_buff *skb)

void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)

void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;

Delete skb->dst field

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dae9de8e13a8a5154688e4c788e65399b4718707 02-Jun-2009 Brian Haley <brian.haley@hp.com> IPv6: Print error value when skb allocation fails

Print-out the error value when sock_alloc_send_skb() fails in
the IPv6 neighbor discovery code - can be useful for debugging.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
645069299a1c7358cf7330afe293f07552f11a5d 19-May-2009 Sascha Hlusiak <contact@saschahlusiak.de> sit: stateless autoconf for isatap

be sent periodically. The rs_delay can be speficied when adding the
PRL entry and defaults to 15 minutes.

The RS is sent from every link local adress that's assigned to the
tunnel interface. It's directed to the (guessed) linklocal address
of the router and is sent through the tunnel.

Better: send to ff02::2 encapsuled in unicast directed to router-v4.

Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
edf391ff17232f097d72441c9ad467bcb3b5db18 27-Apr-2009 Neil Horman <nhorman@tuxdriver.com> snmp: add missing counters for RFC 4293

The IP MIB (RFC 4293) defines stats for InOctets, OutOctets, InMcastOctets and
OutMcastOctets:
http://tools.ietf.org/html/rfc4293
But it seems we don't track those in any way that easy to separate from other
protocols. This patch adds those missing counters to the stats file. Tested
successfully by me

With help from Eric Dumazet.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1ce85fe402137824246bad03ff85f3913d565c17 25-Feb-2009 Pablo Neira Ayuso <pablo@netfilter.org> netlink: change nlmsg_notify() return value logic

This patch changes the return value of nlmsg_notify() as follows:

If NETLINK_BROADCAST_ERROR is set by any of the listeners and
an error in the delivery happened, return the broadcast error;
else if there are no listeners apart from the socket that
requested a change with the echo flag, return the result of the
unicast notification. Thus, with this patch, the unicast
notification is handled in the same way of a broadcast listener
that has set the NETLINK_BROADCAST_ERROR socket flag.

This patch is useful in case that the caller of nlmsg_notify()
wants to know the result of the delivery of a netlink notification
(including the broadcast delivery) and take any action in case
that the delivery failed. For example, ctnetlink can drop packets
if the event delivery failed to provide reliable logging and
state-synchronization at the cost of dropping packets.

This patch also modifies the rtnetlink code to ignore the return
value of rtnl_notify() in all callers. The function rtnl_notify()
(before this patch) returned the error of the unicast notification
which makes rtnl_set_sk_err() reports errors to all listeners. This
is not of any help since the origin of the change (the socket that
requested the echoing) notices the ENOBUFS error if the notification
fails and should resync itself.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
d73f08011bc30c03a2bcb1ccd880e4be84aea269 07-Feb-2009 Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> ipv6/ndisc: join error paths

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
24fc7b86dc0470616803be2f921c8cd5c459175d 10-Dec-2008 Jan Sembera <jsembera@suse.cz> ipv6: silence log messages for locally generated multicast

This patch fixes minor annoyance during transmission of unsolicited
neighbor advertisements from userspace to multicast addresses (as
far as I can see in RFC, this is allowed and the similar functionality
for IPv4 has been in arping for a long time).

Outgoing multicast packets get reinserted into local processing as if they
are received from the network. The machine thus sees its own NA and fills
the logs with error messages. This patch removes the message if NA has been
generated locally.

Signed-off-by: Jan Sembera <jsembera@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
52479b623d3d41df84c499325b6a8c7915413032 26-Nov-2008 Alexey Dobriyan <adobriyan@gmail.com> netns xfrm: lookup in netns

Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns
to flow_cache_lookup() and resolver callback.

Take it from socket or netdevice. Stub DECnet to init_net.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
07f0757a6808f2f36a0e58c3a54867ccffdb8dc9 20-Nov-2008 Joe Perches <joe@perches.com> include/net net/ - csum_partial - remove unnecessary casts

The first argument to csum_partial is const void *
casts to char/u8 * are not necessary

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
305d552accae6afb859c493ebc7d98ca3371dae2 05-Nov-2008 Brian Haley <brian.haley@hp.com> bonding: send IPv6 neighbor advertisement on failover

This patch adds better IPv6 failover support for bonding devices,
especially when in active-backup mode and there are only IPv6 addresses
configured, as reported by Alex Sidorenko.

- Creates a new file, net/drivers/bonding/bond_ipv6.c, for the
IPv6-specific routines. Both regular bonds and VLANs over bonds
are supported.

- Adds a new tunable, num_unsol_na, to limit the number of unsolicited
IPv6 Neighbor Advertisements that are sent on a failover event.
Default is 1.

- Creates two new IPv6 neighbor discovery functions:

ndisc_build_skb()
ndisc_send_skb()

These were required to support VLANs since we have to be able to
add the VLAN id to the skb since ndisc_send_na() and friends
shouldn't be asked to do this. These two routines are basically
__ndisc_send() split into two pieces, in a slightly different order.

- Updates Documentation/networking/bonding.txt and bumps the rev of bond
support to 3.4.0.

On failover, this new code will generate one packet:

- An unsolicited IPv6 Neighbor Advertisement, which helps the switch
learn that the address has moved to the new slave.

Testing has shown that sending just the NA results in pretty good
behavior when in active-back mode, I saw no lost ping packets for example.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
5b095d98928fdb9e3b75be20a54b7a6cbf6ca9ad 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> net: replace %p6 with %pI6

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0c6ce78abf6e228d44c3840edb8a4ae0c1299825 29-Oct-2008 Harvey Harrison <harvey.harrison@gmail.com> net: replace uses of NIP6_FMT with %p6

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f221e726bf4e082a05dcd573379ac859bfba7126 16-Oct-2008 Alexey Dobriyan <adobriyan@gmail.com> sysctl: simplify ->strategy

name and nlen parameters passed to ->strategy hook are unused, remove
them. In general ->strategy hook should know what it's doing, and don't
do something tricky for which, say, pointer to original userspace array
may be needed (name).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net> [ networking bits ]
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
22441cfa0c70dcd457f3c081fcf285c3bd155824 16-Oct-2008 Pedro Ribeiro <pribeiro@net.ipl.pt> IPV6: Fix default gateway criteria wrt. HIGH/LOW preference radv option

Problem observed:
In IPv6, in the presence of multiple routers candidates to
default gateway in one segment, each sending a different
value of preference, the Linux hosts connected to the
segment weren't selecting the right one in all the
combinations possible of LOW/MEDIUM/HIGH preference.

This patch changes two files:
include/linux/icmpv6.h
Get the "router_pref" bitfield in the right place
(as RFC4191 says), named the bit left with this fix as
"home_agent" (RFC3775 say that's his function)

net/ipv6/ndisc.c
Corrects the binary logic behind the updating of the
router preference in the flags of the routing table

Result:
With this two fixes applied, the default route used by
the system was to consistent with the rules mentioned
in RFC4191 in case of changes in the value of preference
in router advertisements

Signed-off-by: Pedro Ribeiro <pribeiro@net.ipl.pt>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5c5d244bd388fe498dd7f5f57cb7770aae40b9ab 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to ICMP6MSGOUT_INC_STATS

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
a862f6a6dc89c57dd3a959a1636b59f0c27169c2 08-Oct-2008 Denis V. Lunev <den@openvz.org> ipv6: added net argument to ICMP6_INC_STATS

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3bd653c8455bc7991bae77968702b31c8f5df883 08-Oct-2008 Denis V. Lunev <den@openvz.org> netns: add net parameter to IP6_INC_STATS

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
191cd582500f49b32a63040fedeebb0168c720af 15-Aug-2008 Brian Haley <brian.haley@hp.com> netns: Add network namespace argument to rt6_fill_node() and ipv6_dev_get_saddr()

ipv6_dev_get_saddr() blindly de-references dst_dev to get the network
namespace, but some callers might pass NULL. Change callers to pass a
namespace pointer instead.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
53b7997fd5c62408d10b9aafb38974ce90fd2356 20-Jul-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> ipv6 netns: Make several "global" sysctl variables namespace aware.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0686caa35ed17cf5b9043f453957e702a7eb588d 20-May-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> ndisc: Add missing strategies for per-device retrans timer/reachable time settings.

Noticed from Al Viro <viro@ftp.linux.org.uk> via David Miller
<davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f5184d267c1aedb9b7a8cc44e08ff6b8d382c3b5 13-May-2008 Johannes Berg <johannes@sipsolutions.net> net: Allow netdevices to specify needed head/tailroom

This patch adds needed_headroom/needed_tailroom members to struct
net_device and updates many places that allocate sbks to use them. Not
all of them can be converted though, and I'm sure I missed some (I
mostly grepped for LL_RESERVED_SPACE)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
f3ee4010e84452aa133e5163e6cfabc52b194e94 10-Apr-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Define constants for link-local multicast addresses.

- Define link-local all-node / all-router multicast addresses.
- Remove ipv6_addr_all_nodes() and ipv6_addr_all_routers().

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
9acd9f3ae92d0dc0ca7504fb48c1040e8bbc39fe 10-Apr-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Make address arguments const.

- net/ipv6/addrconf.c:
ipv6_get_ifaddr(), ipv6_dev_get_saddr()
- net/ipv6/mcast.c:
ipv6_sock_mc_join(), ipv6_sock_mc_drop(),
inet6_mc_check(),
ipv6_dev_mc_inc(), __ipv6_dev_mc_dec(), ipv6_dev_mc_dec(),
ipv6_chk_mcast_addr()
- net/ipv6/route.c:
rt6_lookup(), icmp6_dst_alloc()
- net/ipv6/ip6_output.c:
ip6_nd_hdr()
- net/ipv6/ndisc.c:
ndisc_send_ns(), ndisc_send_rs(), ndisc_send_redirect(),
ndisc_get_neigh(), __ndisc_send()

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
1ed8516f09e510e4595bc900ad9266c15aacfdd2 03-Apr-2008 Denis V. Lunev <den@openvz.org> [IPV6]: Simplify IPv6 control sockets creation.

Do this by replacing sock_create_kern with inet_ctl_sock_create.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
de357cc01334a468e4d5b7ba66a17b0d3ca9d63e 16-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Don't rely on node-type hint from L2 unless required.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
6294e000736401d4415ad41f408e56e14aaaf7b4 16-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Ignore route information with /0 prefix from interior router.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
fadf6bf06069138f8e97c9a963be38348ba2708b 11-Mar-2008 Templin, Fred L <Fred.L.Templin@boeing.com> [IPV6] SIT: Add PRL management for ISATAP.

This patch updates the Linux the Intra-Site Automatic Tunnel Addressing
Protocol (ISATAP) implementation. It places the ISATAP potential router
list (PRL) in the kernel and adds three new private ioctls for PRL
management.

[Add several changes of structure name, constant names etc. - yoshfuji]

Signed-off-by: Fred L. Templin <fred.l.templin@boeing.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
0736ffc04eec239cce9fd3c6ae5dce54e14c25c7 28-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NEIGH: Optimize is_router check.

Our interest is not the whole entry of proxy neighbor but the
NTF_ROUTER flag. Let's test it explicitly.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
6ab57e7e7fa316552d0f94eaebf1def1d49f18da 27-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] anycast - handle several network namespace

Make use of the network namespace information to have this protocol to
handle several network namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c346dca10840a874240c78efe3f39acf4312a1f2 25-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] NETNS: Omit net_device->nd_net without CONFIG_NET_NS.

Introduce per-net_device inlines: dev_net(), dev_net_set().
Without CONFIG_NET_NS, no namespace other than &init_net exists.
Let's explicitly define them to help compiler optimizations.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
7cbca67c073263c179f605bdbbdc565ab29d801d 25-Mar-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Support Source Address Selection API (RFC5014).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
fa86d322d89995fef1bfb5cc768b89d8c22ea0d9 24-Mar-2008 Pavel Emelyanov <xemul@openvz.org> [NEIGH]: Fix race between pneigh deletion and ipv6's ndisc_recv_ns (v3).

Proxy neighbors do not have any reference counting, so any caller
of pneigh_lookup (unless it's a netlink triggered add/del routine)
should _not_ perform any actions on the found proxy entry.

There's one exception from this rule - the ipv6's ndisc_recv_ns()
uses found entry to check the flags for NTF_ROUTER.

This creates a race between the ndisc and pneigh_delete - after
the pneigh is returned to the caller, the nd_tbl.lock is dropped
and the deleting procedure may proceed.

One of the fixes would be to add a reference counting, but this
problem exists for ndisc only. Besides such a patch would be too
big for -rc4.

So I propose to introduce a __pneigh_lookup() which is supposed
to be called with the lock held and use it in ndisc code to check
the flags on alive pneigh entry.


Changes from v2:
As David noticed, Exported the __pneigh_lookup() to ipv6 module.
The checkpatch generates a warning on it, since the EXPORT_SYMBOL
does not follow the symbol itself, but in this file all the
exports come at the end, so I decided no to break this harmony.

Changes from v1:
Fixed comments from YOSHIFUJI - indentation of prototype in header
and the pndisc_check_router() name - and a compilation fix, pointed
by Daniel - the is_routed was (falsely) considered as uninitialized
by gcc.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
061964fb988ca51087948975da66ff523b3a5852 24-Mar-2008 Rami Rosen <ramirose@gmail.com> [IPV6]: Remove unused code in ndisc_send_redirect().

This patches removes unused code in ndisc_send_redirect() method in
net/ipv6/ndisc.c.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
421f099bc555c5f1516fdf5060de1d6bb5f51002 23-Mar-2008 Julia Lawall <julia@diku.dk> [IPV6] net/ipv6/ndisc.c: remove unused variable

The variable hlen is initialized but never used otherwise.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
identifier i;
constant C;
@@

(
extern T i;
|
- T i;
<+... when != i
- i = C;
...+>
)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
1762f7e88eb34f653b4a915be99a102e347dd45e 07-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] ndisc - make socket control per namespace

Make ndisc socket control per namespace.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
a18bc6959d9793c8352d2177b456d93868953874 07-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] ndisc - make ndisc handle multiple network namespaces

Make ndisc handle multiple network namespaces:
Remove references to init_net, add network namespace parameters and add
pernet_operations for ndisc

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
0dc47877a3de00ceadea0005189656ae8dc52669 06-Mar-2008 Harvey Harrison <harvey.harrison@gmail.com> net: replace remaining __FUNCTION__ occurrences

__FUNCTION__ is gcc-specific, use __func__

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4591db4f37618f37a9f1f25d291c3c7a43a15a21 05-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] route6 - add netns parameter to ip6_route_output

Add an netns parameter to ip6_route_output. That will allow to access
to the right routing table for outgoing traffic.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5b7c931dff03621ae7ac524c4fa280d4e5f187a4 04-Mar-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6] ip6_fib - add net to gc timer parameter

The fib tables are now relative to the network namespace. When the
garbage collector timer expires, we must have a network namespace
parameter in order to retrieve the tables. For now this is the
init_net, but we should be able to have a timer per namespace and use
the timer callback parameter to pass the network namespace from the
expired timer.

The timer callback, fib6_run_gc, is actually used to be called
synchronously by some functions and asynchronously when the timer
expires.

When the timer expires, the delay specified for fib6_run_gc parameter
is always zero. So, I changed fib6_run_gc to not be a timer callback
but a function called by the timer callback and I added a timer
callback where its work is just to retrieve from the data arg of the
timer the network namespace and call fib6_run_gc with zero expiring
time and the network namespace parameters. That makes the code cleaner
for the fib6_run_gc callers.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3b00944c5c73c49ef52bf17b66557c43c1d945fe 07-Dec-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Make ndisc_dst_alloc() common for later use.

For later use, this patch is renaming ndisc_dst_alloc()
(and related function/structures) to icmp6_dst_alloc()
(and so on). This patch also removing unused function-
pointer argument for it.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
95e41e93e18d8e1e272ce23d96bae4f17ce11d42 07-Dec-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Make ndisc_flow_init() common for later use.

For later use, this patch is renaming ndisc_flow_init() to
icmpv6_flow_init() and putting it in common place.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
9b0f976f27f00a81cf47643d90854659626795b4 29-Feb-2008 Denis V. Lunev <den@openvz.org> [INET]: Remove struct net_proto_family* from _init calls.

struct net_proto_family* is not used in icmp[v6]_init, ndisc_init,
igmp_init and tcp_v4_init. Remove it.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
61cf46ad581ba43073d3bcb0be549eb60fbbf9f8 22-Jan-2008 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Sparse: Use different variable name for local use.

Fix the following sparse warnings:
| net/ipv6/ndisc.c:1300:21: warning: symbol 'opt' shadows an earlier one
| net/ipv6/ndisc.c:1078:7: originally declared here

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
1cab3da6be6c7659f62d0d297b389cc0e48b2178 11-Jan-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6]: inet6_addr - ipv6_get_ifaddr namespace aware

The inet6_addr_lst is browsed taking into account the network
namespace specified as parameter. If an address does not belong
to the specified namespace, it is ignored.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
bfeade087005278fc8cafe230b7658a4f40c5acb 11-Jan-2008 Daniel Lezcano <dlezcano@fr.ibm.com> [NETNS][IPV6]: inet6_addr - check ipv6 address per namespace

When a new address is added, we must check if the new address does not
already exists. This patch makes this check to be aware of a network
namespace, so the check will look if the address already exists for
the specified network namespace. While the addresses are browsed, the
addresses which do not belong to the namespace are discarded.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
426b5303eb435d98b9bee37a807be386bc2b3320 24-Jan-2008 Eric W. Biederman <ebiederm@xmission.com> [NETNS]: Modify the neighbour table code so it handles multiple network namespaces

I'm actually surprised at how much was involved. At first glance it
appears that the neighbour table data structures are already split by
network device so all that should be needed is to modify the user
interface commands to filter the set of neighbours by the network
namespace of their devices.

However a couple things turned up while I was reading through the
code. The proxy neighbour table allows entries with no network
device, and the neighbour parms are per network device (except for the
defaults) so they now need a per network namespace default.

So I updated the two structures (which surprised me) with their very
own network namespace parameter. Updated the relevant lookup and
destroy routines with a network namespace parameter and modified the
code that interacts with users to filter out neighbour table entries
for devices of other namespaces.

I'm a little concerned that we can modify and display the global table
configuration and from all network namespaces. But this appears good
enough for now.

I keep thinking modifying the neighbour table to have per network
namespace instances of each table type would should be cleaner. The
hash table is already dynamically sized so there are it is not a
limiter. The default parameter would be straight forward to take care
of. However when I look at the how the network table is built and
used I still find some assumptions that there is only a single
neighbour table for each type of table in the kernel. The netlink
operations, neigh_seq_start, the non-core network users that call
neigh_lookup. So while it might be doable it would require more
refactoring than my current approach of just doing a little extra
filtering in the code.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
97c53cacf00d1f5aa04adabfebcc806ca8b22b10 20-Nov-2007 Denis V. Lunev <den@openvz.org> [NET]: Make rtnetlink infrastructure network namespace aware (v3)

After this patch none of the netlink callback support anything
except the initial network namespace but the rtnetlink infrastructure
now handles multiple network namespaces.

Changes from v2:
- IPv6 addrlabel processing

Changes from v1:
- no need for special rtnl_unlock handling
- fixed IPv6 ndisc

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6e23ae2a48750bda407a4a58f52a4865d7308bf5 20-Nov-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: Introduce NF_INET_ hook values

The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
a9e527e3f9f4510e9f3450ca3bc51bc3ef2854fd 10-Dec-2007 Rolf Manderscheid <rvm@obsidianresearch.com> IPoIB: improve IPv4/IPv6 to IB mcast mapping functions

An IPoIB subnet on an IB fabric that spans multiple IB subnets can't
use link-local scope in multicast GIDs. The existing routines that
map IP/IPv6 multicast addresses into IB link-level addresses hard-code
the scope to link-local, and they also leave the partition key field
uninitialised. This patch adds a parameter (the link-level broadcast
address) to the mapping routines, allowing them to initialise both the
scope and the P_Key appropriately, and fixes up the call sites.

The next step will be to add a way to configure the scope for an IPoIB
interface.

Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
bea851954717ebb0dee557a951e28bb277e1cc1d 20-Dec-2007 Joe Perches <joe@perches.com> [IPV6]: Spelling fixes

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dbb2ed24851a290616d66212dc75373fd863d636 13-Nov-2007 Pierre Ynard <linkfanel@yahoo.fr> [IPV6]: Add ifindex field to ND user option messages.

Userland neighbor discovery options are typically heavily involved with
the interface on which thay are received: add a missing ifindex field to
the original struct. Thanks to R�mi Denis-Courmont.

Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
ad02ac145d49067a94bf8f3357c527020d5893ed 29-Oct-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Fix setting base_reachable_time_ms variable.

This bug was introduced by the commit
d12af679bcf8995a237560bdf7a4d734f8df5dbb (sysctl: fix neighbour table
sysctls).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
d12af679bcf8995a237560bdf7a4d734f8df5dbb 18-Oct-2007 Eric W. Biederman <ebiederm@xmission.com> sysctl: fix neighbour table sysctls.

- In ipv6 ndisc_ifinfo_syctl_change so it doesn't depend on binary
sysctl names for a function that works with proc.

- In neighbour.c reorder the table to put the possibly unused entries
at the end so we can remove them by terminating the table early.

- In neighbour.c kill the entries with questionable binary sysctl
handling behavior.

- In neighbour.c if we don't have a strategy routine remove the
binary path. So we don't the default sysctl strategy routine
on data that is not ready for it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexey Dobriyan <adobriyan@sw.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
31910575a9de61e78065e93846e8e7a4894a18bf 11-Oct-2007 Pierre Ynard <linkfanel@yahoo.fr> [IPv6]: Export userland ND options through netlink (RDNSS support)

As discussed before, this patch provides userland with a way to access
relevant options in Router Advertisements, after they are processed
and validated by the kernel. Extra options are processed in a generic
way; this patch only exports RDNSS options described in RFC5006, but
support to control which options are exported could be easily added.

A new rtnetlink message type is defined, to transport Neighbor
Discovery options, along with optional context information. At the
moment only the address of the router sending an RDNSS option is
included, but additional attributes may be later defined, if needed by
new use cases.

Signed-off-by: Pierre Ynard <linkfanel@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
cfcabdcc2d5a810208e5bb3974121b7ed60119aa 09-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: sparse warning fixes

Fix a bunch of sparse warnings. Mostly about 0 used as
NULL pointer, and shadowed variable declarations.
One notable case was that hash size should have been unsigned.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3b04ddde02cf1b6f14f2697da5c20eca5715017f 09-Oct-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: Move hardware header operations out of netdevice.

Since hardware header operations are part of the protocol class
not the device instance, make them into a separate object and
save memory.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14878f75abd5bf1d38becb405801cd491ee215dc 17-Sep-2007 David L Stevens <dlstevens@us.ibm.com> [IPV6]: Add ICMPMsgStats MIB (RFC 4293) [rev 2]

Background: RFC 4293 deprecates existing individual, named ICMP
type counters to be replaced with the ICMPMsgStatsTable. This table
includes entries for both IPv4 and IPv6, and requires counting of all
ICMP types, whether or not the machine implements the type.

These patches "remove" (but not really) the existing counters, and
replace them with the ICMPMsgStats tables for v4 and v6.
It includes the named counters in the /proc places they were, but gets the
values for them from the new tables. It also counts packets generated
from raw socket output (e.g., OutEchoes, MLD queries, RA's from
radvd, etc).

Changes:
1) create icmpmsg_statistics mib
2) create icmpv6msg_statistics mib
3) modify existing counters to use these
4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types
listed by number for easy SNMP parsing
5) modify /proc/net/snmp printing for "Icmp" to get the named data
from new counters.
[new to 2nd revision]
6) support per-interface ICMP stats
7) use common macro for per-device stat macros

Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e9dc86534051b78e41e5b746cccc291b57a3a311 12-Sep-2007 Eric W. Biederman <ebiederm@xmission.com> [NET]: Make device event notification network namespace safe

Every user of the network device notifiers is either a protocol
stack or a pseudo device. If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.

To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.

As the rest of the code is made network namespace aware these
checks can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bf0b48dfc368c07c42b5a3a5658c8ee81b4283ac 08-Oct-2007 Brian Haley <brian.haley@hp.com> [IPv6]: Fix ICMPv6 redirect handling with target multicast address

When the ICMPv6 Target address is multicast, Linux processes the
redirect instead of dropping it. The problem is in this code in
ndisc_redirect_rcv():

if (ipv6_addr_equal(dest, target)) {
on_link = 1;
} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
ND_PRINTK2(KERN_WARNING
"ICMPv6 Redirect: target address is not
link-local.\n");
return;
}

This second check will succeed if the Target address is, for example,
FF02::1 because it has link-local scope. Instead, it should be checking
if it's a unicast link-local address, as stated in RFC 2461/4861 Section
8.1:

- The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's
implied.

This bug is preventing Linux kernels from achieving IPv6 Logo Phase II
certification because of a recent error that was found in the TAHI test
suite - Neighbor Disovery suite test 206 (v6LC.2.3.6_G) had the
multicast address in the Destination field instead of Target field, so
we were passing the test. This won't be the case anymore.

The patch below fixes this problem, and also fixes ndisc_send_redirect()
to not send an invalid redirect with a multicast address in the Target
field. I re-ran the TAHI Neighbor Discovery section to make sure Linux
passes all 245 tests now.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9e3be4b34364a670bd6e57d2e8c3caabdd8d89f8 11-Sep-2007 Denis V. Lunev <den@openvz.org> [IPV6]: Freeing alive inet6 address

From: Denis V. Lunev <den@openvz.org>

addrconf_dad_failure calls addrconf_dad_stop which takes referenced address
and drops the count. So, in6_ifa_put perrformed at out: is extra. This
results in message: "Freeing alive inet6 address" and not released dst entries.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Alexey Dobriyan <adobriyan@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6d5b78cdd5a17665674429400b3ed10e3ec60684 23-Jun-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Fix thinko to control Router Preference support.

Bug reported by Haruhito Watanabe <haruhito@sfc.keio.ac.jp>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
e1ec7842df5db897516d73c76bd2a568b4abc33b 24-Apr-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Unify main process of sending ND messages.

Because ndisc_send_na(), ndisc_send_ns() and ndisc_send_rs()
are almost identical, so let's unify their common part.

With gcc (GCC) 3.3.5 (Debian 1:3.3.5-13) on i386,
Before:
text data bss dec hex filename
14689 364 24 15077 3ae5 net/ipv6/ndisc.o
After:
text data bss dec hex filename
12317 364 24 12705 31a1 net/ipv6/ndisc.o

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
3ff50b7997fe06cd5d276b229967bb52d6b3b6c1 21-Apr-2007 Stephen Hemminger <shemminger@linux-foundation.org> [NET]: cleanup extra semicolons

Spring cleaning time...

There seems to be a lot of places in the network code that have
extra bogus semicolons after conditionals. Most commonly is a
bogus semicolon after: switch() { }

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
27a884dc3cb63b93c2b3b643f5b31eed5f8a4d26 20-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Convert skb->tail to sk_buff_data_t

So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d10ba34b001944a8d1c8adb5646140ef089c432b 15-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: More skb_put related skb_reset_transport_header

This time we have to set it to skb->tail that is not anymore equal to
skb->data, so we either add a new helper or just add the skb->tail - skb->data
offset, for now do the later.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9c70220b73908f64792422a2c39c593c4792f2c5 26-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_transport_header(skb)

For the places where we need a pointer to the transport header, it is
still legal to touch skb->h.raw directly if just adding to,
subtracting from or setting it to another layer header.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc70ab261c9f997589546100ddec5da6bfd89c4e 13-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [ICMP6]: Introduce icmp6_hdr()

For consistency with all the other skb->h.raw accessors.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0660e03f6b18f19b6bbafe7583265a51b90daf36 26-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce ipv6_hdr(), remove skb->nh.ipv6h

Now the skb->nh union has just one member, .raw, i.e. it is just like the
skb->mac union, strange, no? I'm just leaving it like that till the transport
layer is done with, when we'll rename skb->mac.raw to skb->mac_header (or
->mac_header_offset?), ditto for ->{h,nh}.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
98e399f82ab3a6d863d1d4a7ea48925cc91c830e 19-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Introduce skb_mac_header()

For the places where we need a pointer to the mac header, it is still legal to
touch skb->mac.raw directly if just adding to, subtracting from or setting it
to another layer header.

This one also converts some more cases to skb_reset_mac_header() that my
regex missed as it had no spaces before nor after '=', ugh.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ca043569390c528de4cd5ec9e07502f2bf4ecd1f 28-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] ADDRCONF: Fix possible inet6_ifaddr leakage with CONFIG_OPTIMISTIC_DAD.

The inet6_ifaddr for source address of RS is leaked if the address
is not an optimistic address.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
95c385b4d5a71b8ad552aecaa968ea46d7da2f6a 26-Apr-2007 Neil Horman <nhorman@tuxdriver.com> [IPV6] ADDRCONF: Optimistic Duplicate Address Detection (RFC 4429) Support.

Nominally an autoconfigured IPv6 address is added to an interface in the
Tentative state (as per RFC 2462). Addresses in this state remain in this
state while the Duplicate Address Detection process operates on them to
determine their uniqueness on the network. During this period, these
tentative addresses may not be used for communication, increasing the time
before a node may be able to communicate on a network. Using Optimistic
Duplicate Address Detection, autoconfigured addresses may be used
immediately for communication on the network, as long as certain rules are
followed to avoid conflicts with other nodes during the Duplicate Address
Detection process.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7159039a128fa0a73ca7b532f6e1d30d9885277f 22-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Decentralize EXPORT_SYMBOLs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
1ab1457c42bc078e5a9becd82a7f9f940b55c53a 09-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NET] IPV6: Fix whitespace errors.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
29556526b970c2e7d4ca808b6082c33981adfdff 30-Jan-2007 Li Yewang <lyw@nanjing-fnst.com> [IPV6]: fix BUG of ndisc_send_redirect()

When I tested IPv6 redirect function about kernel 2.6.19.1, and found
that the kernel can send redirect packets whose target address is global
address, and the target is not the actual endpoint of communication.

But the criteria conform to RFC2461, the target address defines as
following:

Target Address An IP address that is a better first hop to use for
he ICMP Destination Address. When the target is
the actual endpoint of communication, i.e., the
destination is a neighbor, the Target Address field
MUST contain the same value as the ICMP Destination
Address field. Otherwise the target is a better
first-hop router and the Target Address MUST be the
router's link-local address so that hosts can
uniquely identify routers.

According to this definition, when a router redirect to a host, the
target address either the better first-hop router's link-local address
or the same as the ICMP destination address field. But the function of
ndisc_send_redirect() in net/ipv6/ndisc.c, does not check the target
address correctly.

There is another definition about receive Redirect message in RFC2461:

8.1. Validation of Redirect Messages

A host MUST silently discard any received Redirect message that does
not satisfy all of the following validity checks:
......
- The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).
......

And the receive redirect function of ndisc_redirect_rcv() implemented
this definition, checks the target address correctly.
if (ipv6_addr_equal(dest, target)) {
on_link = 1;
} else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
ND_PRINTK2(KERN_WARNING
"ICMPv6 Redirect: target address is not link-local.\n");
return;
}

So, I think the send redirect function must check the target address
also.

Signed-off-by: Li Yewang <lyw@nanjing-fnst.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1f29bcd739972f71f2fd5d5d265daf3e1208fa5e 10-Dec-2006 Alexey Dobriyan <adobriyan@gmail.com> [PATCH] sysctl: remove unused "context" param

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: David Howells <dhowells@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
e69a4adc669fe210817ec50ae3f9a7a5ad62d4e8 15-Nov-2006 Al Viro <viro@zeniv.linux.org.uk> [IPV6]: Misc endianness annotations.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
a11d206d0f88e092419877c7f706cafb5e1c2e57 04-Nov-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Per-interface statistics support.

For IP MIB (RFC4293).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
d54a81d341af80875c201890500f727c8188dd9b 03-Dec-2006 David S. Miller <davem@sunset.davemloft.net> [IPV6] NDISC: Calculate packet length correctly for allocation.

MAX_HEADER does not include the ipv6 header length in it,
so we need to add it in explicitly.

With help from YOSHIFUJI Hideaki.

Signed-off-by: David S. Miller <davem@davemloft.net>
36f73d0c3b7efa72cd8b89f2d429ff39bc12f15c 04-Nov-2006 Dmitry Mishin <dim@openvz.org> [IPV6]: Add ndisc_netdev_notifier unregister.

If inet6_init() fails later than ndisc_init() call, or IPv6 module is
unloaded, ndisc_netdev_notifier call remains in the list and will follows in
oops later.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f1a95859a86fcdfd94f8b6dc3255d70d037e1caf 14-Oct-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Remove bogus WARN_ON in Proxy-NA handling.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fc26d0abd5afd2b5268a7dbdbf8be1095ce5703e 22-Sep-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Fix is_router flag setting.

We did not send appropriate IsRouter flag if the forwarding setting is
positive even value. Let's give 1/0 value to ndisc_send_na().

Also, existing users of ndisc_send_na() give 0/1 to override,
we can omit redundant operation in that function.

Bug hinted by Nicolas Dichtel <nicolas.dichtel@6wind.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fbea49e1e2404baa2d88ab47e2db89e49551b53b 22-Sep-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Add proxy_ndp sysctl.

We do not always need proxy NDP functionality even we
enable forwarding.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
62dd93181aaa1d5a501a9cebcb254f44b8a48af7 22-Sep-2006 Ville Nuorvala <vnuorval@tcs.hut.fi> [IPV6] NDISC: Set per-entry is_router flag in Proxy NA.

We have sent NA with router flag from the node-wide forwarding
configuration. This is not appropriate for proxy NA, and it should be
set according to each proxy entry's configuration.

This is used by Mobile IPv6 home agent to support physical home link
in acting as a proxy router for mobile node which is not a router,
for example.

Based on MIPL2 kernel patch.

Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
5f3e6e9e19f50a6910aec2dbd479187aabba04b7 22-Sep-2006 Ville Nuorvala <vnuorval@tcs.hut.fi> [IPV6] NDISC: Avoid updating neighbor cache for proxied address in receiving NA.

This aims at proxying router not updating neighbor cache entry for proxied
address when it receives NA because either the proxied node is off link or
it has already sent a NA to the proxied router.

Based on MIPL2 kernel patch.

Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
af184765848c280c7e6190f45c827c5ea3881126 24-Aug-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Initialize fl with outbound interface to lookup rules properly.

Based on MIPL2 kernel patch.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
5e032e32ecc2e6cb0385dc115ca9bfe5e19a9539 24-Aug-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] NDISC: Take source address into account for redirects.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
e0a1ad73d34fd6dfdb630479400511e9879069c0 22-Aug-2006 Thomas Graf <tgraf@suug.ch> [IPv6] route: Simplify ip6_del_rt()

Provide a simple ip6_del_rt() for the majority of users and
an alternative for the exception via netlink. Avoids code
obfuscation.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
1823730fbc89fadde72a7bb3b7bdf03cc7b8835c 05-Aug-2006 Thomas Graf <tgraf@suug.ch> [IPv4]: Move interface address bits to linux/if_addr.h

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
beb8d13bed80f8388f1a9a107d07ddd342e627e8 05-Aug-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Add flow labeling

This labels the flows that could utilize IPSec xfrms at the points the
flows are defined so that IPSec policy and SAs at the right label can
be used.

The following protos are currently not handled, but they should
continue to be able to use single-labeled IPSec like they currently
do.

ipmr
ip_gre
ipip
igmp
sit
sctp
ip6_tunnel (IPv6 over IPv6 tunnel device)
decnet

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6ab3d5624e172c553004ecc862bfeac16d9d68b7 30-Jun-2006 Jörn Engel <joern@wohnheim.fh-wedel.de> Remove obsolete #include <linux/config.h>

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
09c884d4c3b45cda904c2291d4723074ff523611 21-Mar-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: ROUTE: Add accept_ra_rt_info_max_plen sysctl.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
70ceb4f53929f73746be72f73707cd9f8753e2fc 21-Mar-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: ROUTE: Add experimental support for Route Information Option in RA (RFC4191).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
930d6ff2e2a5f1538448d3b0b2652a8f0c0f6cba 21-Mar-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: ROUTE: Add accept_ra_rtr_pref sysctl.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ebacaaa0fdf4402cdf4c8e569f54af36b6f0aa2d 21-Mar-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: ROUTE: Add support for Router Preference (RFC4191).

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c4fd30eb18666972230689eb30e8f90844bce635 21-Mar-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: ADDRCONF: Add accept_ra_pinfo sysctl.

This controls whether we accept Prefix Information in RAs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
65f5c7c1143fb8eed5bc7e7d8c926346e00fe3c0 21-Mar-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: ROUTE: Add accept_ra_defrtr sysctl.

This controls whether we accept default router information
in RAs.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
46b86a2da0fd14bd49765330df63a62279833acb 13-Jan-2006 Joe Perches <joe@perches.com> [NET]: Use NIP6_FMT in kernel.h

There are errors and inconsistency in the display of NIP6 strings.
ie: net/ipv6/ip6_flowlabel.c

There are errors and inconsistency in the display of NIPQUAD strings too.
ie: net/netfilter/nf_conntrack_ftp.c

This patch:
adds NIP6_FMT to kernel.h
changes all code to use NIP6_FMT
fixes net/ipv6/ip6_flowlabel.c
adds NIPQUAD_FMT to kernel.h
fixes net/netfilter/nf_conntrack_ftp.c
changes a few uses of "%u.%u.%u.%u" to NIPQUAD_FMT for symmetry to NIP6_FMT

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
140e26fcd559f6988e5a9056385eecade19d9b49 05-Oct-2005 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6]: Fix NS handing for proxy/anycast address

Timer set up by pneigh_enqueue() ended up calling ndisc_rcv()
via pndisc_redo(), which clears LOCALLY_ENQUEUED flag in
NEIGH_CB(skb) and NS was queued again.
Let's call ndisc_recv_ns() directly to avoid the loop.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
e104411b82f5c4d19752c335492036abdbf5880d 09-Sep-2005 Patrick McHardy <kaber@trash.net> [XFRM]: Always release dst_entry on error in xfrm_lookup

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
a61bbcf28a8cb0ba56f8193d512f7222e711a294 15-Aug-2005 Patrick McHardy <kaber@trash.net> [NET]: Store skb->timestamp as offset to a base timestamp

Reduces skb size by 8 bytes on 64-bit.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
0d51aa80a9b1db43920c0770c3bb842dd823c005 21-Jun-2005 Jamal Hadi Salim <hadi@cyberus.ca> [IPV6]: V6 route events reported with wrong netlink PID and seq number

Essentially netlink at the moment always reports a pid and sequence of 0
always for v6 route activities.
To understand the repurcassions of this look at:
http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html

While fixing this, i took the liberty to resolve the outstanding issue
of IPV6 routes inserted via ioctls to have the correct pids as well.

This patch tries to behave as close as possible to the v4 routes i.e
maintains whatever PID the socket issuing the command owns as opposed to
the process. That made the patch a little bulky.

I have tested against both netlink derived utility to add/del routes as
well as ioctl derived one. The Quagga folks have tested against quagga.
This fixes the problem and so far hasnt been detected to introduce any
new issues.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!