Cross Reference: /net/ipv6/xfrm6

History log of /net/ipv6/xfrm6_policy.c
Revision	Date	Author	Comments
789f202326640814c52f82e80cef3584d8254623	22-Oct-2014	Li RongQing <roy.qing.li@gmail.com>	xfrm6: fix a potential use after free in xfrm6_policy.c pskb_may_pull() maybe change skb->data and make nh and exthdr pointer oboslete, so recompute the nd and exthdr Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
67ba4152e8b77eada6a9c64e3c2c84d6112794fc	24-Aug-2014	Ian Morris <ipm@chirality.org.uk>	ipv6: White-space cleansing : Line Layouts This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. A number of items are addressed in this patch: * Multiple spaces converted to tabs * Spaces before tabs removed. * Spaces in pointer typing cleansed (char )foo etc. Remove space after sizeof * Ensure spacing around comparators such as if statements. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
7e14ea1521d9249d9de7f0ea39c9af054745eebd	14-Mar-2014	Steffen Klassert <steffen.klassert@secunet.com>	xfrm6: Add IPsec protocol multiplexer This patch adds an IPsec protocol multiplexer for ipv6. With this it is possible to add alternative protocol handlers, as needed for IPsec virtual tunnel interfaces. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
84502b5ef9849a9694673b15c31bd3ac693010ae	30-Oct-2013	Steffen Klassert <steffen.klassert@secunet.com>	xfrm: Fix null pointer dereference when decoding sessions On some codepaths the skb does not have a dst entry when xfrm_decode_session() is called. So check for a valid skb_dst() before dereferencing the device interface index. We use 0 as the device index if there is no valid skb_dst(), or at reverse decoding we use skb_iif as device interface index. Bug was introduced with git commit bafd4bd4dc ("xfrm: Decode sessions with output interface."). Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
eeb1b73378b560e00ff1da2ef09fed9254f4e128	25-Oct-2013	Steffen Klassert <steffen.klassert@secunet.com>	xfrm: Increase the garbage collector threshold With the removal of the routing cache, we lost the option to tweak the garbage collector threshold along with the maximum routing cache size. So git commit 703fb94ec ("xfrm: Fix the gc threshold value for ipv4") moved back to a static threshold. It turned out that the current threshold before we start garbage collecting is much to small for some workloads, so increase it from 1024 to 32768. This means that we start the garbage collector if we have more than 32768 dst entries in the system and refuse new allocations if we are above 65536. Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
bafd4bd4dcfa13145db7f951251eef3e10f8c278	09-Sep-2013	Steffen Klassert <steffen.klassert@secunet.com>	xfrm: Decode sessions with output interface. The output interface matching does not work on forward policy lookups, the output interface of the flowi is always 0. Fix this by setting the output interface when we decode the session. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
84c4a9dfbf430861e7588d95ae3ff61535dca351	10-May-2013	Cong Wang <amwang@redhat.com>	xfrm6: release dev before returning error We forget to call dev_put() on error path in xfrm6_fill_dst(), its caller doesn't handle this. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
18cf0d0784b4a634472ed24d0d7ca1c721d93e90	18-Feb-2013	Romain KUNTZ <r.kuntz@ipflavors.com>	xfrm: release neighbor upon dst destruction Neighbor is cloned in xfrm6_fill_dst but seems to never be released. Neighbor entry should be released when XFRM6 dst entry is destroyed in xfrm6_dst_destroy, otherwise references may be kept forever on the device pointed by the neighbor entry. I may not have understood all the subtleties of XFRM & dst so I would be happy to receive comments on this patch. Signed-off-by: Romain Kuntz <r.kuntz@ipflavors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
8d068875caca3b507ffa8a57d521483fd4eebcc7	06-Feb-2013	Michal Kubecek <mkubecek@suse.cz>	xfrm: make gc_thresh configurable in all namespaces The xfrm gc threshold can be configured via xfrm{4,6}_gc_thresh sysctl but currently only in init_net, other namespaces always use the default value. This can substantially limit the number of IPsec tunnels that can be effectively used. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
887c95cc1da53f66a5890fdeab13414613010097	17-Jan-2013	YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>	ipv6: Complete neighbour entry removal from dst_entry. CC: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
0afe21fdf6cfe0fe8a184d82a399773cc331bf40	16-Nov-2012	Steffen Klassert <steffen.klassert@secunet.com>	xfrm6: Remove commented out function call to xfrm6_input_fini xfrm6_input_fini() is not in the tree since more than 10 years, so remove the commented out function call. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
c38132865959c47d36b5db1c9289c7391895be6b	15-Nov-2012	Steffen Klassert <steffen.klassert@secunet.com>	xfrm: Use a static gc threshold value for ipv6 Unlike ipv4 did, ipv6 does not handle the maximum number of cached routes dynamically. So no need to try to handle the IPsec gc threshold value dynamically. This patch sets the IPsec gc threshold value back to 1024 routes, as it is for non-IPsec routes. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
07a936260a94ae4798527ce8f79d4f3b589ab8a3	29-Oct-2012	Amerigo Wang <amwang@redhat.com>	ipv6: use IS_ENABLED() #if defined(CONFIG_FOO) \|\| defined(CONFIG_FOO_MODULE) can be replaced by #if IS_ENABLED(CONFIG_FOO) Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
9d7b0fc1ef1f17aff57c0dc9a59453d8fca255c3	20-Aug-2012	Patrick McHardy <kaber@trash.net>	net: ipv6: fix oops in inet_putpeer() Commit 97bab73f (inet: Hide route peer accesses behind helpers.) introduced a bug in xfrm6_policy_destroy(). The xfrm_dst's _rt6i_peer member is not initialized, causing a false positive result from inetpeer_ptr_is_peer(), which in turn causes a NULL pointer dereference in inet_putpeer(). Pid: 314, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #17 To Be Filled By O.E.M. To Be Filled By O.E.M./P4S800D-X EIP: 0060:[<c03abf93>] EFLAGS: 00010246 CPU: 0 EIP is at inet_putpeer+0xe/0x16 EAX: 00000000 EBX: f3481700 ECX: 00000000 EDX: 000dd641 ESI: f3481700 EDI: c05e949c EBP: f551def4 ESP: f551def4 DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068 CR0: 8005003b CR2: 00000070 CR3: 3243d000 CR4: 00000750 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 f551df04 c0423de1 00000000 f3481700 f551df18 c038d5f7 f254b9f8 f551df28 f34f85d8 f551df20 c03ef48d f551df3c c0396870 f30697e8 f24e1738 c05e98f4 f5509540 c05cd2b4 f551df7c c0142d2b c043feb5 f5509540 00000000 c05cd2e8 [<c0423de1>] xfrm6_dst_destroy+0x42/0xdb [<c038d5f7>] dst_destroy+0x1d/0xa4 [<c03ef48d>] xfrm_bundle_flo_delete+0x2b/0x36 [<c0396870>] flow_cache_gc_task+0x85/0x9f [<c0142d2b>] process_one_work+0x122/0x441 [<c043feb5>] ? apic_timer_interrupt+0x31/0x38 [<c03967eb>] ? flow_cache_new_hashrnd+0x2b/0x2b [<c0143e2d>] worker_thread+0x113/0x3cc Fix by adding a init_dst() callback to struct xfrm_policy_afinfo to properly initialize the dst's peer pointer. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
6700c2709c08d74ae2c3c29b84a30da012dbc7f1	17-Jul-2012	David S. Miller <davem@davemloft.net>	net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>
ec18d9a2691d69cd14b48f9b919fddcef28b7f5c	12-Jul-2012	David S. Miller <davem@davemloft.net>	ipv6: Add redirect support to all protocol icmp error handlers. Signed-off-by: David S. Miller <davem@davemloft.net>
97cac0821af4474ec4ba3a9e7a36b98ed9b6db88	03-Jul-2012	David S. Miller <davem@davemloft.net>	ipv6: Store route neighbour in rt6_info struct. This makes for a simplified conversion away from dst_get_neighbour(). All code outside of ipv6 will use neigh lookups via dst_neigh_lookup(). Signed-off-by: David S. Miller <davem@davemloft.net>
97bab73f987e2781129cd6f4b6379bf44d808cc6	10-Jun-2012	David S. Miller <davem@davemloft.net>	inet: Hide route peer accesses behind helpers. We encode the pointer(s) into an unsigned long with one state bit. The state bit is used so we can store the inetpeer tree root to use when resolving the peer later. Later the peer roots will be per-FIB table, and this change works to facilitate that. Signed-off-by: David S. Miller <davem@davemloft.net>
ec8f23ce0f4005b74013d4d122e0d540397a93c9	19-Apr-2012	Eric W. Biederman <ebiederm@xmission.com>	net: Convert all sysctl registrations to register_net_sysctl This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
4e3fd7a06dc20b2d8ec6892233ad2012968fe7b6	21-Nov-2011	Alexey Dobriyan <adobriyan@gmail.com>	net: remove ipv6_addr_copy() C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
b71d1d426d263b0b6cb5760322efebbfc89d4463	22-Apr-2011	Eric Dumazet <eric.dumazet@gmail.com>	inet: constify ip headers and in6_addr Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
1958b856c1a59c0f1e892b92debb8c9fe4f364dc	12-Mar-2011	David S. Miller <davem@davemloft.net>	net: Put fl6_* macros to struct flowi6 and use them again. Signed-off-by: David S. Miller <davem@davemloft.net>
4c9483b2fb5d2548c3cc1fe03cdd4484ceeb5d1c	12-Mar-2011	David S. Miller <davem@davemloft.net>	ipv6: Convert to use flowi6 where applicable. Signed-off-by: David S. Miller <davem@davemloft.net>
7e1dc7b6f709dfc1a9ab4b320dbe723f45992693	12-Mar-2011	David S. Miller <davem@davemloft.net>	net: Use flowi4 and flowi6 in xfrm layer. Signed-off-by: David S. Miller <davem@davemloft.net>
6281dcc94a96bd73017b2baa8fa83925405109ef	12-Mar-2011	David S. Miller <davem@davemloft.net>	net: Make flowi ports AF dependent. Create two sets of port member accessors, one set prefixed by fl4_* and the other prefixed by fl6_* This will let us to create AF optimal flow instances. It will work because every context in which we access the ports, we have to be fully aware of which AF the flowi is anyways. Signed-off-by: David S. Miller <davem@davemloft.net>
1d28f42c1bd4bb2363d88df74d0128b4da135b4a	12-Mar-2011	David S. Miller <davem@davemloft.net>	net: Put flowi_* prefix on AF independent members of struct flowi I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
2774c131b1d19920b4587db1cfbd6f0750ad1f15	01-Mar-2011	David S. Miller <davem@davemloft.net>	xfrm: Handle blackhole route creation via afinfo. That way we don't have to potentially do this in every xfrm_lookup() caller. Signed-off-by: David S. Miller <davem@davemloft.net>
5e6b930f21b0a442f9d5db97c8314b4d91be1c27	24-Feb-2011	David S. Miller <davem@davemloft.net>	xfrm: Const'ify address arguments to ->dst_lookup() Signed-off-by: David S. Miller <davem@davemloft.net>
0c7b3eefb4ab8df245e94feb0d83c1c3450a3d87	23-Feb-2011	David S. Miller <davem@davemloft.net>	xfrm: Mark flowi arg to ->fill_dst() const. Signed-off-by: David S. Miller <davem@davemloft.net>
05d8402576c9c1b85bfc9e4f9d6a21c27ccbd5b1	23-Feb-2011	David S. Miller <davem@davemloft.net>	xfrm: Mark flowi arg to ->get_tos() const. Signed-off-by: David S. Miller <davem@davemloft.net>
62fa8a846d7de4b299232e330c74b7783539df76	27-Jan-2011	David S. Miller <davem@davemloft.net>	net: Implement read-only protection and COW'ing of metrics. Routing metrics are now copy-on-write. Initially a route entry points it's metrics at a read-only location. If a routing table entry exists, it will point there. Else it will point at the all zero metric place-holder called 'dst_default_metrics'. The writeability state of the metrics is stored in the low bits of the metrics pointer, we have two bits left to spare if we want to store more states. For the initial implementation, COW is implemented simply via kmalloc. However future enhancements will change this to place the writable metrics somewhere else, in order to increase sharing. Very likely this "somewhere else" will be the inetpeer cache. Note also that this means that metrics updates may transiently fail if we cannot COW the metrics successfully. But even by itself, this patch should decrease memory usage and increase cache locality especially for routing workloads. In those cases the read-only metric copies stay in place and never get written to. TCP workloads where metrics get updated, and those rare cases where PMTU triggers occur, will take a very slight performance hit. But that hit will be alleviated when the long-term writable metrics move to a more sharable location. Since the metrics storage went from a u32 array of RTAX_MAX entries to what is essentially a pointer, some retooling of the dst_entry layout was necessary. Most importantly, we need to preserve the alignment of the reference count so that it doesn't share cache lines with the read-mostly state, as per Eric Dumazet's alignment assertion checks. The only non-trivial bit here is the move of the 'flags' member into the writeable cacheline. This is OK since we are always accessing the flags around the same moment when we made a modification to the reference count. Signed-off-by: David S. Miller <davem@davemloft.net>
7cc2edb83447775a34ed3bf9d29d8295a434b523	26-Jan-2011	David S. Miller <davem@davemloft.net>	xfrm6: Don't forget to propagate peer into ipsec route. Like ipv4, we have to propagate the ipv6 route peer into the ipsec top-level route during instantiation. Signed-off-by: David S. Miller <davem@davemloft.net>
fc66f95c68b6d4535a0ea2ea15d5cf626e310956	08-Oct-2010	Eric Dumazet <eric.dumazet@gmail.com>	net dst: use a percpu_counter to track entries struct dst_ops tracks number of allocated dst in an atomic_t field, subject to high cache line contention in stress workload. Switch to a percpu_counter, to reduce number of time we need to dirty a central location. Place it on a separate cache line to avoid dirtying read only fields. Stress test : (Sending 160.000.000 UDP frames, IP route cache disabled, dual E5540 @2.53GHz, 32bit kernel, FIB_TRIE, SLUB/NUMA) Before: real 0m51.179s user 0m15.329s sys 10m15.942s After: real 0m45.570s user 0m15.525s sys 9m56.669s With a small reordering of struct neighbour fields, subject of a following patch, (to separate refcnt from other read mostly fields) real 0m41.841s user 0m15.261s sys 8m45.949s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
a02cec2155fbea457eca8881870fd2de1a4c4c76	22-Sep-2010	Eric Dumazet <eric.dumazet@gmail.com>	net: return operator cleanup Change "return (EXPR);" to "return EXPR;" return is not a function, parentheses are not required. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
44b451f1633896de15d2d52e1a2bd462e80b7814	02-Jul-2010	Peter Kosyh <p.kosyh@gmail.com>	xfrm: fix xfrm by MARK logic While using xfrm by MARK feature in 2.6.34 - 2.6.35 kernels, the mark is always cleared in flowi structure via memset in _decode_session4 (net/ipv4/xfrm4_policy.c), so the policy lookup fails. IPv6 code is affected by this bug too. Signed-off-by: Peter Kosyh <p.kosyh@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bc8e4b954e463716a57d8113dd50ae9d47b682a7	22-Apr-2010	Nicolas Dichtel <nicolas.dichtel@6wind.com>	xfrm6: ensure to use the same dev when building a bundle When building a bundle, we set dst.dev and rt6.rt6i_idev. We must ensure to set the same device for both fields. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
80c802f3073e84c956846e921e8a0b02dfa3755f	07-Apr-2010	Timo Teräs <timo.teras@iki.fi>	xfrm: cache bundles instead of policies for outgoing flows __xfrm_lookup() is called for each packet transmitted out of system. The xfrm_find_bundle() does a linear search which can kill system performance depending on how many bundles are required per policy. This modifies __xfrm_lookup() to store bundles directly in the flow cache. If we did not get a hit, we just create a new bundle instead of doing slow search. This means that we can now get multiple xfrm_dst's for same flow (on per-cpu basis). Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
87c1e12b5eeb7b30b4b41291bef8e0b41fc3dde9	02-Mar-2010	Herbert Xu <herbert@gondor.apana.org.au>	ipsec: Fix bogus bundle flowi When I merged the bundle creation code, I introduced a bogus flowi value in the bundle. Instead of getting from the caller, it was instead set to the flow in the route object, which is totally different. The end result is that the bundles we created never match, and we instead end up with an ever growing bundle list. Thanks to Jamal for find this problem. Reported-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
d7c7544c3d5f59033d1bf3236bc7b289f5f26b75	25-Jan-2010	Alexey Dobriyan <adobriyan@gmail.com>	netns xfrm: deal with dst entries in netns GC is non-existent in netns, so after you hit GC threshold, no new dst entries will be created until someone triggers cleanup in init_net. Make xfrm4_dst_ops and xfrm6_dst_ops per-netns. This is not done in a generic way, because it woule waste (AF_MAX - 2) * sizeof(struct dst_ops) bytes per-netns. Reorder GC threshold initialization so it'd be done before registering XFRM policies. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
f8572d8f2a2ba75408b97dc24ef47c83671795d7	05-Nov-2009	Eric W. Biederman <ebiederm@xmission.com>	sysctl net: Remove unused binary sysctl code Now that sys_sysctl is a compatiblity wrapper around /proc/sys all sysctl strategy routines, and all ctl_name and strategy entries in the sysctl tables are unused, and can be revmoed. In addition neigh_sysctl_register has been modified to no longer take a strategy argument and it's callers have been modified not to pass one. Cc: "David Miller" <davem@davemloft.net> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: netdev@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>