History log of /include/net/flow.h
Revision Date Author Comments
ba3d8d3f9f65807763b2e0e1ea7645d74a962248 31-Mar-2014 Lorenzo Colitti <lorenzo@google.com> net: core: Support UID-based routing.

This contains the following commits:

1. cc2f522 net: core: Add a UID range to fib rules.
2. d7ed2bd net: core: Use the socket UID in routing lookups.
3. 2f9306a net: core: Add a RTA_UID attribute to routes.
This is so that userspace can do per-UID route lookups.
4. 8e46efb net: ipv6: Use the UID in IPv6 PMTUD
IPv4 PMTUD already does this because ipv4_sk_update_pmtu
uses __build_flow_key, which includes the UID.

Bug: 15413527
Change-Id: I81bd31dae655de9cce7d7a1f9a905dc1c2feba7c
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
6a662719c9868b3d6c7d26b3a085f0cd3cc15e64 16-Apr-2014 Cong Wang <cwang@twopensource.com> ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif

As suggested by Julian:

Simply, flowi4_iif must not contain 0, it does not
look logical to ignore all ip rules with specified iif.

because in fib_rule_match() we do:

if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
goto out;

flowi4_iif should be LOOPBACK_IFINDEX by default.

We need to move LOOPBACK_IFINDEX to include/net/flow.h:

1) It is mostly used by flowi_iif

2) Fix the following compile error if we use it in flow.h
by the patches latter:

In file included from include/linux/netfilter.h:277:0,
from include/net/netns/netfilter.h:5,
from include/net/net_namespace.h:21,
from include/linux/netdevice.h:43,
from include/linux/icmpv6.h:12,
from include/linux/ipv6.h:61,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:27,
from include/linux/nfs_fs.h:30,
from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)

Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4a93f5095a628d812b0b30c16d7bacea1efd783c 12-Mar-2014 Steffen Klassert <steffen.klassert@secunet.com> flowcache: Fix resource leaks on namespace exit.

We leak an active timer, the hotcpu notifier and all allocated
resources when we exit a namespace. Fix this by introducing a
flow_cache_fini() function where we release the resources before
we exit.

Fixes: ca925cf1534e ("flowcache: Make flow cache name space aware")
Reported-by: Jakub Kicinski <moorray3@wp.pl>
Tested-by: Jakub Kicinski <moorray3@wp.pl>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ca925cf1534ebcec332c08719a7dee6ee1782ce4 18-Jan-2014 Fan Du <fan.du@windriver.com> flowcache: Make flow cache name space aware

Inserting a entry into flowcache, or flushing flowcache should be based
on per net scope. The reason to do so is flushing operation from fat
netns crammed with flow entries will also making the slim netns with only
a few flow cache entries go away in original implementation.

Since flowcache is tightly coupled with IPsec, so it would be easier to
put flow cache global parameters into xfrm namespace part. And one last
thing needs to do is bumping flow cache genid, and flush flow cache should
also be made in per net style.

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
0e0d44ab4275549998567cd4700b43f7496eb62b 28-Aug-2013 Steffen Klassert <steffen.klassert@secunet.com> net: Remove FLOWI_FLAG_CAN_SLEEP

FLOWI_FLAG_CAN_SLEEP was used to notify xfrm about the posibility
to sleep until the needed states are resolved. This code is gone,
so FLOWI_FLAG_CAN_SLEEP is not needed anymore.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
4787342c390a8e7570a81a62c9b361d7a4380893 20-Sep-2013 Joe Perches <joe@perches.com> flow.h/flow_keys.h: Remove extern from function prototypes

There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c92b96553a80c1dbe2ebe128bbe37c8f98f148bf 08-Oct-2012 Julian Anastasov <ja@ssi.bg> ipv4: Add FLOWI_FLAG_KNOWN_NH

Add flag to request that output route should be
returned with known rt_gateway, in case we want to use
it as nexthop for neighbour resolving.

The returned route can be cached as follows:

- in NH exception: because the cached routes are not shared
with other destinations
- in FIB NH: when using gateway because all destinations for
NH share same gateway

As last option, to return rt_gateway!=0 we have to
set DST_NOCACHE.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
ba3f7f04ef2b19aace38f855aedd17fe43035d50 17-Jul-2012 David S. Miller <davem@davemloft.net> ipv4: Kill FLOWI_FLAG_RT_NOCACHE and associated code.

Signed-off-by: David S. Miller <davem@davemloft.net>
3e12939a2a67fbb4cbd962c3b9bc398c73319766 10-Jul-2012 David S. Miller <davem@davemloft.net> inet: Kill FLOWI_FLAG_PRECOW_METRICS.

No longer needed. TCP writes metrics, but now in it's own special
cache that does not dirty the route metrics. Therefore there is no
longer any reason to pre-cow metrics in this way.

Signed-off-by: David S. Miller <davem@davemloft.net>
7586eceb0abc0ea1c2b023e3e5d4dfd4ff40930a 20-Jun-2012 Eric Dumazet <edumazet@google.com> ipv4: tcp: dont cache output dst for syncookies

Don't cache output dst for syncookies, as this adds pressure on IP route
cache and rcu subsystem for no gain.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e6b45241c57a83197e5de9166b3b0d32ac562609 04-Feb-2012 Julian Anastasov <ja@ssi.bg> ipv4: reset flowi parameters on route connect

Eric Dumazet found that commit 813b3b5db83
(ipv4: Use caller's on-stack flowi as-is in output
route lookups.) that comes in 3.0 added a regression.
The problem appears to be that resulting flowi4_oif is
used incorrectly as input parameter to some routing lookups.
The result is that when connecting to local port without
listener if the IP address that is used is not on a loopback
interface we incorrectly assign RTN_UNICAST to the output
route because no route is matched by oif=lo. The RST packet
can not be sent immediately by tcp_v4_send_reset because
it expects RTN_LOCAL.

So, change ip_route_connect and ip_route_newports to
update the flowi4 fields that are input parameters because
we do not want unnecessary binding to oif.

To make it clear what are the input parameters that
can be modified during lookup and to show which fields of
floiw4 are reused add a new function to update the flowi4
structure: flowi4_update_output.

Thanks to Yurij M. Plotnikov for providing a bug report including a
program to reproduce the problem.

Thanks to Eric Dumazet for tracking the problem down to
tcp_v4_send_reset and providing initial fix.

Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
747465ef7a082033e086dedc8189febfda43b015 16-Jan-2012 Eric Dumazet <eric.dumazet@gmail.com> net: fix some sparse errors

make C=2 CF="-D__CHECK_ENDIAN__" M=net

And fix flowi4_init_output() prototype for sport

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c0ed1c14a72ca9ebacd51fb94a8aca488b0d361e 21-Dec-2011 Steffen Klassert <steffen.klassert@secunet.com> net: Add a flow_cache_flush_deferred function

flow_cach_flush() might sleep but can be called from
atomic context via the xfrm garbage collector. So add
a flow_cache_flush_deferred() function and use this if
the xfrm garbage colector is invoked from within the
packet path.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
84f9307c5da84a7c8513e7607dc8427d2cbd63e3 30-Nov-2011 Eric Dumazet <eric.dumazet@gmail.com> ipv4: use a 64bit load/store in output path

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields wont
break the rule.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
aa1c366e4febc7f5c2b84958a2dd7cd70e28f9d0 05-Sep-2011 dpward <david.ward@ll.mit.edu> net: Handle different key sizes between address families in flow cache

With the conversion of struct flowi to a union of AF-specific structs, some
operations on the flow cache need to account for the exact size of the key.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
728871bc05afc8ff310b17dba3e57a2472792b13 05-Sep-2011 David Ward <david.ward@ll.mit.edu> net: Align AF-specific flowi structs to long

AF-specific flowi structs are now passed to flow_key_compare, which must
also be aligned to a long.

Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
60063497a95e716c9a689af3be2687d261f115b4 27-Jul-2011 Arun Sharma <asharma@fb.com> atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>

Signed-off-by: Arun Sharma <asharma@fb.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
9b12c75bf4d58dd85c987ee7b6a4356fdc7c1222 01-Apr-2011 David S. Miller <davem@davemloft.net> net: Order ports in same order as addresses in flow objects.

For consistency.

Signed-off-by: David S. Miller <davem@davemloft.net>
83229aa5e2c242163599266a686483e3b91ec07e 31-Mar-2011 David S. Miller <davem@davemloft.net> net: Add helper flowi4_init_output().

On-stack initialization via assignment of flow structures are
expensive because GCC emits a memset() to clear the entire
structure out no matter what.

Add a helper for ipv4 output flow key setup which we can use to avoid
the memset.

Signed-off-by: David S. Miller <davem@davemloft.net>
bef55aebd560c5a6f8883c421abccee39978c58c 12-Mar-2011 David S. Miller <davem@davemloft.net> decnet: Convert to use flowidn where applicable.

Signed-off-by: David S. Miller <davem@davemloft.net>
1958b856c1a59c0f1e892b92debb8c9fe4f364dc 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Put fl6_* macros to struct flowi6 and use them again.

Signed-off-by: David S. Miller <davem@davemloft.net>
9cce96df5b76691712dba22e83ff5efe900361e1 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Put fl4_* macros to struct flowi4 and use them again.

Signed-off-by: David S. Miller <davem@davemloft.net>
2032656e76b5355151effdff14de4a1a58643915 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Add flowi6_* member helper macros.

Signed-off-by: David S. Miller <davem@davemloft.net>
22bd5b9b13f2931ac80949f8bfbc40e8cab05be7 12-Mar-2011 David S. Miller <davem@davemloft.net> ipv4: Pass ipv4 flow objects into fib_lookup() paths.

To start doing these conversions, we need to add some temporary
flow4_* macros which will eventually go away when all the protocol
code paths are changed to work on AF specific flowi objects.

Signed-off-by: David S. Miller <davem@davemloft.net>
59b1a94c9a034e63a5e030a5154be1d4d84677d9 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Add flowiX_to_flowi() shorthands.

This is just a shorthand which will help in passing around AF
specific flow structures as generic ones.

Signed-off-by: David S. Miller <davem@davemloft.net>
56bb8059e1a8bf291054c26367564dc302f6fd8f 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Break struct flowi out into AF specific instances.

Now we have struct flowi4, flowi6, and flowidn for each address
family. And struct flowi is just a union of them all.

It might have been troublesome to convert flow_cache_uli_match() but
as it turns out this function is completely unused and therefore can
be simply removed.

Signed-off-by: David S. Miller <davem@davemloft.net>
6281dcc94a96bd73017b2baa8fa83925405109ef 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Make flowi ports AF dependent.

Create two sets of port member accessors, one set prefixed by fl4_*
and the other prefixed by fl6_*

This will let us to create AF optimal flow instances.

It will work because every context in which we access the ports,
we have to be fully aware of which AF the flowi is anyways.

Signed-off-by: David S. Miller <davem@davemloft.net>
08704bcbf022786532b5f188935ab6619906049f 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Create union flowi_uli

This will be used when we have seperate flowi types.

Signed-off-by: David S. Miller <davem@davemloft.net>
806566cc78390b1565ded91712cd28619cea5f57 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Create struct flowi_common

Pull out the AF independent members of struct flowi into a
new struct flowi_common

Signed-off-by: David S. Miller <davem@davemloft.net>
1d28f42c1bd4bb2363d88df74d0128b4da135b4a 12-Mar-2011 David S. Miller <davem@davemloft.net> net: Put flowi_* prefix on AF independent members of struct flowi

I intend to turn struct flowi into a union of AF specific flowi
structs. There will be a common structure that each variant includes
first, much like struct sock_common.

This is the first step to move in that direction.

Signed-off-by: David S. Miller <davem@davemloft.net>
fbef0a40919a80eb8a02fe9d3b96dfdcdebf4317 11-Mar-2011 David S. Miller <davem@davemloft.net> net: Remove unnecessary padding in struct flowi

Move tos, scope, proto, and flags to the beginning of
the structure.

Signed-off-by: David S. Miller <davem@davemloft.net>
5df65e5567a497a28067019b8ff08f98fb026629 01-Mar-2011 David S. Miller <davem@davemloft.net> net: Add FLOWI_FLAG_CAN_SLEEP.

And set is in contexts where the route resolution can sleep.

Signed-off-by: David S. Miller <davem@davemloft.net>
dee9f4bceb5fd9dbfcc1567148fccdbf16d6a38a 23-Feb-2011 David S. Miller <davem@davemloft.net> net: Make flow cache paths use a const struct flowi.

Signed-off-by: David S. Miller <davem@davemloft.net>
0730b9a1504cb76f80c97d90ff82f8daeb1243a3 23-Feb-2011 David S. Miller <davem@davemloft.net> net: Mark flowi arg to flow_cache_uli_match() const.

Signed-off-by: David S. Miller <davem@davemloft.net>
a4daad6b0923030fbd3b00a01f570e4c3eef446b 28-Jan-2011 David S. Miller <davem@davemloft.net> net: Pre-COW metrics for TCP.

TCP is going to record metrics for the connection,
so pre-COW the route metrics at route cache entry
creation time.

This avoids several atomic operations that have to
occur if we COW the metrics after the entry reaches
global visibility.

Signed-off-by: David S. Miller <davem@davemloft.net>
e058464990c2ef1f3ecd6b83a154913c3c06f02a 23-Dec-2010 David S. Miller <davem@davemloft.net> Revert "ipv4: Allow configuring subnets as local addresses"

This reverts commit 4465b469008bc03b98a1b8df4e9ae501b6c69d4b.

Conflicts:

net/ipv4/fib_frontend.c

As reported by Ben Greear, this causes regressions:

> Change 4465b469008bc03b98a1b8df4e9ae501b6c69d4b caused rules
> to stop matching the input device properly because the
> FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find().
>
> This breaks rules such as:
>
> ip rule add pref 512 lookup local
> ip rule del pref 0 lookup local
> ip link set eth2 up
> ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2
> ip rule add to 172.16.0.102 iif eth2 lookup local pref 10
> ip rule add iif eth2 lookup 10001 pref 20
> ip route add 172.16.0.0/24 dev eth2 table 10001
> ip route add unreachable 0/0 table 10001
>
> If you had a second interface 'eth0' that was on a different
> subnet, pinging a system on that interface would fail:
>
> [root@ct503-60 ~]# ping 192.168.100.1
> connect: Invalid argument

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc9ff19da9bf76a2f70bcb80225a1c587c162e52 03-Nov-2010 Timo Teräs <timo.teras@iki.fi> xfrm: use gre key as flow upper protocol info

The GRE Key field is intended to be used for identifying an individual
traffic flow within a tunnel. It is useful to be able to have XFRM
policy selector matches to have different policies for different
GRE tunnels.

Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
4465b469008bc03b98a1b8df4e9ae501b6c69d4b 23-May-2010 Tom Herbert <therbert@google.com> ipv4: Allow configuring subnets as local addresses

This patch allows a host to be configured to respond to any address in
a specified range as if it were local, without actually needing to
configure the address on an interface. This is done through routing
table configuration. For instance, to configure a host to respond
to any address in 10.1/16 received on eth0 as a local address we can do:

ip rule add from all iif eth0 lookup 200
ip route add local 10.1/16 dev lo proto kernel scope host src 127.0.0.1 table 200

This host is now reachable by any 10.1/16 address (route lookup on
input for packets received on eth0 can find the route). On output, the
rule will not be matched so that this host can still send packets to
10.1/16 (not sent on loopback). Presumably, external routing can be
configured to make sense out of this.

To make this work, we needed to modify the logic in finding the
interface which is assigned a given source address for output
(dev_ip_find). We perform a normal fib_lookup instead of just a
lookup on the local table, and in the lookup we ignore the input
interface for matching.

This patch is useful to implement IP-anycast for subnets of virtual
addresses.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fe1a5f031e76bd8761a7803d75b95ee96e84a574 07-Apr-2010 Timo Teräs <timo.teras@iki.fi> flow: virtualize flow cache entry methods

This allows to validate the cached object before returning it.
It also allows to destruct object properly, if the last reference
was held in flow cache. This is also a prepartion for caching
bundles in the flow cache.

In return for virtualizing the methods, we save on:
- not having to regenerate the whole flow cache on policy removal:
each flow matching a killed policy gets refreshed as the getter
function notices it smartly.
- we do not have to call flow_cache_flush from policy gc, since the
flow cache now properly deletes the object if it had any references

Signed-off-by: Timo Teras <timo.teras@iki.fi>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
52479b623d3d41df84c499325b6a8c7915413032 26-Nov-2008 Alexey Dobriyan <adobriyan@gmail.com> netns xfrm: lookup in netns

Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns
to flow_cache_lookup() and resolver callback.

Take it from socket or netdevice. Stub DECnet to init_net.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a210d01ae3ee006b59e54e772a7f212486e0f021 01-Oct-2008 Julian Anastasov <ja@ssi.bg> ipv4: Loosen source address check on IPv4 output

ip_route_output() contains a check to make sure that no flows with
non-local source IP addresses are routed. This obviously makes using
such addresses impossible.

This patch introduces a flowi flag which makes omitting this check
possible. The new flag provides a way of handling transparent and
non-transparent connections differently.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
95c3e8bfcdea8676e2d4d61910c379f4502049bf 05-Aug-2008 Rami Rosen <ramirose@gmail.com> ipv4: remove unused field in struct flowi (include/net/flow.h).

This patch removes an unused field (flags) from struct flowi; it seems
that this "flags" field was used once in the past for multipath
routing with FLOWI_FLAG_MULTIPATHOLDROUTE flag (which does no longer
exist); however, the "flags" field of struct flowi is not used
anymore.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
61f1ab41b8ede8e2a26c349a4e3372100545c5ec 31-Dec-2007 Rami Rosen <ramirose@gmail.com> [IPV4]: Remove unused multipath cached routing defintion in net/flow.h

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
136ebf08b46f839e2dc9db34322b654e5d9b9936 27-Jun-2007 Masahide NAKAMURA <nakam@linux-ipv6.org> [IPV6] MIP6: Kill unnecessary ifdefs.

Kill unnecessary CONFIG_IPV6_MIP6.

o It is redundant for RAW socket to keep MH out with the config then
it can handle any protocol.
o Clean-up at AH.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
157bfc25020f7eb731f94140e099307ade47299e 30-Apr-2007 Masahide NAKAMURA <nakam@linux-ipv6.org> [XFRM]: Restrict upper layer information by bundle.

On MIPv6 usage, XFRM sub policy is enabled.
When main (IPsec) and sub (MIPv6) policy selectors have the same
address set but different upper layer information (i.e. protocol
number and its ports or type/code), multiple bundle should be created.
However, currently we have issue to use the same bundle created for
the first time with all flows covered by the case.

It is useful for the bundle to have the upper layer information
to be restructured correctly if it does not match with the flow.

1. Bundle was created by two policies
Selector from another policy is added to xfrm_dst.
If the flow does not match the selector, it goes to slow path to
restructure new bundle by single policy.

2. Bundle was created by one policy
Flow cache is added to xfrm_dst as originated one. If the flow does
not match the cache, it goes to slow path to try searching another
policy.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
47dcf0cb1005e86d0eea780f2984b2e7490f63cd 10-Nov-2006 Thomas Graf <tgraf@suug.ch> [NET]: Rethink mark field in struct flowi

Now that all protocols have been made aware of the mark
field it can be moved out of the union thus simplyfing
its usage.

The config options in the IPv4/IPv6/DECnet subsystems
to enable respectively disable mark based routing only
obfuscate the code with ifdefs, the cost for the
additional comparison in the flow key is insignificant,
and most distributions have all these options enabled
by default anyway. Therefore it makes sense to remove
the config options and enable mark based routing by
default.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
90bcaf7b4a33bb9b100cc06869f0c033a870d4a0 08-Nov-2006 Al Viro <viro@zeniv.linux.org.uk> [IPV6]: flowlabels are net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
185b1aa122f87052d9154bb74990bc785372a750 22-Oct-2006 Eric Dumazet <dada1@cosmosbay.com> [NET]: Reduce sizeof(struct flowi) by 20 bytes.

As suggested by David, just kill off some unused fields in dnports to
reduce sizef(struct flowi). If they come back, they should be moved to
nl_u.dn_u in order not to enlarge again struct flowi

[ Modified to really delete this stuff instead of using #if 0. -DaveM ]

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
134b0fc544ba062498451611cb6f3e4454221b3d 05-Oct-2006 James Morris <jmorris@namei.org> IPsec: propagate security module errors up from flow_cache_lookup

When a security module is loaded (in this case, SELinux), the
security_xfrm_policy_lookup() hook can return an access denied permission
(or other error). We were not handling that correctly, and in fact
inverting the return logic and propagating a false "ok" back up to
xfrm_lookup(), which then allowed packets to pass as if they were not
associated with an xfrm policy.

The way I was seeing the problem was when connecting via IPsec to a
confined service on an SELinux box (vsftpd), which did not have the
appropriate SELinux policy permissions to send packets via IPsec.

The first SYNACK would be blocked, because of an uncached lookup via
flow_cache_lookup(), which would fail to resolve an xfrm policy because
the SELinux policy is checked at that point via the resolver.

However, retransmitted SYNACKs would then find a cached flow entry when
calling into flow_cache_lookup() with a null xfrm policy, which is
interpreted by xfrm_lookup() as the packet not having any associated
policy and similarly to the first case, allowing it to pass without
transformation.

The solution presented here is to first ensure that errno values are
correctly propagated all the way back up through the various call chains
from security_xfrm_policy_lookup(), and handled correctly.

Then, flow_cache_lookup() is modified, so that if the policy resolver
fails (typically a permission denied via the security module), the flow
cache entry is killed rather than having a null policy assigned (which
indicates that the packet can pass freely). This also forces any future
lookups for the same flow to consult the security module (e.g. SELinux)
for current security policy (rather than, say, caching the error on the
flow cache entry).

Signed-off-by: James Morris <jmorris@namei.org>
4324a174304d1d826582be5422a976e09d2b1e63 28-Sep-2006 Al Viro <viro@zeniv.linux.org.uk> [XFRM]: fl_ipsec_spi is net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc939d37349bf82891bd1f4558284d7fafe7acb2 28-Sep-2006 Al Viro <viro@zeniv.linux.org.uk> [NET]: ip ports in struct flowi are net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
f2c3fe24119ee4f8faca08699f0488f500014a27 27-Sep-2006 Al Viro <viro@zeniv.linux.org.uk> [IPV4]: annotate ipv4 addresses in struct rtable and struct flowi

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
75bff8f023e02b045a8f68f36fa7da98dca124b8 21-Aug-2006 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [IPV6] ROUTE: Routing by FWMARK.

Based on patch by Jean Lorchat <lorchat@sfc.wide.ad.jp>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2b741653b6c824fe7520ee92b6795f11c5f24b24 24-Aug-2006 Masahide NAKAMURA <nakam@linux-ipv6.org> [IPV6] MIP6: Add Mobility header definition.

Add Mobility header definition for Mobile IPv6.
Based on MIPL2 kernel patch.

This patch was also written by: Antti Tuominen <anttit@tcs.hut.fi>

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
e0d1caa7b0d5f02e4f34aa09c695d04251310c6c 25-Jul-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Flow based matching of xfrm policy and state

This implements a seemless mechanism for xfrm policy selection and
state matching based on the flow sid. This also includes the necessary
SELinux enforcement pieces.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b6340fcd761acf9249b3acbc95c4dc555d9beb07 25-Jul-2006 Venkat Yekkirala <vyekkirala@TrustedCS.com> [MLSXFRM]: Add security sid to flowi

This adds security to flow key for labeling of flows as also to allow
for making flow cache lookups based on the security label seemless.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c4ea94ab3710eb2434abe2eab1a479c2dc01f8ac 21-Mar-2006 Steven Whitehouse <steve@chygwyn.com> [DECnet]: Endian annotation and fixes for DECnet.

The typedef for dn_address has been removed in favour of using __le16
or __u16 directly as appropriate. All the DECnet header files are
updated accordingly.

The byte ordering of dn_eth2dn() and dn_dn2eth() are both changed
since just about all their callers wanted network order rather than
host order, so the conversion is now done in the functions themselves.

Several missed endianess conversions have been picked up during the
conversion process. The nh_gw field in struct dn_fib_info has been
changed from a 32 bit field to 16 bits as it ought to be.

One or two cases of using htons rather than dn_htons in the routing
code have been found and fixed.

There are still a few warnings to fix, but this patch deals with the
important cases.

Signed-off-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
df71837d5024e2524cd51c93621e558aa7dd9f3f 14-Dec-2005 Trent Jaeger <tjaeger@cse.psu.edu> [LSM-IPSec]: Security association restriction.

This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets. Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable access control per packets based on
the strongly authenticated IPSec security association. Such access
controls augment the existing ones based on network interface and IP
address. The former are very coarse-grained, and the latter can be
spoofed. By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for resultant
security association (retrieved or negotiated via racoon in
ipsec-tools). This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal. The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using the ipsec-tools. ipsec-tools have
been modified (a separate ipsec-tools patch is available for version
0.5) that supports assignment of xfrm_policy entries and security
associations with security contexts via setkey and the negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts. These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface. Testing of sa functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!