History log of /net/netfilter/nf_conntrack_netlink.c
Revision Date Author Comments
797a7d66d2048fe8a4ac1ba58c5d4752d64b1ac4 21-Jun-2013 Florian Westphal <fw@strlen.de> netfilter: ctnetlink: send event when conntrack label was modified

commit 0ceabd83875b72a29f33db4ab703d6ba40ea4c58
(netfilter: ctnetlink: deliver labels to userspace) sets the event bit
when we raced with another packet, instead of raising the event bit
when the label bit is set for the first time.

commit 9b21f6a90924dfe8e5e686c314ddb441fb06501e
(netfilter: ctnetlink: allow userspace to modify labels) forgot to update
the event mask in the "conntrack already exists" case.

Both issues result in CTA_LABELS attribute not getting included in the
conntrack event.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
e844a928431fa8f1359d1f4f2cef53d9b446bf52 18-Mar-2013 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: allow to dump expectation per master conntrack

This patch adds the ability to dump all existing expectations
per master conntrack.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
b67bfe0d42cac56c512dd5da4b1b347a23f4b70a 28-Feb-2013 Sasha Levin <sasha.levin@oracle.com> hlist: drop the node parameter from iterators

I'm not sure why, but the hlist for each entry iterators were conceived

list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
442fad9423b78319e0019a7f5047eddf3317afbc 12-Feb-2013 Florian Westphal <fw@strlen.de> netfilter: ctnetlink: don't permit ct creation with random tuple

Userspace can cause kernel panic by not specifying orig/reply
tuple: kernel will create a tuple with random stack values.

Problem is that tuple.dst.dir will be random, too, which
causes nf_ct_tuplehash_to_ctrack() to return garbage.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@gnumonks.org>
c14b78e7decd0d1d5add6a4604feb8609fe920a9 05-Feb-2013 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nfnetlink: add mutex per subsystem

This patch replaces the global lock to one lock per subsystem.
The per-subsystem lock avoids that processes operating
with different subsystems are synchronized.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9b21f6a90924dfe8e5e686c314ddb441fb06501e 11-Jan-2013 Florian Westphal <fw@strlen.de> netfilter: ctnetlink: allow userspace to modify labels

Add the ability to set/clear labels assigned to a conntrack
via ctnetlink.

To allow userspace to only alter specific bits, Pablo suggested to add
a new CTA_LABELS_MASK attribute:

The new set of active labels is then determined via

active = (active & ~mask) ^ changeset

i.e., the mask selects those bits in the existing set that should be
changed.

This follows the same method already used by MARK and CONNMARK targets.

Omitting CTA_LABELS_MASK is the same as setting all bits in CTA_LABELS_MASK
to 1: The existing set is replaced by the one from userspace.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
0ceabd83875b72a29f33db4ab703d6ba40ea4c58 11-Jan-2013 Florian Westphal <fw@strlen.de> netfilter: ctnetlink: deliver labels to userspace

Introduce CTA_LABELS attribute to send a bit-vector of currently active labels
to userspace.

Future patch will permit userspace to also set/delete active labels.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
c539f01717c239cfa0921dd43927afc976f1eedc 11-Jan-2013 Florian Westphal <fw@strlen.de> netfilter: add connlabel conntrack extension

similar to connmarks, except labels are bit-based; i.e.
all labels may be attached to a flow at the same time.

Up to 128 labels are supported. Supporting more labels
is possible, but requires increasing the ct offset delta
from u8 to u16 type due to increased extension sizes.

Mapping of bit-identifier to label name is done in userspace.

The extension is enabled at run-time once "-m connlabel" netfilter
rules are added.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
1310b955c804975651dca6c674ebfd1cb2b4c7ff 26-Dec-2012 Jesper Juhl <jj@chaosbits.net> netfilter: ctnetlink: fix leak in error path of ctnetlink_create_expect

This patch fixes a leak in one of the error paths of
ctnetlink_create_expect if no helper and no timeout is specified.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
6d1fafcaecaa2e66eb9861a39d22fc7380ce6f78 22-Nov-2012 Florian Westphal <fw@strlen.de> netfilter: ctnetlink: nla_policy updates

Add stricter checking for a few attributes.
Note that these changes don't fix any bug in the current code base.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
d871befe357ccc262edbb0a4f9aeea650012edf5 27-Nov-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: dump entries from the dying and unconfirmed lists

This patch adds a new operation to dump the content of the dying and
unconfirmed lists.

Under some situations, the global conntrack counter can be inconsistent
with the number of entries that we can dump from the conntrack table.
The way to resolve this is to allow dumping the content of the unconfirmed
and dying lists, so far it was not possible to look at its content.

This provides some extra instrumentation to resolve problematic situations
in which anyone suspects memory leaks.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
04dac0111da7e1d284952cd415162451ffaa094d 27-Nov-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nf_conntrack: improve nf_conn object traceability

This patch modifies the conntrack subsystem so that all existing
allocated conntrack objects can be found in any of the following
places:

* the hash table, this is the typical place for alive conntrack objects.
* the unconfirmed list, this is the place for newly created conntrack objects
that are still traversing the stack.
* the dying list, this is where you can find conntrack objects that are dying
or that should die anytime soon (eg. once the destroy event is delivered to
the conntrackd daemon).

Thus, we make sure that we follow the track for all existing conntrack
objects. This patch, together with some extension of the ctnetlink interface
to dump the content of the dying and unconfirmed lists, will help in case
to debug suspected nf_conn object leaks.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7be54ca4764bdead40bee7b645a72718c20ff2c8 21-Sep-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nf_ct_ftp: add sequence tracking pickup facility for injected entries

This patch allows the FTP helper to pickup the sequence tracking from
the first packet seen. This is useful to fix the breakage of the first
FTP command after the failover while using conntrackd to synchronize
states.

The seq_aft_nl_num field in struct nf_ct_ftp_info has been shrinked to
16-bits (enough for what it does), so we can use the remaining 16-bits
to store the flags while using the same size for the private FTP helper
data.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
c7cbb9173d3c6d41cbfbca451902d66fe6440cbb 11-Sep-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix module auto-load in ctnetlink_parse_nat

(c7232c9 netfilter: add protocol independent NAT core) added
incorrect locking for the module auto-load case in ctnetlink_parse_nat.

That function is always called from ctnetlink_create_conntrack which
requires no locking.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
15e473046cb6e5d18a4d0057e61d76315230382b 07-Sep-2012 Eric W. Biederman <ebiederm@xmission.com> netlink: Rename pid to portid to avoid confusion

It is a frequent mistake to confuse the netlink port identifier with a
process identifier. Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.

I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.

I have successfully built an allyesconfig kernel with this change.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ef6acf68c259d907517dcc0ffefcd4e30276ae29 29-Aug-2012 Julia Lawall <Julia.Lawall@lip6.fr> netfilter: ctnetlink: fix error return code in init path

Initialize return variable before exiting on an error path.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
(
if@p1 (\(ret < 0\|ret != 0\))
{ ... return ret; }
|
ret@p1 = 0
)
... when != ret = e1
when != &ret
*if(...)
{
... when != ret = e2
when forall
return ret;
}

// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
c7232c9979cba684c50b64c513c4a83c9aa70563 26-Aug-2012 Patrick McHardy <kaber@trash.net> netfilter: add protocol independent NAT core

Convert the IPv4 NAT implementation to a protocol independent core and
address family specific modules.

Signed-off-by: Patrick McHardy <kaber@trash.net>
68e035c950dbceaf660144bf74054dfdfb6aad15 14-Aug-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix missing locking while changing conntrack from nfqueue

Since 9cb017665 netfilter: add glue code to integrate nfnetlink_queue and
ctnetlink, we can modify the conntrack entry via nfnl_queue. However, the
change of the conntrack entry via nfnetlink_queue requires appropriate
locking to avoid concurrent updates.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
392025f87a5105c640cf1b4b317c21c14c05a6f9 26-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: add new messages to obtain statistics

This patch adds the following messages to ctnetlink:

IPCTNL_MSG_CT_GET_STATS_CPU
IPCTNL_MSG_CT_GET_STATS
IPCTNL_MSG_EXP_GET_STATS_CPU

To display connection tracking system per-cpu and global statistics.

This provides a replacement for the following /proc interfaces:

/proc/net/stat/nf_conntrack
/proc/sys/net/netfilter/nf_conntrack_count

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8e36c4b5b673edc6081599b8bd461e062e4910f4 23-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix compilation with NF_CONNTRACK_EVENTS=n

This patch fixes compilation with NF_CONNTRACK_EVENTS=n and
NETFILTER_NETLINK_QUEUE_CT=y.

I'm leaving all those static inline functions that calculate the size
of the event message out of the ifdef area of NF_CONNTRACK_EVENTS since
they will not be included by gcc in case they are unused.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
d584a61a931e6cbfef0dd811c4ae0250ec5987f4 20-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nfnetlink_queue: fix compilation with CONFIG_NF_NAT=m and CONFIG_NF_CT_NETLINK=y

LD init/built-in.o
net/built-in.o:(.data+0x4408): undefined reference to `nf_nat_tcp_seq_adjust'
make: *** [vmlinux] Error 1

This patch adds a new pointer hook (nfq_ct_nat_hook) similar to other existing
in Netfilter to solve our complicated configuration dependencies.

Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7c62234547255ce4c385a218915965bc2f14fe45 19-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nfnetlink_queue: fix compilation with NF_CONNTRACK disabled

In "9cb0176 netfilter: add glue code to integrate nfnetlink_queue and ctnetlink"
the compilation with NF_CONNTRACK disabled is broken. This patch fixes this
issue.

I have moved the conntrack part into nfnetlink_queue_ct.c to avoid
peppering the entire nfnetlink_queue.c code with ifdefs.

I also needed to rename nfnetlink_queue.c to nfnetlink_queue_pkt.c
to update the net/netfilter/Makefile to support conditional compilation
of the conntrack integration.

This patch also adds CONFIG_NETFILTER_QUEUE_CT in case you want to explicitly
disable the integration between nf_conntrack and nfnetlink_queue.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
fd7462de461949e36d70f5b0bc17b98c5a00729c 18-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix NULL dereference while trying to change helper

The patch 1afc56794e03: "netfilter: nf_ct_helper: implement variable
length helper private data" from Jun 7, 2012, leads to the following
Smatch complaint:

net/netfilter/nf_conntrack_netlink.c:1231 ctnetlink_change_helper()
error: we previously assumed 'help->helper' could be null (see line 1228)

This NULL dereference can be triggered with the following sequence:

1) attach the helper for first time when the conntrack is created.
2) remove the helper module or detach the helper from the conntrack
via ctnetlink.
3) attach helper again (the same or different one, no matter) to the
that existing conntrack again via ctnetlink.

This patch fixes the problem by removing the use case that allows you
to re-assign again a helper for one conntrack entry via ctnetlink since
I cannot find any practical use for it.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
ae243bee397102c51fbf9db440eca3b077e0e702 07-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: add CTA_HELP_INFO attribute

This attribute can be used to modify and to dump the internal
protocol information.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8c88f87cb27ad09086940bdd3e6955e5325ec89a 07-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled

User-space programs that receive traffic via NFQUEUE may mangle packets.
If NAT is enabled, this usually puzzles sequence tracking, leading to
traffic disruptions.

With this patch, nfnl_queue will make the corresponding NAT TCP sequence
adjustment if:

1) The packet has been mangled,
2) the NFQA_CFG_F_CONNTRACK flag has been set, and
3) NAT is detected.

There are some records on the Internet complaning about this issue:
http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables

By now, we only support TCP since we have no helpers for DCCP or SCTP.
Better to add this if we ever have some helper over those layer 4 protocols.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9cb0176654a7dc33a32af8a0bc9e0b2f9f9ebb0f 07-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: add glue code to integrate nfnetlink_queue and ctnetlink

This patch allows you to include the conntrack information together
with the packet that is sent to user-space via NFQUEUE.

Previously, there was no integration between ctnetlink and
nfnetlink_queue. If you wanted to access conntrack information
from your libnetfilter_queue program, you required to query
ctnetlink from user-space to obtain it. Thus, delaying the packet
processing even more.

Including the conntrack information is optional, you can set it
via NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
1afc56794e03229fa53cfa3c5012704d226e1dec 07-Jun-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nf_ct_helper: implement variable length helper private data

This patch uses the new variable length conntrack extensions.

Instead of using union nf_conntrack_help that contain all the
helper private data information, we allocate variable length
area to store the private helper data.

This patch includes the modification of all existing helpers.
It also includes a couple of include header to avoid compilation
warnings.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9768e1ace458fa4ebf88bc3943fd8fb77113ed9c 02-May-2012 Kelvie Wong <kelvie@ieee.org> netfilter: nf_ct_expect: partially implement ctnetlink_change_expect

This refreshes the "timeout" attribute in existing expectations if one is
given.

The use case for this would be for userspace helpers to extend the lifetime
of the expectation when requested, as this is not possible right now
without deleting/recreating the expectation.

I use this specifically for forwarding DCERPC traffic through:

DCERPC has a port mapper daemon that chooses a (seemingly) random port for
future traffic to go to. We expect this traffic (with a reasonable
timeout), but sometimes the port mapper will tell the client to continue
using the same port. This allows us to extend the expectation accordingly.

Signed-off-by: Kelvie Wong <kelvie@ieee.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
cc1eb43134c07435955263dfe5d2fc980fe8b808 02-Apr-2012 David S. Miller <davem@davemloft.net> nf_conntrack_netlink: Stop using NLA_PUT*().

These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.

Signed-off-by: David S. Miller <davem@davemloft.net>
a16a1647fa6b6783c2e91623e72e86f0c2adac5e 16-Mar-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix race between delete and timeout expiration

Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.

After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.

Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
3b988ece9b42452c59da5844942661cd782b2473 05-Mar-2012 Hans Schillstrom <hans@schillstrom.com> netfilter: ctnetlink: fix lockep splats

net/netfilter/nf_conntrack_proto.c:70 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
3 locks held by conntrack/3235:
nfnl_lock+0x17/0x20
netlink_dump+0x32/0x240
ctnetlink_dump_table+0x3e/0x170 [nf_conntrack_netlink]

stack backtrace:
Pid: 3235, comm: conntrack Tainted: G W 3.2.0+ #511
Call Trace:
[<ffffffff8108ce45>] lockdep_rcu_suspicious+0xe5/0x100
[<ffffffffa00ec6e1>] __nf_ct_l4proto_find+0x81/0xb0 [nf_conntrack]
[<ffffffffa0115675>] ctnetlink_fill_info+0x215/0x5f0 [nf_conntrack_netlink]
[<ffffffffa0115dc1>] ctnetlink_dump_table+0xd1/0x170 [nf_conntrack_netlink]
[<ffffffff815fbdbf>] netlink_dump+0x7f/0x240
[<ffffffff81090f9d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffff815fd34f>] netlink_dump_start+0xdf/0x190
[<ffffffffa0111490>] ? ctnetlink_change_nat_seq_adj+0x160/0x160 [nf_conntrack_netlink]
[<ffffffffa0115cf0>] ? ctnetlink_get_conntrack+0x2a0/0x2a0 [nf_conntrack_netlink]
[<ffffffffa0115ad9>] ctnetlink_get_conntrack+0x89/0x2a0 [nf_conntrack_netlink]
[<ffffffff81603a47>] nfnetlink_rcv_msg+0x467/0x5f0
[<ffffffff81603a7c>] ? nfnetlink_rcv_msg+0x49c/0x5f0
[<ffffffff81603922>] ? nfnetlink_rcv_msg+0x342/0x5f0
[<ffffffff81071b21>] ? get_parent_ip+0x11/0x50
[<ffffffff816035e0>] ? nfnetlink_subsys_register+0x60/0x60
[<ffffffff815fed49>] netlink_rcv_skb+0xa9/0xd0
[<ffffffff81603475>] nfnetlink_rcv+0x15/0x20
[<ffffffff815fe70e>] netlink_unicast+0x1ae/0x1f0
[<ffffffff815fea16>] netlink_sendmsg+0x2c6/0x320
[<ffffffff815b2a87>] sock_sendmsg+0x117/0x130
[<ffffffff81125093>] ? might_fault+0x53/0xb0
[<ffffffff811250dc>] ? might_fault+0x9c/0xb0
[<ffffffff81125093>] ? might_fault+0x53/0xb0
[<ffffffff815b5991>] ? move_addr_to_kernel+0x71/0x80
[<ffffffff815b644e>] sys_sendto+0xfe/0x130
[<ffffffff815b5c94>] ? sys_bind+0xb4/0xd0
[<ffffffff817a8a0e>] ? retint_swapgs+0xe/0x13
[<ffffffff817afcd2>] system_call_fastpath+0x16/0x1b

Reported-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
544d5c7d9f4d1ec4f170bc5bcc522012cb7704bc 05-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: allow to set expectfn for expectations

This patch allows you to set expectfn which is specifically used
by the NAT side of most of the existing conntrack helpers.

I have added a symbol map that uses a string as key to look up for
the function that is attached to the expectation object. This is
the best solution I came out with to solve this issue.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
076a0ca02644657b13e4af363f487ced2942e9cb 05-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: add NAT support for expectations

This patch adds the missing bits to create expectations that
are created in NAT setups.
b8c5e52c13edc99ce192d78c8a7fe2fd626ac643 05-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: allow to set expectation class

This patch allows you to set the expectation class.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
660fdb2a0f5f670da4728d7028d3227296e0226c 05-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: allow to set helper for new expectations

This patch allow you to set the helper for newly created
expectations based of the CTA_EXPECT_HELP_NAME attribute.
Before this, the helper set was NULL.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8be619d1e430fd87a02587a2a6830b692cb91b84 06-Mar-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: remove incorrect spin_[un]lock_bh on NAT module autoload

Since 7d367e0, ctnetlink_new_conntrack is called without holding
the nf_conntrack_lock spinlock. Thus, ctnetlink_parse_nat_setup
does not require to release that spinlock anymore in the NAT module
autoload case.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0f298a285f2e365cb34f69d1f79bb9fc996f683d 24-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: support kernel-space dump filtering by ctmark

This patch adds CTA_MARK_MASK which, together with CTA_MARK, allows
you to selectively send conntrack entries to user-space by
returning those that match mark & mask.

With this, we can save cycles in the building and the parsing of
the entries that may be later on filtered out in user-space by using
the ctmark & mask.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
80d326fab534a5380e8f6e509a0b9076655a9670 24-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> netlink: add netlink_dump_control structure for netlink_dump_start()

Davem considers that the argument list of this interface is getting
out of control. This patch tries to address this issue following
his proposal:

struct netlink_dump_control c = { .dump = dump, .done = done, ... };

netlink_dump_start(..., &c);

Suggested by David S. Miller.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7d367e06688dc7a2cc98c2ace04e1296e1d987e2 24-Feb-2012 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)

Marcell Zambo and Janos Farago noticed and reported that when
new conntrack entries are added via netlink and the conntrack table
gets full, soft lockup happens. This is because the nf_conntrack_lock
is held while nf_conntrack_alloc is called, which is in turn wants
to lock nf_conntrack_lock while evicting entries from the full table.

The patch fixes the soft lockup with limiting the holding of the
nf_conntrack_lock to the minimum, where it's absolutely required.
It required to extend (and thus change) nf_conntrack_hash_insert
so that it makes sure conntrack and ctnetlink do not add the same entry
twice to the conntrack table.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
279072882dc0149e5740dace075e1a49f087046d 24-Feb-2012 Pablo Neira Ayuso <pablo@netfilter.org> Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"

This reverts commit af14cca162ddcdea017b648c21b9b091e4bf1fa4.

This patch contains a race condition between packets and ctnetlink
in the conntrack addition. A new patch to fix this issue follows up.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
af14cca162ddcdea017b648c21b9b091e4bf1fa4 21-Feb-2012 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> netfilter: ctnetlink: fix soft lockup when netlink adds new entries

Marcell Zambo and Janos Farago noticed and reported that when
new conntrack entries are added via netlink and the conntrack table
gets full, soft lockup happens. This is because the nf_conntrack_lock
is held while nf_conntrack_alloc is called, which is in turn wants
to lock nf_conntrack_lock while evicting entries from the full table.

The patch fixes the soft lockup with limiting the holding of the
nf_conntrack_lock to the minimum, where it's absolutely required.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
9bf04646b0b41c5438ed8a27c5f8dbe0ff40d756 15-Jan-2012 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: revert user-space expectation helper support

This patch partially reverts:
3d058d7 netfilter: rework user-space expectation helper support
that was applied during the 3.2 development cycle.

After this patch, the tree remains just like before patch bc01bef,
that initially added the preliminary infrastructure.

I decided to partially revert this patch because the approach
that I proposed to resolve this problem is broken in NAT setups.
Moreover, a new infrastructure will be submitted for the 3.3.x
development cycle that resolve the existing issues while
providing a neat solution.

Since nobody has been seriously using this infrastructure in
user-space, the removal of this feature should affect any know
FOSS project (to my knowledge).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
cf778b00e96df6d64f8e21b8395d1f8a859ecdc7 12-Jan-2012 Eric Dumazet <eric.dumazet@gmail.com> net: reintroduce missing rcu_assign_pointer() calls

commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).

We miss needed barriers, even on x86, when y is not NULL.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c121638277a71c1e1fb44c3e654ea353357bbc2c 30-Dec-2011 Xi Wang <xi.wang@gmail.com> netfilter: ctnetlink: fix timeout calculation

The sanity check (timeout < 0) never works; the dividend is unsigned
and so is the division, which should have been a signed division.

long timeout = (ct->timeout.expires - jiffies) / HZ;
if (timeout < 0)
timeout = 0;

This patch converts the time values to signed for the division.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
1a31a4a8388a90e9240fb4e5e5c9c909fcfdfd0e 24-Dec-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix scheduling while atomic if helper is autoloaded

This patch fixes one scheduling while atomic error:

[ 385.565186] ctnetlink v0.93: registering with nfnetlink.
[ 385.565349] BUG: scheduling while atomic: lt-expect_creat/16163/0x00000200

It can be triggered with utils/expect_create included in
libnetfilter_conntrack if the FTP helper is not loaded.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
81378f728fe560e175fb2e8fd33206793567e896 24-Dec-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix return value of ctnetlink_get_expect()

This fixes one bogus error that is returned to user-space:

libnetfilter_conntrack/utils# ./expect_get
TEST: get expectation (-1)(Unknown error 18446744073709551504)

This patch includes the correct handling for EAGAIN (nfnetlink
uses this error value to restart the operation after module
auto-loading).

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
80e60e67bc4bbfe61b61a344f542af23e16abdbf 24-Dec-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: get and zero operations must be atomic

The get and zero operations have to be done in an atomic context,
otherwise counters added between them will be lost.

This problem was spotted by Changli Gao while discussing the
nfacct infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
cbc9f2f4fcd70d5a627558ca9a881fa9391abf69 23-Dec-2011 Patrick McHardy <kaber@trash.net> netfilter: nf_nat: export NAT definitions to userspace

Export the NAT definitions to userspace. So far userspace (specifically,
iptables) has been copying the headers files from include/net. Also
rename some structures and definitions in preparation for IPv6 NAT.
Since these have never been officially exported, this doesn't affect
existing userspace code.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
3d058d7bc2c5671ae630e0b463be8a69b5783fb9 18-Dec-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: rework user-space expectation helper support

This partially reworks bc01befdcf3e40979eb518085a075cbf0aacede0
which added userspace expectation support.

This patch removes the nf_ct_userspace_expect_list since now we
force to use the new iptables CT target feature to add the helper
extension for conntracks that have attached expectations from
userspace.

A new version of the proof-of-concept code to implement userspace
helpers from userspace is available at:

http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-POC.tar.bz2

This patch also modifies the CT target to allow to set the
conntrack's userspace helper status flags. This flag is used
to tell the conntrack system to explicitly allocate the helper
extension.

This helper extension is useful to link the userspace expectations
with the master conntrack that is being tracked from one userspace
helper.

This feature fixes a problem in the current approach of the
userspace helper support. Basically, if the master conntrack that
has got a userspace expectation vanishes, the expectations point to
one invalid memory address. Thus, triggering an oops in the
expectation deletion event path.

I decided not to add a new revision of the CT target because
I only needed to add a new flag for it. I'll document in this
issue in the iptables manpage. I have also changed the return
value from EINVAL to EOPNOTSUPP if one flag not supported is
specified. Thus, in the future adding new features that only
require a new flag can be added without a new revision.

There is no official code using this in userspace (apart from
the proof-of-concept) that uses this infrastructure but there
will be some by beginning 2012.

Reported-by: Sam Roberts <vieuxtech@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
c4042a339f40fe00d85e31055b1c0808dd025539 14-Dec-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: support individual atomic-get-and-reset of counters

This allows to use the get operation to atomically get-and-reset
counters.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
35dba1d7f3ae669128a42c969d599ab8c604d61d 14-Dec-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: use expect instead of master tuple in get operation

Use the expect tuple (if possible) instead of the master tuple for
the get operation. If two or more expectations come from the same
master, the returned expectation may not be the one that user-space
is requesting.

This is how it works for the expect deletion operation.

Although I think that nobody has been seriously using this. We
accept both possibilities, using the expect tuple if possible.
I decided to do it like this to avoid breaking backward
compatibility.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
b3e0bfa71b1db9d7a9fbea6965867784fd00ca3c 14-Dec-2011 Eric Dumazet <eric.dumazet@gmail.com> netfilter: nf_conntrack: use atomic64 for accounting counters

We can use atomic64_t infrastructure to avoid taking a spinlock in fast
path, and remove inaccuracies while reading values in
ctnetlink_dump_counters() and connbytes_mt() on 32bit arches.

Suggested by Pablo.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
70e9942f17a6193e9172a804e6569a8806633d6b 22-Nov-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nf_conntrack: make event callback registration per-netns

This patch fixes an oops that can be triggered following this recipe:

0) make sure nf_conntrack_netlink and nf_conntrack_ipv4 are loaded.
1) container is started.
2) connect to it via lxc-console.
3) generate some traffic with the container to create some conntrack
entries in its table.
4) stop the container: you hit one oops because the conntrack table
cleanup tries to report the destroy event to user-space but the
per-netns nfnetlink socket has already gone (as the nfnetlink
socket is per-netns but event callback registration is global).

To fix this situation, we make the ctnl_notifier per-netns so the
callback is registered/unregistered if the container is
created/destroyed.

Alex Bligh and Alexey Dobriyan originally proposed one small patch to
check if the nfnetlink socket is gone in nfnetlink_has_listeners,
but this is a very visited path for events, thus, it may reduce
performance and it looks a bit hackish to check for the nfnetlink
socket only to workaround this situation. As a result, I decided
to follow the bigger path choice, which seems to look nicer to me.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
a9b3cd7f323b2e57593e7215362a7b02fc933e3a 01-Aug-2011 Stephen Hemminger <shemminger@vyatta.com> rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER

When assigning a NULL value to an RCU protected pointer, no barrier
is needed. The rcu_assign_pointer, used to handle that but will soon
change to not handle the special case.

Convert all rcu_assign_pointer of NULL value.

//smpl
@@ expression P; @@

- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)

// </smpl>

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c7ac8679bec9397afe8918f788cbcef88c38da54 10-Jun-2011 Greg Rose <gregory.v.rose@intel.com> rtnetlink: Compute and store minimum ifinfo dump size

The message size allocated for rtnl ifinfo dumps was limited to
a single page. This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
315c34dae0069d0c67abd714bb846cd466289c7f 21-Apr-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix timestamp support for new conntracks

This patch fixes the missing initialization of the start time if
the timestamp support is enabled.

libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp 6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
358b1361bed42f4e6cbf8956a73aebf193957d4a 21-Apr-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix timestamp support for new conntracks

This patch fixes the missing initialization of the start time if
the timestamp support is enabled.

libnetfilter_conntrack/utils# conntrack -E &
libnetfilter_conntrack/utils# ./conntrack_create
tcp 6 109 ESTABLISHED src=1.1.1.1 dst=2.2.2.2 sport=1025 dport=21 packets=0 bytes=0 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=21 dport=1025 packets=0 bytes=0 mark=0 delta-time=1303296401 use=2

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
a00f1f3686d6a062b5295c092a9dff059adbdbf5 01-Feb-2011 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: fix ctnetlink_parse_tuple() warning

net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_parse_tuple':
net/netfilter/nf_conntrack_netlink.c:832:11: warning: comparison between 'enum ctattr_tuple' and 'enum ctattr_type'

Use ctattr_type for the 'type' parameter since that's the type of all attributes
passed to this function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
c71caf4114a0e1da3451cc92fba6a152929cd4c2 24-Jan-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix missing refcount increment during dumps

In 13ee6ac netfilter: fix race in conntrack between dump_table and
destroy, we recovered spinlocks to protect the dump of the conntrack
table according to reports from Stephen and acknowledgments on the
issue from Eric.

In that patch, the refcount bump that allows to keep a reference
to the current ct object was removed. However, we still decrement
the refcount for that object in the output path of
ctnetlink_dump_table():

if (last)
nf_ct_put(last)

Cc: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
b8f3ab4290f1e720166e888ea2a1d1d44c4d15dd 18-Jan-2011 David S. Miller <davem@davemloft.net> Revert "netlink: test for all flags of the NLM_F_DUMP composite"

This reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf.

It breaks several things including the avahi daemon.

Signed-off-by: David S. Miller <davem@davemloft.net>
a992ca2a0498edd22a88ac8c41570f536de29c9e 19-Jan-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nf_conntrack_tstamp: add flow-based timestamp extension

This patch adds flow-based timestamping for conntracks. This
conntrack extension is disabled by default. Basically, we use
two 64-bits variables to store the creation timestamp once the
conntrack has been confirmed and the other to store the deletion
time. This extension is disabled by default, to enable it, you
have to:

echo 1 > /proc/sys/net/netfilter/nf_conntrack_timestamp

This patch allows to save memory for user-space flow-based
loogers such as ulogd2. In short, ulogd2 does not need to
keep a hashtable with the conntrack in user-space to know
when they were created and destroyed, instead we use the
kernel timestamp. If we want to have a sane IPFIX implementation
in user-space, this nanosecs resolution timestamps are also
useful. Other custom user-space applications can benefit from
this via libnetfilter_conntrack.

This patch modifies the /proc output to display the delta time
in seconds since the flow start. You can also obtain the
flow-start date by means of the conntrack-tools.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
f31e8d4982653b39fe312f9938be0f49dd9ab5fa 13-Jan-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix loop in ctnetlink_get_conntrack()

This patch fixes a loop in ctnetlink_get_conntrack() that can be
triggered if you use the same socket to receive events and to
perform a GET operation. Under heavy load, netlink_unicast()
may return -EAGAIN, this error code is reserved in nfnetlink for
the module load-on-demand. Instead, we return -ENOBUFS which is
the appropriate error code that has to be propagated to
user-space.

Reported-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
13ee6ac579574a2a95e982b19920fd2495dce8cd 11-Jan-2011 Stephen Hemminger <shemminger@vyatta.com> netfilter: fix race in conntrack between dump_table and destroy

The netlink interface to dump the connection tracking table has a race
when entries are deleted at the same time. A customer reported a crash
and the backtrace showed thatctnetlink_dump_table was running while a
conntrack entry was being destroyed.
(see https://bugzilla.vyatta.com/show_bug.cgi?id=6402).

According to RCU documentation, when using hlist_nulls the reader
must handle the case of seeing a deleted entry and not proceed
further down the linked list. The old code would continue
which caused the scan to walk into the free list.

This patch uses locking (rather than RCU) for this operation which
is guaranteed safe, and no longer requires getting reference while
doing dump operation.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
0ab03c2b1478f2438d2c80204f7fef65b1bca9cf 07-Jan-2011 Jan Engelhardt <jengelh@medozas.de> netlink: test for all flags of the NLM_F_DUMP composite

Due to NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH,
when doing "if (x & NLM_F_DUMP)", it tests for _either_ of the bits
being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL,
non-dump requests with NLM_F_EXCL set are mistaken as dump requests.

Substitute the condition to test for _all_ bits being set.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
cba85b532e4aabdb97f44c18987d45141fd93faa 06-Jan-2011 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: fix export secctx error handling

In 1ae4de0cdf855305765592647025bde55e85e451, the secctx was exported
via the /proc/net/netfilter/nf_conntrack and ctnetlink interfaces
instead of the secmark.

That patch introduced the use of security_secid_to_secctx() which may
return a non-zero value on error.

In one of my setups, I have NF_CONNTRACK_SECMARK enabled but no
security modules. Thus, security_secid_to_secctx() returns a negative
value that results in the breakage of the /proc and `conntrack -L'
outputs. To fix this, we skip the inclusion of secctx if the
aforementioned function fails.

This patch also fixes the dynamic netlink message size calculation
if security_secid_to_secctx() returns an error, since its logic is
also wrong.

This problem exists in Linux kernel >= 2.6.37.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
e5fc9e7a666e5964b60e05903b90aa832354b68c 12-Nov-2010 Changli Gao <xiaosuo@gmail.com> netfilter: nf_conntrack: don't always initialize ct->proto

ct->proto is big(60 bytes) due to structure ip_ct_tcp, and we don't need
to initialize the whole for all the other protocols. This patch moves
proto to the end of structure nf_conn, and pushes the initialization down
to the individual protocols.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
ff660c80d00b52287f1f67ee6c115dc0057bcdde 20-Oct-2010 Eric Paris <eparis@redhat.com> secmark: fix config problem when CONFIG_NF_CONNTRACK_SECMARK is not set

When CONFIG_NF_CONNTRACK_SECMARK is not set we accidentally attempt to use
the secmark fielf of struct nf_conn. Problem is when that config isn't set
the field doesn't exist. whoops. Wrap the incorrect usage in the config.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
1cc63249adfa957b34ca51effdee90ff8261d63f 13-Oct-2010 Eric Paris <eparis@redhat.com> conntrack: export lsm context rather than internal secid via netlink

The conntrack code can export the internal secid to userspace. These are
dynamic, can change on lsm changes, and have no meaning in userspace. We
should instead be sending lsm contexts to userspace instead. This patch sends
the secctx (rather than secid) to userspace over the netlink socket. We use a
new field CTA_SECCTX and stop using the the old CTA_SECMARK field since it did
not send particularly useful information.

Signed-off-by: Eric Paris <eparis@redhat.com>
Reviewed-by: Paul Moore <paul.moore@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: James Morris <jmorris@namei.org>
ebbf41df4aabb6d506fa18ea8cb4c2b4388a18b9 19-Oct-2010 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: add expectation deletion events

This patch allows to listen to events that inform about
expectations destroyed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
bc01befdcf3e40979eb518085a075cbf0aacede0 28-Sep-2010 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: add support for user-space expectation helpers

This patch adds the basic infrastructure to support user-space
expectation helpers via ctnetlink and the netfilter queuing
infrastructure NFQUEUE. Basically, this patch:

* adds NF_CT_EXPECT_USERSPACE flag to identify user-space
created expectations. I have also added a sanity check in
__nf_ct_expect_check() to avoid that kernel-space helpers
may create an expectation if the master conntrack has no
helper assigned.
* adds some branches to check if the master conntrack helper
exists, otherwise we skip the code that refers to kernel-space
helper such as the local expectation list and the expectation
policy.
* allows to set the timeout for user-space expectations with
no helper assigned.
* a list of expectations created from user-space that depends
on ctnetlink (if this module is removed, they are deleted).
* includes USERSPACE in the /proc output for expectations
that have been created by a user-space helper.

This patch also modifies ctnetlink to skip including the helper
name in the Netlink messages if no kernel-space helper is set
(since no user-space expectation has not kernel-space kernel
assigned).

You can access an example user-space FTP conntrack helper at:
http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-userspace-POC.tar.bz

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
8b008faf92ac8f7eeb65e8cd36077601af7c46db 22-Sep-2010 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: allow to specify the expectation flags

With this patch, you can specify the expectation flags for user-space
created expectations.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
bcac0dfab191cb53b3f9b43c8014a34070ed58ff 22-Sep-2010 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: missing validation of CTA_EXPECT_ZONE attribute

This patch adds the missing validation of the CTA_EXPECT_ZONE
attribute in the ctnetlink code.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
5bfddbd46a95c978f4d3c992339cbdf4f4b790a3 08-Jun-2010 Eric Dumazet <eric.dumazet@gmail.com> netfilter: nf_conntrack: IPS_UNTRACKED bit

NOTRACK makes all cpus share a cache line on nf_conntrack_untracked
twice per packet. This is bad for performance.
__read_mostly annotation is also a bad choice.

This patch introduces IPS_UNTRACKED bit so that we can use later a
per_cpu untrack structure more easily.

A new helper, nf_ct_untracked_get() returns a pointer to
nf_conntrack_untracked.

Another one, nf_ct_untracked_status_or() is used by nf_nat_init() to add
IPS_NAT_DONE_MASK bits to untracked status.

nf_ct_is_untracked() prototype is changed to work on a nf_conn pointer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
736d58e3a2245ac2779fe0f278f8735bcf33ca8d 13-May-2010 Joe Perches <joe@perches.com> netfilter: remove unnecessary returns from void function()s

This patch removes from net/ netfilter files
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches <joe@perches.com>
[Patrick: changed to keep return statements in otherwise empty function bodies]
Signed-off-by: Patrick McHardy <kaber@trash.net>
654d0fbdc8fe1041918741ed5b6abc8ad6b4c1d8 13-May-2010 Stephen Hemminger <shemminger@vyatta.com> netfilter: cleanup printk messages

Make sure all printk messages have a severity level.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
d26e6a02835affa8bafe09a51e37f9fbc339e415 01-Apr-2010 Jiri Pirko <jpirko@redhat.com> netfilter: ctnetlink: compute message size properly

Message size should be dependent on the presence of an accounting
extension, not on CONFIG_NF_CT_ACCT definition.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
37b7ef7203240b3aba577bb1ff6765fe15225976 16-Mar-2010 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix reliable event delivery if message building fails

This patch fixes a bug that allows to lose events when reliable
event delivery mode is used, ie. if NETLINK_BROADCAST_SEND_ERROR
and NETLINK_RECV_NO_ENOBUFS socket options are set.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
a88e22adf5aad79b6e2ddd1bf0109c2ba8b46b0e 19-Feb-2010 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix creation of conntrack with helpers

This patch fixes a bug that triggers an assertion if you create
a conntrack entry with a helper and netfilter debugging is enabled.
Basically, we hit the assertion because the confirmation flag is
set before the conntrack extensions are added. To fix this, we
move the extension addition before the aforementioned flag is
set.

This patch also removes the possibility of setting a helper for
existing conntracks. This operation would also trigger the
assertion since we are not allowed to add new extensions for
existing conntracks. We know noone that could benefit from
this operation sanely.

Thanks to Eric Dumazet for initial posting a preliminary patch
to address this issue.

Reported-by: David Ramblewski <David.Ramblewski@atosorigin.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
ef00f89f1eb7e056aab9dfe068521e6f2320c94a 15-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: add zone support

Parse and dump the conntrack zone in ctnetlink.

Signed-off-by: Patrick McHardy <kaber@trash.net>
5d0aa2ccd4699a01cfdf14886191c249d7b45a01 15-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: add support for "conntrack zones"

Normally, each connection needs a unique identity. Conntrack zones allow
to specify a numerical zone using the CT target, connections in different
zones can use the same identity.

Example:

iptables -t raw -A PREROUTING -i veth0 -j CT --zone 1
iptables -t raw -A OUTPUT -o veth1 -j CT --zone 1

Signed-off-by: Patrick McHardy <kaber@trash.net>
d1e7a03f4fee4059ee3fa7ce0edb7c48c1a75fcf 11-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: dump expectation helper name

Signed-off-by: Patrick McHardy <kaber@trash.net>
d0b0268fddea3235a8df35e52167c3b206bf2f5a 10-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: add missing netlink attribute policies

Signed-off-by: Patrick McHardy <kaber@trash.net>
d696c7bdaa55e2208e56c6f98e6bc1599f34286d 08-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: fix hash resizing with namespaces

As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
size is global and not per namespace, but modifiable at runtime through
/sys/module/nf_conntrack/hashsize. Changing the hash size will only
resize the hash in the current namespace however, so other namespaces
will use an invalid hash size. This can cause crashes when enlarging
the hashsize, or false negative lookups when shrinking it.

Move the hash size into the per-namespace data and only use the global
hash size to initialize the per-namespace value when instanciating a
new namespace. Additionally restrict hash resizing to init_net for
now as other namespaces are not handled currently.

Cc: stable@kernel.org
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9ab48ddcb144fdee908708669448dd136cf4894a 08-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: fix hash resizing with namespaces

As noticed by Jon Masters <jonathan@jonmasters.org>, the conntrack hash
size is global and not per namespace, but modifiable at runtime through
/sys/module/nf_conntrack/hashsize. Changing the hash size will only
resize the hash in the current namespace however, so other namespaces
will use an invalid hash size. This can cause crashes when enlarging
the hashsize, or false negative lookups when shrinking it.

Move the hash size into the per-namespace data and only use the global
hash size to initialize the per-namespace value when instanciating a
new namespace. Additionally restrict hash resizing to init_net for
now as other namespaces are not handled currently.

Cc: stable@kernel.org
Signed-off-by: Patrick McHardy <kaber@trash.net>
b2a15a604d379af323645e330638e2cfcc696aff 03-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: support conntrack templates

Support initializing selected parameters of new conntrack entries from a
"conntrack template", which is a specially marked conntrack entry attached
to the skb.

Currently the helper and the event delivery masks can be initialized this
way.

Signed-off-by: Patrick McHardy <kaber@trash.net>
0cebe4b4163b6373c9d24c1a192939777bc27e55 03-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: support selective event delivery

Add two masks for conntrack end expectation events to struct nf_conntrack_ecache
and use them to filter events. Their default value is "all events" when the
event sysctl is on and "no events" when it is off. A following patch will add
specific initializations. Expectation events depend on the ecache struct of
their master conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
858b31330054a9ad259feceea0ad1ce5385c47f0 03-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: split up IPCT_STATUS event

Split up the IPCT_STATUS event into an IPCT_REPLY event, which is generated
when the IPS_SEEN_REPLY bit is set, and an IPCT_ASSURED event, which is
generated when the IPS_ASSURED bit is set.

In combination with a following patch to support selective event delivery,
this can be used for "sparse" conntrack replication: start replicating the
conntrack entry after it reached the ASSURED state and that way it's SYN-flood
resistant.

Signed-off-by: Patrick McHardy <kaber@trash.net>
794e68716bab578ae8f8912dc934496d7c7abc90 03-Feb-2010 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: only assign helpers for matching protocols

Make sure not to assign a helper for a different network or transport
layer protocol to a connection.

Additionally change expectation deletion by helper to compare the name
directly - there might be multiple helper registrations using the same
name, currently one of them is chosen in an unpredictable manner and
only those expectations are removed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
e578756c35859a459d78d8416195bc5f5ff897d0 26-Jan-2010 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: fix expectation mask dump

The protocol number is not initialized, so userspace can't interpret
the layer 4 data properly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
9592a5c01e79dbc59eb56fa26b124e94ffcd0962 13-Jan-2010 Alexey Dobriyan <adobriyan@gmail.com> netfilter: ctnetlink: netns support

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
cd8c20b650f49354722b8cc1f03320b004815a0a 13-Jan-2010 Alexey Dobriyan <adobriyan@gmail.com> netfilter: nfnetlink: netns support

Make nfnl socket per-petns.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
3993832464dd4e14a4c926583a11f0fa92c1f0f0 25-Aug-2009 Patrick McHardy <kaber@trash.net> netfilter: nfnetlink: constify message attributes and headers

Signed-off-by: Patrick McHardy <kaber@trash.net>
dd7669a92c6066b2b31bae7e04cd787092920883 13-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: conntrack: optional reliable conntrack event delivery

This patch improves ctnetlink event reliability if one broadcast
listener has set the NETLINK_BROADCAST_ERROR socket option.

The logic is the following: if an event delivery fails, we keep
the undelivered events in the missed event cache. Once the next
packet arrives, we add the new events (if any) to the missed
events in the cache and we try a new delivery, and so on. Thus,
if ctnetlink fails to deliver an event, we try to deliver them
once we see a new packet. Therefore, we may lose state
transitions but the userspace process gets in sync at some point.

At worst case, if no events were delivered to userspace, we make
sure that destroy events are successfully delivered. Basically,
if ctnetlink fails to deliver the destroy event, we remove the
conntrack entry from the hashes and we insert them in the dying
list, which contains inactive entries. Then, the conntrack timer
is added with an extra grace timeout of random32() % 15 seconds
to trigger the event again (this grace timeout is tunable via
/proc). The use of a limited random timeout value allows
distributing the "destroy" resends, thus, avoiding accumulating
lots "destroy" events at the same time. Event delivery may
re-order but we can identify them by means of the tuple plus
the conntrack ID.

The maximum number of conntrack entries (active or inactive) is
still handled by nf_conntrack_max. Thus, we may start dropping
packets at some point if we accumulate a lot of inactive conntrack
entries that did not successfully report the destroy event to
userspace.

During my stress tests consisting of setting a very small buffer
of 2048 bytes for conntrackd and the NETLINK_BROADCAST_ERROR socket
flag, and generating lots of very small connections, I noticed
very few destroy entries on the fly waiting to be resend.

A simple way to test this patch consist of creating a lot of
entries, set a very small Netlink buffer in conntrackd (+ a patch
which is not in the git tree to set the BROADCAST_ERROR flag)
and invoke `conntrack -F'.

For expectations, no changes are introduced in this patch.
Currently, event delivery is only done for new expectations (no
events from expectation expiration, removal and confirmation).
In that case, they need a per-expectation event cache to implement
the same idea that is exposed in this patch.

This patch can be useful to provide reliable flow-accouting. We
still have to add a new conntrack extension to store the creation
and destroy time.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
a0891aa6a635f658f29bb061a00d6d3486941519 13-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: conntrack: move event caching to conntrack extension infrastructure

This patch reworks the per-cpu event caching to use the conntrack
extension infrastructure.

The main drawback is that we consume more memory per conntrack
if event delivery is enabled. This patch is required by the
reliable event delivery that follows to this patch.

BTW, this patch allows you to enable/disable event delivery via
/proc/sys/net/netfilter/nf_conntrack_events in runtime, although
you can still disable event caching as compilation option.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
440f0d588555892601cfe511728a0fc0c8204063 10-Jun-2009 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: use per-conntrack locks for protocol data

Introduce per-conntrack locks and use them instead of the global protocol
locks to avoid contention. Especially tcp_lock shows up very high in
profiles on larger machines.

This will also allow to simplify the upcoming reliable event delivery patches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
e34d5c1a4f9919a81b4ea4591d7383245f35cb8e 03-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: conntrack: replace notify chain by function pointer

This patch removes the notify chain infrastructure and replace it
by a simple function pointer. This issue has been mentioned in the
mailing list several times: the use of the notify chain adds
too much overhead for something that is only used by ctnetlink.

This patch also changes nfnetlink_send(). It seems that gfp_any()
returns GFP_KERNEL for user-context request, like those via
ctnetlink, inside the RCU read-side section which is not valid.
Using GFP_KERNEL is also evil since netlink may schedule(),
this leads to "scheduling while atomic" bug reports.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
17e6e4eac070607a35464ea7e2c5eceac32e5eca 02-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: conntrack: simplify event caching system

This patch simplifies the conntrack event caching system by removing
several events:

* IPCT_[*]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted
since the have no clients.
* IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter
days.
* IPCT_REFRESH which is not of any use since we always include the
timeout in the messages.

After this patch, the existing events are:

* IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify
addition and deletion of entries.
* IPCT_STATUS, that notes that the status bits have changes,
eg. IPS_SEEN_REPLY and IPS_ASSURED.
* IPCT_PROTOINFO, that reports that internal protocol information has
changed, eg. the TCP, DCCP and SCTP protocol state.
* IPCT_HELPER, that a helper has been assigned or unassigned to this
entry.
* IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this
covers the case when a mark is set to zero.
* IPCT_NATSEQADJ, to report that there's updates in the NAT sequence
adjustment.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
274d383b9c1906847a64bbb267b0183599ce86a0 02-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: conntrack: don't report events on module removal

During the module removal there are no possible event listeners
since ctnetlink must be removed before to allow removing
nf_conntrack. This patch removes the event reporting for the
module removal case which is not of any use in the existing code.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
03b64f518a893512d32f07a10b053e558beafcaf 02-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: cleanup message-size calculation

This patch cleans up the message calculation to make it similar
to rtnetlink, moreover, it removes unneeded verbose information.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
96bcf938dc9637e8bb8b2c5d7885d72af5cd10af 02-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: use nlmsg_* helper function to build messages

Replaces the old macros to build Netlink messages with the
new nlmsg_*() helper functions.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
f2f3e38c63c58a3d39bd710039af8bbd15ecaff6 02-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: rename tuple() by nf_ct_tuple() macro definition

This patch move the internal tuple() macro definition to the
header file as nf_ct_tuple().

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
8b0a231d4d6336baf10f13b6142fd5c1f628247e 02-Jun-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: remove nowait parameter from *fill_info()

This patch is a cleanup, it removes the `nowait' parameter
from all *fill_info() function since it is always set to one.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
fecc1133b66af6e0cd49115a248f34bbb01f180a 05-May-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix wrong message type in user updates

This patch fixes the wrong message type that are triggered by
user updates, the following commands:

(term1)# conntrack -I -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state LISTEN
(term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_SENT
(term1)# conntrack -U -p tcp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 --state SYN_RECV

only trigger event message of type NEW, when only the first is NEW
while others should be UPDATE.

(term2)# conntrack -E
[NEW] tcp 6 10 LISTEN src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
[NEW] tcp 6 10 SYN_SENT src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0
[NEW] tcp 6 10 SYN_RECV src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0

This patch also removes IPCT_REFRESH from the bitmask since it is
not of any use.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
29fe1b481283a1bada994a69f65736db4ae6f35f 22-Apr-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix gcc warning during compilation

This patch fixes a (bogus?) gcc warning during compilation:

net/netfilter/nf_conntrack_netlink.c:1234: warning: 'helpname' may be used uninitialized in this function
net/netfilter/nf_conntrack_netlink.c:991: warning: 'helpname' may be used uninitialized in this function

In fact, helpname is initialized by ctnetlink_parse_help() so
I cannot see a way to use it without being initialized.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
150ace0db360373d2016a2497d252138a59c5ba8 17-Apr-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: report error if event message allocation fails

This patch fixes an inconsistency that results in no error reports
to user-space listeners if we fail to allocate the event message.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
d271e8bd8c60ce059ee36d836ba063cfc61c3e21 26-Mar-2009 Holger Eitzenberger <holger@eitzenberger.org> ctnetlink: compute generic part of event more acurately

On a box with most of the optional Netfilter switches turned off some
of the NLAs are never send, e. g. secmark, mark or the conntrack
byte/packet counters. As a worst case scenario this may possibly
still lead to ctnetlink skbs being reallocated in netlink_trim()
later, loosing all the nice effects from the previous patches.

I try to solve that (at least partly) by correctly #ifdef'ing the
NLAs in the computation.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2732c4e45bb67006fdc9ae6669be866762711ab5 25-Mar-2009 Holger Eitzenberger <holger@eitzenberger.org> netfilter: ctnetlink: allocate right-sized ctnetlink skb

Try to allocate a Netlink skb roughly the size of the actual
message, with the help from the l3 and l4 protocol helpers.
This is all to prevent a reallocation in netlink_trim() later.

The overhead of allocating the right-sized skb is rather small, with
ctnetlink_alloc_skb() actually being inlined away on my x86_64 box.
The size of the per-proto space is determined at registration time of
the protocol helper.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
ea781f197d6a835cbb93a0bf88ee1696296ed8aa 25-Mar-2009 Eric Dumazet <dada1@cosmosbay.com> netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()

Use "hlist_nulls" infrastructure we added in 2.6.29 for RCUification of UDP & TCP.

This permits an easy conversion from call_rcu() based hash lists to a
SLAB_DESTROY_BY_RCU one.

Avoiding call_rcu() delay at nf_conn freeing time has numerous gains.

First, it doesnt fill RCU queues (up to 10000 elements per cpu).
This reduces OOM possibility, if queued elements are not taken into account
This reduces latency problems when RCU queue size hits hilimit and triggers
emergency mode.

- It allows fast reuse of just freed elements, permitting better use of
CPU cache.

- We delete rcu_head from "struct nf_conn", shrinking size of this structure
by 8 or 16 bytes.

This patch only takes care of "struct nf_conn".
call_rcu() is still used for less critical conntrack parts, that may
be converted later if necessary.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
dd5b6ce6fd465eab90357711c8e8124dc3a31ff0 23-Mar-2009 Pablo Neira Ayuso <pablo@netfilter.org> nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlink

This patch adds nfnetlink_set_err() to propagate the error to netlink
broadcast listener in case of memory allocation errors in the
message building.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
0f5b3e85a3716efebb0150ebb7c6d022e2bf17d7 18-Mar-2009 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: fix rcu context imbalance

Introduced by 7ec47496 (netfilter: ctnetlink: cleanup master conntrack assignation):

net/netfilter/nf_conntrack_netlink.c:1275:2: warning: context imbalance in 'ctnetlink_create_conntrack' - different lock contexts for basic block

Signed-off-by: Patrick McHardy <kaber@trash.net>
cd91566e4bdbcb8841385e4b2eacc8d0c29c9208 18-Mar-2009 Florian Westphal <fw@strlen.de> netfilter: ctnetlink: remove remaining module refcounting

Convert the remaining refcount users.

As pointed out by Patrick McHardy, the protocols can be accessed safely using RCU.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
626ba8fbac9156a94a80be46ffd2f2ce9e4e89a0 16-Mar-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix crash during expectation creation

This patch fixes a possible crash due to the missing initialization
of the expectation class when nf_ct_expect_related() is called.

Reported-by: BORBELY Zoltan <bozo@andrews.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
f0a3c0869f3b0ef93d9df044e9a41e40086d4c97 16-Mar-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: move event reporting for new entries outside the lock

This patch moves the event reporting outside the lock section. With
this patch, the creation and update of entries is homogeneous from
the event reporting perspective. Moreover, as the event reporting is
done outside the lock section, the netlink broadcast delivery can
benefit of the yield() call under congestion.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
e098360f159b3358f085543eb6dc2eb500d6667c 16-Mar-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: cleanup conntrack update preliminary checkings

This patch moves the preliminary checkings that must be fulfilled
to update a conntrack, which are the following:

* NAT manglings cannot be updated
* Changing the master conntrack is not allowed.

This patch is a cleanup.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
7ec4749675bf33ea639bbcca8a5365ccc5091a6a 16-Mar-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: cleanup master conntrack assignation

This patch moves the assignation of the master conntrack to
ctnetlink_create_conntrack(), which is where it really belongs.
This patch is a cleanup.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
1f9da256163e3ff91a12d0b861091f0e525139df 09-Feb-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix echo if not subscribed to any multicast group

This patch fixes echoing if the socket that has sent the request to
create/update/delete an entry is not subscribed to any multicast
group. With the current code, ctnetlink would not send the echo
message via unicast as nfnetlink_send() would be skip.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c969aa7d2cd5621ad4129dae6b6551af422944c6 09-Feb-2009 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: allow changing NAT sequence adjustment in creation

This patch fixes an inconsistency in the current ctnetlink code
since NAT sequence adjustment bit can only be updated but not set
in the conntrack entry creation.

This patch is used by conntrackd to successfully recover newly
created entries that represent connections with helpers and NAT
payload mangling.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
748085fcbedbf7b0f38d95e178265d7b13360b44 21-Jan-2009 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: fix scheduling while atomic

Caused by call to request_module() while holding nf_conntrack_lock.

Reported-and-tested-by: Kövesdi György <kgy@teledigit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
cd7fcbf1cb6933bfb9171452b4a370c92923544d 12-Jan-2009 Julia Lawall <julia@diku.dk> netfilter 07/09: simplify nf_conntrack_alloc() error handling

nf_conntrack_alloc cannot return NULL, so there is no need to check for
NULL before using the value. I have also removed the initialization of ct
to NULL in nf_conntrack_alloc, since the value is never used, and since
perhaps it might lead one to think that return ct at the end might return
NULL.

The semantic patch that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@match exists@
expression x, E;
position p1,p2;
statement S1, S2;
@@

x@p1 = nf_conntrack_alloc(...)
... when != x = E
(
if (x@p2 == NULL || ...) S1 else S2
|
if (x@p2 == NULL && ...) S1 else S2
)

@other_match exists@
expression match.x, E1, E2;
position p1!=match.p1,match.p2;
@@

x@p1 = E1
... when != x = E2
x@p2

@ script:python depends on !other_match@
p1 << match.p1;
p2 << match.p2;
@@

print "%s: call to nf_conntrack_alloc %s bad test %s" % (p1[0].file,p1[0].line,p2[0].line)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3ec192559033ed457f0d7856838654c100fc659f 26-Nov-2008 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: fix GFP_KERNEL allocation under spinlock

The previous fix for the conntrack creation race (netfilter: ctnetlink:
fix conntrack creation race) missed a GFP_KERNEL allocation that is
now performed while holding a spinlock. Switch to GFP_ATOMIC.

Reported-and-tested-by: Zoltan Borbely <bozo@andrews.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b54ad409fd09a395b839fb81f300880d76861c0e 25-Nov-2008 Patrick McHardy <kaber@trash.net> netfilter: ctnetlink: fix conntrack creation race

Conntrack creation through ctnetlink has two races:

- the timer may expire and free the conntrack concurrently, causing an
invalid memory access when attempting to put it in the hash tables

- an identical conntrack entry may be created in the packet processing
path in the time between the lookup and hash insertion

Hold the conntrack lock between the lookup and insertion to avoid this.

Reported-by: Zoltan Borbely <bozo@andrews.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
19abb7b090a6bce88d4e9b2914a0367f4f684432 18-Nov-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: deliver events for conntracks changed from userspace

As for now, the creation and update of conntracks via ctnetlink do not
propagate an event to userspace. This can result in inconsistent situations
if several userspace processes modify the connection tracking table by means
of ctnetlink at the same time. Specifically, using the conntrack command
line tool and conntrackd at the same time can trigger unconsistencies.

This patch also modifies the event cache infrastructure to pass the
process PID and the ECHO flag to nfnetlink_send() to report back
to userspace if the process that triggered the change needs so.
Based on a suggestion from Patrick McHardy.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
226c0c0ef2abdf91b8d9cce1aaf7d4635a5e5926 18-Nov-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: helper modules load-on-demand support

This patch adds module loading for helpers via ctnetlink.

* Creation path: We support explicit and implicit helper assignation. For
the explicit case, we try to load the module. If the module is correctly
loaded and the helper is present, we return EAGAIN to re-start the
creation. Otherwise, we return EOPNOTSUPP.
* Update path: release the spin lock, load the module and check. If it is
present, then return EAGAIN to re-start the update.

This patch provides a refactorized function to lookup-and-set the
connection tracking helper. The function removes the exported symbol
__nf_ct_helper_find as it has not clients anymore.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
528a3a6f67d4fbe708b9f306be194e78b29e8d7a 17-Nov-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: get rid of module refcounting in ctnetlink

This patch replaces the unnecessary module refcounting with
the read-side locks. With this patch, all the dump and fill_info
function are called under the RCU read lock.

Based on a patch from Fabian Hugelshofer.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
bfe2967735e0e0f650bf698a5683db2b6cf4cfd7 17-Nov-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: use EOPNOTSUPP instead of EINVAL if the conntrack has no helper

This patch changes the return value if the conntrack has no helper assigned.
Instead of EINVAL, which is reserved for malformed messages, it returns
EOPNOTSUPP.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
238ede8160443a32379fd8f9eb88d00456a09bb4 17-Nov-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: use nf_conntrack_get instead of atomic_inc

Use nf_conntrack_get instead of the direct call to atomic_inc.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
67671841dfb82df7a60c46e6fefe813cf57805ff 20-Oct-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: fix compilation error with NAT=n

This patch fixes the compilation of ctnetlink when the NAT support
is not enabled.

/home/benh/kernels/linux-powerpc/net/netfilter/nf_conntrack_netlink.c:819: warning: enum nf_nat_manip_type\u2019 declared inside parameter list
/home/benh/kernels/linux-powerpc/net/netfilter/nf_conntrack_netlink.c:819: warning: its scope is only this definition or declaration, which is probably not what you want

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
95a5afca4a8d2e1cb77e1d4bc6ff9f718dc32f7a 17-Oct-2008 Johannes Berg <johannes@sipsolutions.net> net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely)

Some code here depends on CONFIG_KMOD to not try to load
protocol modules or similar, replace by CONFIG_MODULES
where more than just request_module depends on CONFIG_KMOD
and and also use try_then_request_module in ebtables.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
e6a7d3c04f8fe49099521e6dc9a46b0272381f2f 14-Oct-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: remove bogus module dependency between ctnetlink and nf_nat

This patch removes the module dependency between ctnetlink and
nf_nat by means of an indirect call that is initialized when
nf_nat is loaded. Now, nf_conntrack_netlink only requires
nf_conntrack and nfnetlink.

This patch puts nfnetlink_parse_nat_setup_hook into the
nf_conntrack_core to avoid dependencies between ctnetlink,
nf_conntrack_ipv4 and nf_conntrack_ipv6.

This patch also introduces the function ctnetlink_change_nat
that is only invoked from the creation path. Actually, the
nat handling cannot be invoked from the update path since
this is not allowed. By introducing this function, we remove
the useless nat handling in the update path and we avoid
deadlock-prone code.

This patch also adds the required EAGAIN logic for nfnetlink.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9b03f38d0487f3908696242286d934c9b38f9d2a 08-Oct-2008 Alexey Dobriyan <adobriyan@gmail.com> netfilter: netns nf_conntrack: per-netns expectations

Make per-netns a) expectation hash and b) expectations count.

Expectations always belongs to netns to which it's master conntrack belong.
This is natural and doesn't bloat expectation.

Proc files and leaf users are stubbed to init_net, this is temporary.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
400dad39d1c33fe797e47326d87a3f54d0ac5181 08-Oct-2008 Alexey Dobriyan <adobriyan@gmail.com> netfilter: netns nf_conntrack: per-netns conntrack hash

* make per-netns conntrack hash

Other solution is to add ->ct_net pointer to tuplehashes and still has one
hash, I tried that it's ugly and requires more code deep down in protocol
modules et al.

* propagate netns pointer to where needed, e. g. to conntrack iterators.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
5a1fb391d881905e89623d78858d05b248cbc86a 08-Oct-2008 Alexey Dobriyan <adobriyan@gmail.com> netfilter: netns nf_conntrack: add ->ct_net -- pointer from conntrack to netns

Conntrack (struct nf_conn) gets pointer to netns: ->ct_net -- netns in which
it was created. It comes from netdevice.

->ct_net is write-once field.

Every conntrack in system has ->ct_net initialized, no exceptions.

->ct_net doesn't pin netns: conntracks are recycled after timeouts and
pinning background traffic will prevent netns from even starting shutdown
sequence.

Right now every conntrack is created in init_net.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
fab00c5d15091546be681426c60b2ed2c10513bf 19-Aug-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: sleepable allocation with spin lock bh

This patch removes a GFP_KERNEL allocation while holding a spin lock with
bottom halves disabled in ctnetlink_change_helper().

This problem was introduced in 2.6.23 with the netfilter extension
infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
cb1cb5c47457ff2b604dac2da44cab4d39d11459 19-Aug-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix sleep in read-side lock section

Fix allocation with GFP_KERNEL in ctnetlink_create_conntrack() under
read-side lock sections.

This problem was introduced in 2.6.25.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
1575e7ea018fec992b94a12a1a491ce693ae9eac 19-Aug-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: fix double helper assignation for NAT'ed conntracks

If we create a conntrack that has NAT handlings and a helper, the helper
is assigned twice. This happens because nf_nat_setup_info() - via
nf_conntrack_alter_reply() - sets the helper before ctnetlink, which
indeed does not check if the conntrack already has a helper as it thinks that
it is a brand new conntrack.

The fix moves the helper assignation before the set of the status flags.
This avoids a bogus assertion in __nf_ct_ext_add (if netfilter assertions are
enabled) which checks that the conntrack must not be confirmed.

This problem was introduced in 2.6.23 with the netfilter extension
infrastructure.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
584015727a3b88b46602b20077b46cd04f8b4ab3 21-Jul-2008 Krzysztof Piotr Oledzki <ole@ans.pl> netfilter: accounting rework: ct_extend + 64bit counters (v4)

Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.

This patch:
- reimplements accounting with respect to the extension infrastructure,
- makes one global version of seq_print_acct() instead of two seq_print_counters(),
- makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
- makes it possible to enable/disable it at runtime by sysctl or sysfs,
- extends counters from 32bit to 64bit,
- renames ip_conntrack_counter -> nf_conn_counter,
- enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
- set initial accounting enable state based on CONFIG_NF_CT_ACCT
- removes buggy IPCT_COUNTER_FILLING event handling.

If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b891c5a831b13f74989dcbd7b39d04537b2a05d9 08-Jul-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: nf_conntrack: add allocation flag to nf_conntrack_alloc

ctnetlink does not need to allocate the conntrack entries with GFP_ATOMIC
as its code is executed in user context.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
e57dce60c7478fdeeb9a1ebd311261ec901afe4d 10-Jun-2008 Fabian Hugelshofer <hugelshofer2006@gmx.ch> netfilter: ctnetlink: include conntrack status in destroy event message

When a conntrack is destroyed, the connection status does not get
exported to netlink. I don't see a reason for not doing so. This patch
exports the status on all conntrack events.

Signed-off-by: Fabian Hugelshofer <hugelshofer2006@gmx.ch>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
51091764f26ec36c02e35166f083193a30f426fc 10-Jun-2008 Patrick McHardy <kaber@trash.net> netfilter: nf_conntrack: add nf_ct_kill()

Encapsulate the common

if (del_timer(&ct->timeout))
ct->timeout.function((unsigned long)ct)

sequence in a new function.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
0adf9d67489cd30bab8eb93f7de81a674e44e1c3 10-Jun-2008 Pablo Neira Ayuso <pablo@netfilter.org> netfilter: ctnetlink: group errors into logical errno sets

This patch groups ctnetlink errors into three logical sets:

* Malformed messages: if ctnetlink receives a message without some mandatory
attribute, then it returns EINVAL.
* Unsupported operations: if userspace tries to perform an unsupported
operation, then it returns EOPNOTSUPP.
* Unchangeable: if userspace tries to change some attribute of the
conntrack object that can only be set once, then it returns EBUSY.

This patch reduces the number of -EINVAL from 23 to 14 and it results in
5 -EBUSY and 6 -EOPNOTSUPP.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
711bbdd659b685b45d3f28b29a00f17be6484f38 17-May-2008 Ingo Molnar <mingo@elte.hu> rculist.h: fix include in net/netfilter/nf_conntrack_netlink.c

this file has rculist dependency but did not explicitly include it,
which broke the build.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
1eedf69993d4016428fd99ffd619e73b374be3c1 14-May-2008 Eric Leblond <eric@inl.fr> netfilter: ctnetlink: dump conntrack ID in event messages

Conntrack ID is not put (anymore ?) in event messages. This causes
current ulogd2 code to fail because it uses the ID to build a hash in
userspace. This hash is used to be able to output the starting time of
a connection.

Conntrack ID can be used in userspace application to maintain an easy
match between kernel connections list and userspace one. It may worth
to add it if there is no performance related issue.

[ Patrick: it was never included in events, but really should be ]

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5e8fbe2ac8a3f1e34e7004c5750ef59bf9304f82 14-Apr-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: add tuplehash l3num/protonum accessors

Add accessors for l3num and protonum and get rid of some overly long
expressions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
ca6a50749012fc17feeec91ee2f9eeacacf06f0b 14-Apr-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_netlink: clean up NAT protocol parsing

Move responsibility for setting the IP_NAT_RANGE_PROTO_SPECIFIED flag
to the NAT protocol, properly propagate errors and get rid of ugly
return value convention.

Signed-off-by: Patrick McHardy <kaber@trash.net>
a83099a60ffda10fa2af85f1c5a141610ffbb2b6 31-Jan-2008 Eric Leblond <eric@inl.fr> [NETFILTER]: nf_conntrack_netlink: transmit mark during all events

The following feature was submitted some months ago. It forces the dump
of mark during the connection destruction event. The induced load is
quiet small and the patch is usefull to provide an easy way to filter
event on user side without having to keep an hash in userspace.

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ba419aff2cda91680e5d4d3eeff95df49bd2edec 31-Jan-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: optimize __nf_conntrack_find()

Ignoring specific entries in __nf_conntrack_find() is only needed by NAT
for nf_conntrack_tuple_taken(). Remove it from __nf_conntrack_find()
and make nf_conntrack_tuple_taken() search the hash itself.

Saves 54 bytes of text in the hotpath on x86_64:

__nf_conntrack_find | -54 # 321 -> 267, # inlines: 3 -> 2, size inlines: 181 -> 127
nf_conntrack_tuple_taken | +305 # 15 -> 320, lexblocks: 0 -> 3, # inlines: 0 -> 3, size inlines: 0 -> 181
nf_conntrack_find_get | -2 # 90 -> 88
3 functions changed, 305 bytes added, 56 bytes removed, diff: +249

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
f8ba1affa18398610e765736153fff614309ccc8 31-Jan-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: switch rwlock to spinlock

With the RCU conversion only write_lock usages of nf_conntrack_lock are
left (except one read_lock that should actually use write_lock in the
H.323 helper). Switch to a spinlock.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
76507f69c44ed199a1a68086145398459e55835d 31-Jan-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: use RCU for conntrack hash

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7d0742da1c8f5df3a34030f0170b30d1a052be80 31-Jan-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_expect: use RCU for expectation hash

Use RCU for expectation hash. This doesn't buy much for conntrack
runtime performance, but allows to reduce the use of nf_conntrack_lock
for /proc and nf_netlink_conntrack.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
58a3c9bb0c69f8517c2243cd0912b3f87b4f868c 31-Jan-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: use RCU for conntrack helpers

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
1d670fdc8c14780b8e0ad915ad3bb13b2fd9223b 31-Jan-2008 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_netlink: fix unbalanced locking

Properly drop nf_conntrack_lock on tuple parsing error.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
bb5cf80e94ad9650c4bd39e92fb917af8e87fa43 06-Jan-2008 Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> [NETFILTER]: Kill some supper dupper bloatry

/me awards the bloatiest-of-all-net/-.c-code award to
nf_conntrack_netlink.c, congratulations to all the authors :-/!

Hall of (unquestionable) fame (measured per inline, top 10 under
net/):
-4496 ctnetlink_parse_tuple netfilter/nf_conntrack_netlink.c
-2165 ctnetlink_dump_tuples netfilter/nf_conntrack_netlink.c
-2115 __ip_vs_get_out_rt ipv4/ipvs/ip_vs_xmit.c
-1924 xfrm_audit_helper_pktinfo xfrm/xfrm_state.c
-1799 ctnetlink_parse_tuple_proto netfilter/nf_conntrack_netlink.c
-1268 ctnetlink_parse_tuple_ip netfilter/nf_conntrack_netlink.c
-1093 ctnetlink_exp_dump_expect netfilter/nf_conntrack_netlink.c
-1060 void ccid3_update_send_interval dccp/ccids/ccid3.c
-983 ctnetlink_dump_tuples_proto netfilter/nf_conntrack_netlink.c
-827 ctnetlink_exp_dump_tuple netfilter/nf_conntrack_netlink.c

(i386 / gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-13) /
allyesconfig except CONFIG_FORCED_INLINING)

...and I left < 200 byte gains as future work item.

After iterative inline removal, I finally have this:

net/netfilter/nf_conntrack_netlink.c:
ctnetlink_exp_fill_info | -1104
ctnetlink_new_expect | -1572
ctnetlink_fill_info | -1303
ctnetlink_new_conntrack | -2230
ctnetlink_get_expect | -341
ctnetlink_del_expect | -352
ctnetlink_expect_event | -1110
ctnetlink_conntrack_event | -1548
ctnetlink_del_conntrack | -729
ctnetlink_get_conntrack | -728
10 functions changed, 11017 bytes removed, diff: -11017

net/netfilter/nf_conntrack_netlink.c:
ctnetlink_parse_tuple | +419
dump_nat_seq_adj | +183
ctnetlink_dump_counters | +166
ctnetlink_dump_tuples | +261
ctnetlink_exp_dump_expect | +633
ctnetlink_change_status | +460
6 functions changed, 2122 bytes added, diff: +2122

net/netfilter/nf_conntrack_netlink.o:
16 functions changed, 2122 bytes added, 11017 bytes removed, diff: -8895

Without a number of CONFIG.*DEBUGs, I got this:
net/netfilter/nf_conntrack_netlink.o:
16 functions changed, 2122 bytes added, 11029 bytes removed, diff: -8907

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
e79ec50b9587c175f65f98550d66ad5b96c05dd9 18-Dec-2007 Jan Engelhardt <jengelh@computergmbh.de> [NETFILTER]: Parenthesize macro parameters

Parenthesize macro parameters.

Signed-off-by: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
cc01dcbd26865addfe9eb5431f1f9dbc511515ba 18-Dec-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_nat: pass manip type instead of hook to nf_nat_setup_info

nf_nat_setup_info gets the hook number and translates that to the
manip type to perform. This is a relict from the time when one
manip per hook could exist, the exact hook number doesn't matter
anymore, its converted to the manip type. Most callers already
know what kind of NAT they want to perform, so pass the maniptype
in directly.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2b628a0866860d44652362aafe403e5b5895583d 18-Dec-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_nat: mark NAT protocols const

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
d978e5daec544ec72b28bf72a30dc9ac3da23a35 18-Dec-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: fix expectation timeout dumping

When the timer is late its timeout might be before the current time,
in which case a very large value is dumped.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
77236b6e33b06aaf756a86ed1965ca7d460b1b53 18-Dec-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: use netlink attribute helpers

Use NLA_PUT_BE32, nla_get_be32() etc.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
37fccd8577d38e249dde71512fb38d2f6a4d9d3c 18-Dec-2007 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: add support for secmark

This patch adds support for James Morris' connsecmark.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
0f417ce989f84cfd5418e3b316064bfbb2708196 18-Dec-2007 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: add support for master tuple event notification and dumping

This patch adds support for master tuple event notification and
dumping. Conntrackd needs this information to recover related
connections appropriately.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
13eae15a244bb29beaa47bf86a24fd29ca7f8a4c 18-Dec-2007 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: add support for NAT sequence adjustments

The combination of NAT and helpers may produce TCP sequence adjustments.
In failover setups, this information needs to be replicated in order to
achieve a successful recovery of mangled, related connections. This patch is
particularly useful for conntrackd, see:

http://people.netfilter.org/pablo/conntrack-tools/

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6e23ae2a48750bda407a4a58f52a4865d7308bf5 20-Nov-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: Introduce NF_INET_ hook values

The IPv4 and IPv6 hook values are identical, yet some code tries to figure
out the "correct" value by looking at the address family. Introduce NF_INET_*
values for both IPv4 and IPv6. The old values are kept in a #ifndef __KERNEL__
section for userspace compatibility.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
f2a89004da23a5ed2d78ac5550ccda5b714fe7d0 12-Dec-2007 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: set expected bit for related conntracks

This patch is a fix. It sets IPS_EXPECTED for related conntracks.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5faa1f4cb5a1f124f76172d775467f4a9db5b452 28-Sep-2007 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: nf_conntrack_netlink: add support to related connections

This patch adds support to relate a connection to an existing master
connection. This patch is used by conntrackd to correctly replicate
related connections.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3583240249ef354760e04ae49bd7b462a638f40c 28-Sep-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_expect: kill unique ID

Similar to the conntrack ID, the per-expectation ID is not needed
anymore, kill it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7f85f914721ffcef382a57995182916bd43d8a65 28-Sep-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: kill unique ID

Remove the per-conntrack ID, its not necessary anymore for dumping.
For compatiblity reasons we send the address of the conntrack to
userspace as ID.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
f73e924cdd166360e8cc9a1b193008fdc9b3e3e2 28-Sep-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: use netlink policy

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
fdf708322d4658daa6eb795d1a835b97efdb335e 28-Sep-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nfnetlink: rename functions containing 'nfattr'

There is no struct nfattr anymore, rename functions to 'nlattr'.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
df6fb868d6118686805c2fa566e213a8f31c8e4f 28-Sep-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nfnetlink: convert to generic netlink attribute functions

Get rid of the duplicated rtnetlink macros and use the generic netlink
attribute functions. The old duplicated stuff is moved to a new header
file that exists just for userspace.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7c8d4cb4198d199e65a6ced8c81f71e3ac3f4cfc 28-Sep-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nfnetlink: make subsystem and callbacks const

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ff4ca8273eafbba875a86d333e059e78f292107f 08-Aug-2007 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: return EEXIST instead of EINVAL for existing nat'ed conntracks

ctnetlink must return EEXIST for existing nat'ed conntracks instead of
EINVAL. Only return EINVAL if we try to update a conntrack with NAT
handlings (that is not allowed).

Decadence:libnetfilter_conntrack/utils# ./conntrack_create_nat
TEST: create conntrack (0)(Success)
Decadence:libnetfilter_conntrack/utils# ./conntrack_create_nat
TEST: create conntrack (-1)(Invalid argument)

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b560580a13b180bc1e3cad7ffbc93388cc39be5d 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_expect: maintain per conntrack expectation list

This patch brings back the per-conntrack expectation list that was
removed around 2.6.10 to avoid walking all expectations on expectation
eviction and conntrack destruction.

As these were the last users of the global expectation list, this patch
also kills that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
31f15875c5ad98a13b528aaf19c839e22b43dc9a 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_helper/nf_conntrack_netlink: convert to expectation hash

Convert from the global expectation list to the hash table.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
cf6994c2b9812a9f02b99e89df411ffc5db9c779 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_netlink: sync expectation dumping with conntrack table dumping

Resync expectation table dumping code with conntrack dumping: don't
rely on the unique ID anymore since that requires to walk the list
backwards, which doesn't work with the upcoming conversion to hlists.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
d4156e8cd93f5772483928aaf4960120caebd789 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: reduce masks to a subset of tuples

Since conntrack currently allows to use masks for every bit of both
helper and expectation tuples, we can't hash them and have to keep
them on two global lists that are searched for every new connection.

This patch removes the never used ability to use masks for the
destination part of the expectation tuple and completely removes
masks from helpers since the only reasonable choice is a full
match on l3num, protonum and src.u.all.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
6823645d608541c2c69e8a99454936e058c294e0 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack_expect: function naming unification

Currently there is a wild mix of nf_conntrack_expect_, nf_ct_exp_,
expect_, exp_, ...

Consistently use nf_ct_ as prefix for exported functions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
330f7db5e578e1e298ba3a41748e5ea333a64a2b 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: remove 'ignore_conntrack' argument from nf_conntrack_find_get

All callers pass NULL, this also doesn't seem very useful for modules.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
f205c5e0c28aa7e0fb6eaaa66e97928f9d9e6994 08-Jul-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: use hlists for conntrack hash

Convert conntrack hash to hlists to reduce its size and cache
footprint. Since the default hashsize to max. entries ratio
sucks (1:16), this patch doesn't reduce the amount of memory
used for the hash by default, but instead uses a better ratio
of 1:8, which results in the same max. entries value.

One thing worth noting is early_drop. It really should use LRU,
so it now has to iterate over the entire chain to find the last
unconfirmed entry. Since chains shouldn't be very long and the
entire operation is very rare this shouldn't be a problem.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
ceceae1b1555a9afcb8dacf90df5fa1f20fd5466 08-Jul-2007 Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> [NETFILTER]: nf_conntrack: use extension infrastructure for helper

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
e2d8e314ad18d4302b3b7ea21ab8b2cb72f2b152 22-Jun-2007 Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> [NETFILTER]: nfctnetlink: Don't allow to change helper

There is no realistic situation to change helper (Who wants IRC helper to
track FTP traffic ?). Moreover, if we want to do that, we need to fix race
issue by nfctnetlink and running helper. That will add overhead to packet
processing. It wouldn't pay. So this rejects the request to change
helper. The requests to add or remove helper are accepted as ever.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3c158f7f57601bc27eab82f0dc4fd3fad314d845 05-Jun-2007 Patrick McHarrdy <kaber@trash.net> [NETFILTER]: nf_conntrack: fix helper module unload races

When a helper module is unloaded all conntracks refering to it have their
helper pointer NULLed out, leading to lots of races. In most places this
can be fixed by proper use of RCU (they do already check for != NULL,
but in a racy way), additionally nf_conntrack_expect_related needs to
bail out when no helper is present.

Also remove two paranoid BUG_ONs in nf_conntrack_proto_gre that are racy
and not worth fixing.

Signed-off-by: Patrick McHarrdy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
df293bbb6ff80f40a2308140ba4cbc2d3c1b18da 10-May-2007 Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> [NETFILTER]: ctnetlink: clear helper area and handle unchanged helper

This patch
- Clears private area for helper even if no helper is assigned to
conntrack. It might be used by old helper.
- Unchanges if the same helper as the used one is specified.
- Does not find helper if no helper is specified. And it does not
require private area for helper in that case.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
1b53d9042c04b8eb875d02e65792e9884efc3784 23-Mar-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: Remove changelogs and CVS IDs

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c702e8047fe74648f7852a9c1de781b0d5a98402 23-Mar-2007 Thomas Graf <tgraf@suug.ch> [NETLINK]: Directly return -EINTR from netlink_dump_start()

Now that all users of netlink_dump_start() use netlink_run_queue()
to process the receive queue, it is possible to return -EINTR from
netlink_dump_start() directly, therefore simplying the callers.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
1d00a4eb42bdade33a6ec0961cada93577a66ae6 23-Mar-2007 Thomas Graf <tgraf@suug.ch> [NETLINK]: Remove error pointer from netlink message handler

The error pointer argument in netlink message handlers is used
to signal the special case where processing has to be interrupted
because a dump was started but no error happened. Instead it is
simpler and more clear to return -EINTR and have netlink_run_queue()
deal with getting the queue right.

nfnetlink passed on this error pointer to its subsystem handlers
but only uses it to signal the start of a netlink dump. Therefore
it can be removed there as well.

This patch also cleans up the error handling in the affected
message handlers to be consistent since it had to be touched anyway.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
dc5fc579b90ed0a9a4e55b0218cdbaf0a8cf2e67 26-Mar-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [NETLINK]: Use nlmsg_trim() where appropriate

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
27a884dc3cb63b93c2b3b643f5b31eed5f8a4d26 20-Apr-2007 Arnaldo Carvalho de Melo <acme@redhat.com> [SK_BUFF]: Convert skb->tail to sk_buff_data_t

So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
601e68e100b6bf8ba13a32db8faf92d43acaa997 12-Feb-2007 YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> [NETFILTER]: Fix whitespace errors

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
40e0cb004a6d4a7ad577724e451e8dbd6cba5a89 03-Feb-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: fix compile failure with NF_CONNTRACK_MARK=n

CC net/netfilter/nf_conntrack_netlink.o
net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event':
net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark'
make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c54ea3b95ac504ed81e0ec3acfaa26d0f55bdfa4 16-Jan-2007 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: fix leak in ctnetlink_create_conntrack error path

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
5b1158e909ecbe1a052203e0d8df15633f829930 03-Dec-2006 Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> [NETFILTER]: Add NAT support for nf_conntrack

Add NAT support for nf_conntrack. Joint work of Jozsef Kadlecsik,
Yasuyuki Kozakai, Martin Josefsson and myself.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
d2483ddefd38b06053cdce7206382ca61f6282b1 03-Dec-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: add module aliases to IPv4 conntrack names

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9457d851fc5df54522d733f72cbb1f02ab59272e 03-Dec-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: automatic helper assignment for expectations

Some helpers (namely H.323) manually assign further helpers to expected
connections. This is not possible with nf_conntrack anymore since we
need to know whether a helper is used at allocation time.

Handle the helper assignment centrally, which allows to perform the
correct allocation and as a nice side effect eliminates the need
for the H.323 helper to fiddle with nf_conntrack_lock.

Mid term the allocation scheme really needs to be redesigned since
we do both the helper and expectation lookup _twice_ for every new
connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
bff9a89bcac5b68ac0a1ea856b1726a35ae1eabb 03-Dec-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: nf_conntrack: endian annotations

Resync with Al Viro's ip_conntrack annotations and fix a missed
spot in ip_nat_proto_icmp.c.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7b621c1ea64a54f77b8a841b16dc4c9fee3ecf48 29-Nov-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: rework conntrack fields dumping logic on events

| NEW | UPDATE | DESTROY |
----------------------------------------|
tuples | Y | Y | Y |
status | Y | Y | N |
timeout | Y | Y | N |
protoinfo | S | S | N |
helper | S | S | N |
mark | S | S | N |
counters | F | F | Y |

Leyend:
Y: yes
N: no
S: iif the field is set
F: iif overflow

This patch also replace IPCT_HELPINFO by IPCT_HELPER since we want to
track the helper assignation process, not the changes in the private
information held by the helper.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
bbb3357d14f6becd156469220992ef7ab0f10e69 29-Nov-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: check for status attribute existence on conntrack creation

Check that status flags are available in the netlink message received
to create a new conntrack.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
468ec44bd5a863736d955f78b8c38896f26864a1 29-Nov-2006 Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> [NETFILTER]: conntrack: add '_get' to {ip, nf}_conntrack_expect_find

We usually uses 'xxx_find_get' for function which increments
reference count.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
605dcad6c85226e6d43387917b329d65b95cef39 29-Nov-2006 Martin Josefsson <gandalf@wlug.westbo.se> [NETFILTER]: nf_conntrack: rename struct nf_conntrack_protocol

Rename 'struct nf_conntrack_protocol' to 'struct nf_conntrack_l4proto' in
order to help distinguish it from 'struct nf_conntrack_l3proto'. It gets
rather confusing with 'nf_conntrack_protocol'.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
77ab9cff0f4112703df3ef7903c1a15adb967114 29-Nov-2006 Martin Josefsson <gandalf@wlug.westbo.se> [NETFILTER]: nf_conntrack: split out expectation handling

This patch splits out expectation handling into its own file
nf_conntrack_expect.c

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
4e9b82693542003b028c8494e9e3c49615b91ce7 27-Nov-2006 Thomas Graf <tgraf@suug.ch> [NETLINK]: Remove unused dst_pid field in netlink_skb_parms

The destination PID is passed directly to netlink_unicast()
respectively netlink_multicast().

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
c537b75a3ba9f5d2569f313742cd379dff6ceb70 27-Nov-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: fix reference count leak

When NFA_NEST exceeds the skb size the protocol reference is leaked.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
dafc741cf23351a6f43895579a72ab8818ba00ae 27-Nov-2006 Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> [NETFILTER]: nfctnetlink: assign helper to newly created conntrack

This fixes the bug which doesn't assign helper to newly created
conntrack via nf_conntrack_netlink.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
9ea8cfd6aa74e710f0cb0731ecb9dee53fbebfb9 12-Oct-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: Remove debugging messages

Remove (compilation-breaking) debugging messages introduced at early
development stage.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
01f348484dd8509254d045e3ad49029716eca6a1 20-Sep-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: simplify the code to dump the conntrack table

Merge the bits to dump the conntrack table and the ones to dump and
zero counters in a single piece of code. This patch does not change
the default behaviour if accounting is not enabled.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
1a31526baeed30aaa70503cee0ab281f78cae0d6 22-Aug-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: remove impossible events tests for updates

IPCT_HELPER and IPCT_NATINFO bits are never set on updates.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b3a27bfba51d445784eb0cd6451b73a73fb69cf9 22-Aug-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: check for listeners before sending expectation events

This patch uses nfnetlink_has_listeners to check for listeners in
userspace.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b9a37e0c81c498be2db9f52063c53e55d76c815e 22-Aug-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: dump connection mark

ctnetlink dumps the mark iif the event mark happened

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
d205dc40798d97d63ad348bfaf7394f445d152d4 18-Aug-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: fix deadlock in table dumping

ip_conntrack_put must not be called while holding ip_conntrack_lock
since destroy_conntrack takes it again.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
40a839fdbd5d76cebb2a61980bc1fc7ecd784be2 27-Jun-2006 Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp> [NETFILTER]: nf_conntrack: Fix undefined references to local_bh_*

CC net/netfilter/nf_conntrack_proto_sctp.o
net/netfilter/nf_conntrack_proto_sctp.c: In function `sctp_print_conntrack':
net/netfilter/nf_conntrack_proto_sctp.c:206: warning: implicit declaration of function `local_bh_disable'
net/netfilter/nf_conntrack_proto_sctp.c:208: warning: implicit declaration of function `local_bh_enable'
CC net/netfilter/nf_conntrack_netlink.o
net/netfilter/nf_conntrack_netlink.c: In function `ctnetlink_dump_table':
net/netfilter/nf_conntrack_netlink.c:429: warning: implicit declaration of function `local_bh_disable'
net/netfilter/nf_conntrack_netlink.c:452: warning: implicit declaration of function `local_bh_enable'

Spotted by Toralf Förster

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
89f2e21883b59a6ff1e64d0b4924d06b1c6101ba 30-May-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: change table dumping not to require an unique ID

Instead of using the ID to find out where to continue dumping, take a
reference to the last entry dumped and try to continue there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
3726add76643c715d437aceda320d319153b6113 30-May-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: fix NAT configuration

The current configuration only allows to configure one manip and overloads
conntrack status flags with netlink semantic.

Signed-off-by: Patrick Mchardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
e64a70be5175ac2c209fa742123a6ce845852e0e 01-Apr-2006 Martin Josefsson <gandalf@wlug.westbo.se> [NETFILTER]: {ip,nf}_conntrack_netlink: fix expectation notifier unregistration

This patch fixes expectation notifier unregistration on module unload to
use ip_conntrack_expect_unregister_notifier(). This bug causes a soft
lockup at the first expectation created after a rmmod ; insmod of this
module.

Should go into -stable as well.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
bcd1e830a5ac37d708647d492a1436a8a9babb07 01-Apr-2006 Martin Josefsson <gandalf@wlug.westbo.se> [NETFILTER]: fix ifdef for connmark support in nf_conntrack_netlink

When ctnetlink was ported from ip_conntrack to nf_conntrack two #ifdef's
for connmark support were left unchanged and this code was never
compiled.

Problem noticed by Daniel De Graaf.

Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
1cde64365b0c4f576f8f45b834e6a6de081b5914 22-Mar-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: Fix expectaction mask dumping

The expectation mask has some particularities that requires a different
handling. The protocol number fields can be set to non-valid protocols,
ie. l3num is set to 0xFFFF. Since that protocol does not exist, the mask
tuple will not be dumped. Moreover, this results in a kernel panic when
nf_conntrack accesses the array of protocol handlers, that is PF_MAX (0x1F)
long.

This patch introduces the function ctnetlink_exp_dump_mask, that correctly
dumps the expectation mask. Such function uses the l3num value from the
expectation tuple that is a valid layer 3 protocol number. The value of the
l3num mask isn't dumped since it is meaningless from the userspace side.

Thanks to Yasuyuki Kozakai and Patrick McHardy for the feedback.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
a24276924875802853b5bdc12c56d29f1c1bbc79 21-Mar-2006 Patrick McHardy <kaber@trash.net> [NETFILTER]: ctnetlink: avoid unneccessary event message generation

Avoid unneccessary event message generation by checking for netlink
listeners before building a message.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
dc808fe28db59fadf4ec32d53f62477fa28f3be8 21-Mar-2006 Harald Welte <laforge@netfilter.org> [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

This patch moves all helper related data fields of 'struct nf_conn'
into a separate structure 'struct nf_conn_help'. This new structure
is only present in conntrack entries for which we actually have a
helper loaded.

Also, this patch cleans up the nf_conntrack 'features' mechanism to
resemble what the original idea was: Just glue the feature-specific
data structures at the end of 'struct nf_conn', and explicitly
re-calculate the pointer to it when needed rather than keeping
pointers around.

Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack
is 276 bytes. We still need to save another 20 bytes in order to fit
into to target of 256bytes.

Signed-off-by: Harald Welte <laforge@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
34f9a2e4deb760ddcb94cd0cd4f9ce18070d53d9 04-Feb-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: ctnetlink: add MODULE_ALIAS for expectation subsystem

Add load-on-demand support for expectation request. eg. conntrack -L expect

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b633ad5fbf9e534142208700c58a530a4091eaab 04-Feb-2006 Marcus Sundberg <marcus@ingate.com> [NETFILTER]: ctnetlink: Fix subsystem used for expectation events

The ctnetlink expectation events should use the NFNL_SUBSYS_CTNETLINK_EXP
subsystem, not NFNL_SUBSYS_CTNETLINK.

Signed-off-by: Marcus Sundberg <marcus@ingate.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
87711cb81c33e75fe8c95137fe62c8d462ff781c 05-Jan-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: Filter dumped entries based on the layer 3 protocol number

Dump entries of a given Layer 3 protocol number.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
c1d10adb4a521de5760112853f42aaeefcec96eb 05-Jan-2006 Pablo Neira Ayuso <pablo@netfilter.org> [NETFILTER]: Add ctnetlink port for nf_conntrack

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>