History log of /net/netfilter/ipvs/ip_vs_app.c
Revision Date Author Comments
ac69269a45e84c1772dcb9e77db976a932f4af22 22-Mar-2013 Julian Anastasov <ja@ssi.bg> ipvs: do not disable bh for long time

We used a global BH disable in LOCAL_OUT hook.
Add _bh suffix to all places that need it and remove
the disabling from LOCAL_OUT and sync code.

Functions like ip_defrag need protection from
BH, so add it. As for nf_nat_mangle_tcp_packet, it needs
RCU lock.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
363c97d7435ebba8a040f86e29bdec79ee182f0c 21-Mar-2013 Julian Anastasov <ja@ssi.bg> ipvs: convert app locks

We use locks like tcp_app_lock, udp_app_lock,
sctp_app_lock to protect access to the protocol hash tables
from readers in packet context while the application
instances (inc) are [un]registered under global mutex.

As the hash tables are mostly read when conns are
created and bound to app, use RCU for readers and reclaim
app instance after grace period.

Simplify ip_vs_app_inc_get because we use usecnt
only for statistics and rely on module refcounting.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
ece31ffd539e8e2b586b1ca5f50bc4f4591e3893 18-Feb-2013 Gao feng <gaofeng@cn.fujitsu.com> net: proc: change proc_net_remove to remove_proc_entry

proc_net_remove is only used to remove proc entries
that under /proc/net,it's not a general function for
removing proc entries of netns. if we want to remove
some proc entries which under /proc/net/stat/, we still
need to call remove_proc_entry.

this patch use remove_proc_entry to replace proc_net_remove.
we can remove proc_net_remove after this patch.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d4beaa66add8aebf83ab16d2fde4e4de8dac36df 18-Feb-2013 Gao feng <gaofeng@cn.fujitsu.com> net: proc: change proc_net_fops_create to proc_create

Right now, some modules such as bonding use proc_create
to create proc entries under /proc/net/, and other modules
such as ipv4 use proc_net_fops_create.

It looks a little chaos.this patch changes all of
proc_net_fops_create to proc_create. we can remove
proc_net_fops_create after this patch.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
be97fdb5fbcc828240c51769cd28cba609158703 12-Jul-2012 Julian Anastasov <ja@ssi.bg> ipvs: generalize app registration in netns

Get rid of the ftp_app pointer and allow applications
to be registered without adding fields in the netns_ipvs structure.

v2: fix coding style as suggested by Pablo Neira Ayuso <pablo@netfilter.org>

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
95c961747284a6b83a5e2d81240e214b0fa3464d 15-Apr-2012 Eric Dumazet <eric.dumazet@gmail.com> net: cleanup unsigned to unsigned int

Use of "unsigned int" is preferred to bare "unsigned" in net tree.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9ffc93f203c18a70623f21950f1dd473c9ec48cd 28-Mar-2012 David Howells <dhowells@redhat.com> Remove all #inclusions of asm/system.h

Remove all #inclusions of asm/system.h preparatory to splitting and killing
it. Performed with the following command:

perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *`

Signed-off-by: David Howells <dhowells@redhat.com>
6c8f7949931854be360fcc7f008f2672dc17996f 13-Jun-2011 Hans Schillstrom <hans.schillstrom@ericsson.com> IPVS: remove unused init and cleanup functions.

After restructuring, there is some unused or empty functions
left to be removed.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
503cf15a5ecc0f3f7a05ffe04c89fb7496100ee7 01-May-2011 Hans Schillstrom <hans@schillstrom.com> IPVS: rename of netns init and cleanup functions.

Make it more clear what the functions does,
on request by Julian.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
0f08190fe8af3cdb6ba19690eb0fa253ecef4bde 15-May-2011 Hans Schillstrom <hans.schillstrom@ericsson.com> IPVS: fix netns if reading ip_vs_* procfs entries

Without this patch every access to ip_vs in procfs will increase
the netns count i.e. an unbalanced get_net()/put_net().
(ipvsadm commands also use procfs.)
The result is you can't exit a netns if reading ip_vs_* procfs entries.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
7a4f0761fce32ff4918a7c23b08db564ad33092d 03-May-2011 Hans Schillstrom <hans@schillstrom.com> IPVS: init and cleanup restructuring

DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.

IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.

This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.

The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
74973f6fbfcd1b084c3ccc75b783a6dacac94a10 03-May-2011 Hans Schillstrom <hans@schillstrom.com> IPVS: init and cleanup restructuring

DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.

IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.

This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.

The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
736561a01f11114146b1b7f82d486fa9c95828ef 21-Mar-2011 Simon Horman <horms@verge.net.au> IPVS: Use global mutex in ip_vs_app.c

As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.

Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.

This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.

[ 12.610000] IPVS: Creating netns size=2456 id=0
[ 12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[ 12.640000] BUG: key ffff880003bbf1a0 not in .data!
[ 12.640000] ------------[ cut here ]------------
[ 12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[ 12.640000] Hardware name: Bochs
[ 12.640000] Pid: 1, comm: swapper Tainted: G W 2.6.38-kexec-06330-g69b7efe-dirty #122
[ 12.650000] Call Trace:
[ 12.650000] [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[ 12.650000] [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[ 12.650000] [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[ 12.650000] [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[ 12.650000] [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[ 12.650000] [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[ 12.650000] [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.660000] [<ffffffff811b1c33>] T.620+0x43/0x170
[ 12.660000] [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.660000] [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[ 12.670000] [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[ 12.670000] [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[ 12.670000] [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[ 12.670000] [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[ 12.670000] [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[ 12.670000] [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[ 12.670000] [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[ 12.680000] [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[ 12.680000] [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c6d2d445d8dee04cde47eb4021636399a4239e9f 03-Jan-2011 Hans Schillstrom <hans.schillstrom@ericsson.com> IPVS: netns, final patch enabling network name space.

all init_net removed, (except for some alloc related
that needs to be there)

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
ab8a5e8408c3df2d654611bffc3aaf04f418b266 03-Jan-2011 Hans Schillstrom <hans.schillstrom@ericsson.com> IPVS: netns awareness to ip_vs_app

All variables moved to struct ipvs,
most external changes fixed (i.e. init_net removed)

in ip_vs_protocol param struct net *net added to:
- register_app()
- unregister_app()
This affected almost all proto_xxx.c files

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
61b1ab4583e275af216c8454b9256de680499b19 03-Jan-2011 Hans Schillstrom <hans.schillstrom@ericsson.com> IPVS: netns, add basic init per netns.

Preparation for network name-space init, in this stage
some empty functions exists.

In most files there is a check if it is root ns i.e. init_net
if (!net_eq(net, &init_net))
return ...
this will be removed by the last patch, when enabling name-space.

*v3
ip_vs_conn.c merge error corrected.
net_ipvs #ifdef removed as sugested by Jan Engelhardt

[ horms@verge.net.au: Removed whitespace-change-only hunks ]
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
26c15cfd291f8b4ee40b4bbdf5e3772adfd704f5 21-Sep-2010 Julian Anastasov <ja@ssi.bg> ipvs: changes related to service usecnt

Change the usage of svc usecnt during command execution:

- we check if svc is registered but we do not need to hold usecnt
reference while under __ip_vs_mutex, only the packet handling needs
it during scheduling

- change __ip_vs_service_get to __ip_vs_service_find and
__ip_vs_svc_fwm_get to __ip_vs_svc_fwm_find because now caller
will increase svc->usecnt

- put common code that calls update_service in __ip_vs_update_dest

- put common code in ip_vs_unlink_service() and use it to unregister
the service

- add comment that svc should not be accessed after ip_vs_del_service
anymore

- all IP_VS_WAIT_WHILE calls are now unified: usecnt > 0

- Properly log the app ports

As result, some problems are fixed:

- possible use-after-free of svc in ip_vs_genl_set_cmd after
ip_vs_del_service because our usecnt reference does not guarantee that
svc is not freed on refcnt==0, eg. when no dests are moved to trash

- possible usecnt leak in do_ip_vs_set_ctl after ip_vs_del_service
when the service is not freed now, for example, when some
destionations are moved into trash and svc->refcnt remains above 0.
It is harmless because svc is not in hash anymore.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
7f1c407579519e71a0dcadc05614fd98acec585e 23-Jul-2010 Hannes Eder <heder@google.com> IPVS: make FTP work with full NAT support

Use nf_conntrack/nf_nat code to do the packet mangling and the TCP
sequence adjusting. The function 'ip_vs_skb_replace' is now dead
code, so it is removed.

To SNAT FTP, use something like:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
--vport 21 -j SNAT --to-source 192.168.10.10
and for the data connections in passive mode:

% iptables -t nat -A POSTROUTING -m ipvs --vaddr 192.168.100.30/32 \
--vportctl 21 -j SNAT --to-source 192.168.10.10
using '-m state --state RELATED' would also works.

Make sure the kernel modules ip_vs_ftp, nf_conntrack_ftp, and
nf_nat_ftp are loaded.

[ up-port and minor fixes by Simon Horman <horms@verge.net.au> ]
Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
5a0e3ad6af8660be21ca98a971cd00f331318c05 24-Mar-2010 Tejun Heo <tj@kernel.org> include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
1e3e238e9c4bf9987b19185235cd0cdc21ea038c 02-Aug-2009 Hannes Eder <heder@google.com> IPVS: use pr_err and friends instead of IP_VS_ERR and friends

Since pr_err and friends are used instead of printk there is no point
in keeping IP_VS_ERR and friends. Furthermore make use of '__func__'
instead of hard coded function names.

Signed-off-by: Hannes Eder <heder@google.com>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
9aada7ac047f789ffb27540cc1695989897b2dfe 30-Jul-2009 Hannes Eder <heder@google.com> IPVS: use pr_fmt

While being at it cleanup whitespace.

Signed-off-by: Hannes Eder <heder@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cb7f6a7b716e801097b564dec3ccb58d330aef56 19-Sep-2008 Julius Volz <juliusv@google.com> IPVS: Move IPVS to net/netfilter/ipvs

Since IPVS now has partial IPv6 support, this patch moves IPVS from
net/ipv4/ipvs to net/netfilter/ipvs. It's a result of:

$ git mv net/ipv4/ipvs net/netfilter

and adapting the relevant Kconfigs/Makefiles to the new path.

Signed-off-by: Julius Volz <juliusv@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>