History log of /net/sctp/sysctl.c
Revision Date Author Comments
eaea2da7286ebc56d557b40ad7e59e715a84e4a0 30-Jun-2014 Daniel Borkmann <dborkman@redhat.com> net: sctp: only warn in proc_sctp_do_alpha_beta if write

Only warn if the value is written to alpha or beta. We don't care
emitting a one-time warning when only reading it.

Reported-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
24599e61b7552673dd85971cf5a35369cd8c119e 18-Jun-2014 Daniel Borkmann <dborkman@redhat.com> net: sctp: check proc_dointvec result in proc_sctp_do_auth

When writing to the sysctl field net.sctp.auth_enable, it can well
be that the user buffer we handed over to proc_dointvec() via
proc_sctp_do_auth() handler contains something other than integers.

In that case, we would set an uninitialized 4-byte value from the
stack to net->sctp.auth_enable that can be leaked back when reading
the sysctl variable, and it can unintentionally turn auth_enable
on/off based on the stack content since auth_enable is interpreted
as a boolean.

Fix it up by making sure proc_dointvec() returned sucessfully.

Fixes: b14878ccb7fa ("net: sctp: cache auth_enable per endpoint")
Reported-by: Florian Westphal <fwestpha@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ff5e92c1affe7166b3f6e7073e648ed65a6e2e59 19-Jun-2014 Daniel Borkmann <dborkman@redhat.com> net: sctp: propagate sysctl errors from proc_do* properly

sysctl handler proc_sctp_do_hmac_alg(), proc_sctp_do_rto_min() and
proc_sctp_do_rto_max() do not properly reflect some error cases
when writing values via sysctl from internal proc functions such
as proc_dointvec() and proc_dostring().

In all these cases we pass the test for write != 0 and partially
do additional work just to notice that additional sanity checks
fail and we return with hard-coded -EINVAL while proc_do*
functions might also return different errors. So fix this up by
simply testing a successful return of proc_do* right after
calling it.

This also allows to propagate its return value onwards to the user.
While touching this, also fix up some minor style issues.

Fixes: 4f3fdf3bc59c ("sctp: add check rto_min and rto_max in sysctl")
Fixes: 3c68198e7511 ("sctp: Make hmac algorithm selection for cookie generation dynamic")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b58537a1f5629bdc98a8b9dc2051ce0e952f6b4b 15-Jun-2014 Daniel Borkmann <dborkman@redhat.com> net: sctp: fix permissions for rto_alpha and rto_beta knobs

Commit 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs
to jiffies conversions.") has silently changed permissions for
rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
this was to discourage users from tweaking rto_alpha and
rto_beta knobs in production environments since they are key
to correctly compute rtt/srtt.

RFC4960 under section 6.3.1. RTO Calculation says regarding
rto_alpha and rto_beta under rule C3 and C4:

[...]
C3) When a new RTT measurement R' is made, set

RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|

and

SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'

Note: The value of SRTT used in the update to RTTVAR
is its value before updating SRTT itself using the
second assignment. After the computation, update
RTO <- SRTT + 4 * RTTVAR.

C4) When data is in flight and when allowed by rule C5
below, a new RTT measurement MUST be made each round
trip. Furthermore, new RTT measurements SHOULD be
made no more than once per round trip for a given
destination transport address. There are two reasons
for this recommendation: First, it appears that
measuring more frequently often does not in practice
yield any significant benefit [ALLMAN99]; second,
if measurements are made more often, then the values
of RTO.Alpha and RTO.Beta in rule C3 above should be
adjusted so that SRTT and RTTVAR still adjust to
changes at roughly the same rate (in terms of how many
round trips it takes them to reflect new values) as
they would if making only one measurement per
round-trip and using RTO.Alpha and RTO.Beta as given
in rule C3. However, the exact nature of these
adjustments remains a research issue.
[...]

While it is discouraged to adjust rto_alpha and rto_beta
and not further specified how to adjust them, the RFC also
doesn't explicitly forbid it, but rather gives a RECOMMENDED
default value (rto_alpha=3, rto_beta=2). We have a couple
of users relying on the old permissions before they got
changed. That said, if someone really has the urge to adjust
them, we could allow it with a warning in the log.

Fixes: 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f66138c8471442c24c58cdce6ba5f36c5ce93d7a 08-May-2014 wangweidong <wangweidong1@huawei.com> sctp: add a checking for sctp_sysctl_net_register

When register_net_sysctl failed, we should free the
sysctl_table.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
eb9f37053db5ea11101825d97442cf88754a48d1 08-May-2014 wangweidong <wangweidong1@huawei.com> Revert "sctp: optimize the sctp_sysctl_net_register"

This revert commit efb842c45("sctp: optimize the sctp_sysctl_net_register"),
Since it doesn't kmemdup a sysctl_table for init_net, so the
init_net->sctp.sysctl_header->ctl_table_arg points to sctp_net_table
which is a static array pointer. So when doing sctp_sysctl_net_unregister,
it will free sctp_net_table, then we will get a NULL pointer dereference
like that:

[ 262.948220] BUG: unable to handle kernel NULL pointer dereference at 000000000000006c
[ 262.948232] IP: [<ffffffff81144b70>] kfree+0x80/0x420
[ 262.948260] PGD db80a067 PUD dae12067 PMD 0
[ 262.948268] Oops: 0000 [#1] SMP
[ 262.948273] Modules linked in: sctp(-) crc32c_generic libcrc32c
...
[ 262.948338] task: ffff8800db830190 ti: ffff8800dad00000 task.ti: ffff8800dad00000
[ 262.948344] RIP: 0010:[<ffffffff81144b70>] [<ffffffff81144b70>] kfree+0x80/0x420
[ 262.948353] RSP: 0018:ffff8800dad01d88 EFLAGS: 00010046
[ 262.948358] RAX: 0100000000000000 RBX: ffffffffa0227940 RCX: ffffea0000707888
[ 262.948363] RDX: ffffea0000707888 RSI: 0000000000000001 RDI: ffffffffa0227940
[ 262.948369] RBP: ffff8800dad01de8 R08: 0000000000000000 R09: ffff8800d9e983a9
[ 262.948374] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0227940
[ 262.948380] R13: ffffffff8187cfc0 R14: 0000000000000000 R15: ffffffff8187da10
[ 262.948386] FS: 00007fa2a2658700(0000) GS:ffff880112800000(0000) knlGS:0000000000000000
[ 262.948394] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 262.948400] CR2: 000000000000006c CR3: 00000000cddc0000 CR4: 00000000000006e0
[ 262.948410] Stack:
[ 262.948413] ffff8800dad01da8 0000000000000286 0000000020227940 ffffffffa0227940
[ 262.948422] ffff8800dad01dd8 ffffffff811b7fa1 ffffffffa0227940 ffffffffa0227940
[ 262.948431] ffffffff8187d960 ffffffff8187cfc0 ffffffff8187d960 ffffffff8187da10
[ 262.948440] Call Trace:
[ 262.948457] [<ffffffff811b7fa1>] ? unregister_sysctl_table+0x51/0xa0
[ 262.948476] [<ffffffffa020d1a1>] sctp_sysctl_net_unregister+0x21/0x30 [sctp]
[ 262.948490] [<ffffffffa020ef6d>] sctp_net_exit+0x12d/0x150 [sctp]
[ 262.948512] [<ffffffff81394f49>] ops_exit_list+0x39/0x60
[ 262.948522] [<ffffffff813951ed>] unregister_pernet_operations+0x3d/0x70
[ 262.948530] [<ffffffff81395292>] unregister_pernet_subsys+0x22/0x40
[ 262.948544] [<ffffffffa020efcc>] sctp_exit+0x3c/0x12d [sctp]
[ 262.948562] [<ffffffff810c5e04>] SyS_delete_module+0x194/0x210
[ 262.948577] [<ffffffff81240fde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 262.948587] [<ffffffff815217a2>] system_call_fastpath+0x16/0x1b

With this revert, it won't occur the Oops.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b14878ccb7fac0242db82720b784ab62c467c0dc 17-Apr-2014 Vlad Yasevich <vyasevic@redhat.com> net: sctp: cache auth_enable per endpoint

Currently, it is possible to create an SCTP socket, then switch
auth_enable via sysctl setting to 1 and crash the system on connect:

Oops[#1]:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
[...]
Call Trace:
[<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80
[<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4
[<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c
[<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8
[<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214
[<ffffffff8043af68>] sctp_rcv+0x588/0x630
[<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24
[<ffffffff803acb50>] ip6_input+0x2c0/0x440
[<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564
[<ffffffff80310650>] process_backlog+0xb4/0x18c
[<ffffffff80313cbc>] net_rx_action+0x12c/0x210
[<ffffffff80034254>] __do_softirq+0x17c/0x2ac
[<ffffffff800345e0>] irq_exit+0x54/0xb0
[<ffffffff800075a4>] ret_from_irq+0x0/0x4
[<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48
[<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148
[<ffffffff805a88b0>] start_kernel+0x37c/0x398
Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0
03e00008 00000000
---[ end trace b530b0551467f2fd ]---
Kernel panic - not syncing: Fatal exception in interrupt

What happens while auth_enable=0 in that case is, that
ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
when endpoint is being created.

After that point, if an admin switches over to auth_enable=1,
the machine can crash due to NULL pointer dereference during
reception of an INIT chunk. When we enter sctp_process_init()
via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
the INIT verification succeeds and while we walk and process
all INIT params via sctp_process_param() we find that
net->sctp.auth_enable is set, therefore do not fall through,
but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
dereference what we have set to NULL during endpoint
initialization phase.

The fix is to make auth_enable immutable by caching its value
during endpoint initialization, so that its original value is
being carried along until destruction. The bug seems to originate
from the very first days.

Fix in joint work with Daniel Borkmann.

Reported-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Joshua Kinard <kumba@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
efb842c45e667332774196bd5a1d539d048159cc 12-Feb-2014 wangweidong <wangweidong1@huawei.com> sctp: optimize the sctp_sysctl_net_register

Here, when the net is init_net, we needn't to kmemdup the ctl_table
again. So add a check for net. Also we can save some memory.

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
22a1f5140ed69baa5906ce9b2b9143478f5a4da0 12-Feb-2014 wangweidong <wangweidong1@huawei.com> sctp: fix a missed .data initialization

As commit 3c68198e75111a90("sctp: Make hmac algorithm selection for
cookie generation dynamic"), we miss the .data initialization.
If we don't use the net_namespace, the problem that parts of the
sysctl configuration won't be isolation and won't occur.

In sctp_sysctl_net_register(), we register the sysctl for each
net, in the for(), we use the 'table[i].data' as check condition, so
when the 'i' is the index of sctp_hmac_alg, the data is NULL, then
break. So add the .data initialization.

Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
26ac8e5fe1562831e68ccd9f7057aade37aab2a3 22-Dec-2013 wangweidong <wangweidong1@huawei.com> sctp: fix checkpatch errors with (foo*)|foo * bar|foo* bar

fix checkpatch errors below:
ERROR: "(foo*)" should be "(foo *)"
ERROR: "foo * bar" should be "foo *bar"
ERROR: "foo* bar" should be "foo *bar"

Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b486b2289e40797e386a18048a66b535206a463b 11-Dec-2013 wangweidong <wangweidong1@huawei.com> sctp: fix up a spacing

fix up spacing of proc_sctp_do_hmac_alg for according to the
proc_sctp_do_rto_min[max] in sysctl.c

Suggested-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4f3fdf3bc59cafd14c3bc2c2369efad34c7aa8b5 11-Dec-2013 wangweidong <wangweidong1@huawei.com> sctp: add check rto_min and rto_max in sysctl

rto_min should be smaller than rto_max while rto_max should be larger
than rto_min. Add two proc_handler for the checking.

Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Wang Weidong <wangweidong1@huawei.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4b2f13a25133b115eb56771bd4a8e71a82aea968 06-Dec-2013 Jeff Kirsher <jeffrey.t.kirsher@intel.com> sctp: Fix FSF address in file headers

Several files refer to an old address for the Free Software Foundation
in the file header comment. Resolve by replacing the address with
the URL <http://www.gnu.org/licenses/> so that we do not have to keep
updating the header comments anytime the address changes.

CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
71acc0ddd499cc323199fb1ae350ce9ea0744352 09-Aug-2013 David S. Miller <davem@davemloft.net> Revert "net: sctp: convert sctp_checksum_disable module param into sctp sysctl"

This reverts commit cda5f98e36576596b9230483ec52bff3cc97eb21.

As per Vlad's request.

Signed-off-by: David S. Miller <davem@davemloft.net>
477143e3fece3dc12629bb1ebd7b47e8e6e72b2b 06-Aug-2013 Daniel Borkmann <dborkman@redhat.com> net: sctp: trivial: update bug report in header comment

With the restructuring of the lksctp.org site, we only allow bug
reports through the SCTP mailing list linux-sctp@vger.kernel.org,
not via SF, as SF is only used for web hosting and nothing more.
While at it, also remove the obvious statement that bugs will be
fixed and incooperated into the kernel.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cda5f98e36576596b9230483ec52bff3cc97eb21 06-Aug-2013 Daniel Borkmann <dborkman@redhat.com> net: sctp: convert sctp_checksum_disable module param into sctp sysctl

Get rid of the last module parameter for SCTP and make this
configurable via sysctl for SCTP like all the rest of SCTP's
configuration knobs.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
91705c61b52029ab5da67a15a23eef08667bf40e 23-Jul-2013 Daniel Borkmann <dborkman@redhat.com> net: sctp: trivial: update mailing list address

The SCTP mailing list address to send patches or questions
to is linux-sctp@vger.kernel.org and not
lksctp-developers@lists.sourceforge.net anymore. Therefore,
update all occurences.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fe2c6338fd2c6f383c4d4164262f35c8f3708e1f 12-Jun-2013 Joe Perches <joe@perches.com> net: Convert uses of typedef ctl_table to struct ctl_table

Reduce the uses of this unnecessary typedef.

Done via perl script:

$ git grep --name-only -w ctl_table net | \
xargs perl -p -i -e '\
sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

Reflow the modified lines that now exceed 80 columns.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5f19d1219a5b96c7b00ad5c3f889030093a8d1a3 24-Jan-2013 Vlad Yasevich <vyasevich@gmail.com> SCTP: Free the per-net sysctl table on net exit. v2

Per-net sysctl table needs to be explicitly freed at
net exit. Otherwise we see the following with kmemleak:

unreferenced object 0xffff880402d08000 (size 2048):
comm "chrome_sandbox", pid 18437, jiffies 4310887172 (age 9097.630s)
hex dump (first 32 bytes):
b2 68 89 81 ff ff ff ff 20 04 04 f8 01 88 ff ff .h...... .......
04 00 00 00 a4 01 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff815b4aad>] kmemleak_alloc+0x21/0x3e
[<ffffffff81110352>] slab_post_alloc_hook+0x28/0x2a
[<ffffffff81113fad>] __kmalloc_track_caller+0xf1/0x104
[<ffffffff810f10c2>] kmemdup+0x1b/0x30
[<ffffffff81571e9f>] sctp_sysctl_net_register+0x1f/0x72
[<ffffffff8155d305>] sctp_net_init+0x100/0x39f
[<ffffffff814ad53c>] ops_init+0xc6/0xf5
[<ffffffff814ad5b7>] setup_net+0x4c/0xd0
[<ffffffff814ada5e>] copy_net_ns+0x6d/0xd6
[<ffffffff810938b1>] create_new_namespaces+0xd7/0x147
[<ffffffff810939f4>] copy_namespaces+0x63/0x99
[<ffffffff81076733>] copy_process+0xa65/0x1233
[<ffffffff81077030>] do_fork+0x10b/0x271
[<ffffffff8100a0e9>] sys_clone+0x23/0x25
[<ffffffff815dda73>] stub_clone+0x13/0x20
[<ffffffffffffffff>] 0xffffffffffffffff

I fixed the spelling of sysctl_header so the code actually
compiles. -- EWB.

Reported-by: Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3c68198e75111a905ac2412be12bf7b29099729b 24-Oct-2012 Neil Horman <nhorman@tuxdriver.com> sctp: Make hmac algorithm selection for cookie generation dynamic

Currently sctp allows for the optional use of md5 of sha1 hmac algorithms to
generate cookie values when establishing new connections via two build time
config options. Theres no real reason to make this a static selection. We can
add a sysctl that allows for the dynamic selection of these algorithms at run
time, with the default value determined by the corresponding crypto library
availability.
This comes in handy when, for example running a system in FIPS mode, where use
of md5 is disallowed, but SHA1 is permitted.

Note: This new sysctl has no corresponding socket option to select the cookie
hmac algorithm. I chose not to implement that intentionally, as RFC 6458
contains no option for this value, and I opted not to pollute the socket option
namespace.

Change notes:
v2)
* Updated subject to have the proper sctp prefix as per Dave M.
* Replaced deafult selection options with new options that allow
developers to explicitly select available hmac algs at build time
as per suggestion by Vlad Y.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netdev@vger.kernel.org
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e1fc3b14f9a90d9591016749289f2c3d7b35fbf4 07-Aug-2012 Eric W. Biederman <ebiederm@xmission.com> sctp: Make sysctl tunables per net

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ebb7e95d9351f77a8ec1fca20eb645051401b7b2 07-Aug-2012 Eric W. Biederman <ebiederm@xmission.com> sctp: Add infrastructure for per net sysctls

Start with an empty sctp_net_table that will be populated as the various
tunable sysctls are made per net.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5aa93bcf66f4af094d6f11096e81d5501a0b4ba5 21-Jul-2012 Neil Horman <nhorman@tuxdriver.com> sctp: Implement quick failover draft from tsvwg

I've seen several attempts recently made to do quick failover of sctp transports
by reducing various retransmit timers and counters. While its possible to
implement a faster failover on multihomed sctp associations, its not
particularly robust, in that it can lead to unneeded retransmits, as well as
false connection failures due to intermittent latency on a network.

Instead, lets implement the new ietf quick failover draft found here:
http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

This will let the sctp stack identify transports that have had a small number of
errors, and avoid using them quickly until their reliability can be
re-established. I've tested this out on two virt guests connected via multiple
isolated virt networks and believe its in compliance with the above draft and
works well.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: Sridhar Samudrala <sri@us.ibm.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
CC: joe@perches.com
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ec8f23ce0f4005b74013d4d122e0d540397a93c9 19-Apr-2012 Eric W. Biederman <ebiederm@xmission.com> net: Convert all sysctl registrations to register_net_sysctl

This results in code with less boiler plate that is a bit easier
to read.

Additionally stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5dd3df105b9f6cb7dd2472b59e028d0d1c878ecb 19-Apr-2012 Eric W. Biederman <ebiederm@xmission.com> net: Move all of the network sysctls without a namespace into init_net.

This makes it clearer which sysctls are relative to your current network
namespace.

This makes it a little less error prone by not exposing sysctls for the
initial network namespace in other namespaces.

This is the same way we handle all of our other network interfaces to
userspace and I can't honestly remember why we didn't do this for
sysctls right from the start.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2692ba61a82203404abd7dd2a027bda962861f74 16-Dec-2011 Xi Wang <xi.wang@gmail.com> sctp: fix incorrect overflow check on autoclose

Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for
limiting the autoclose value. If userspace passes in -1 on 32-bit
platform, the overflow check didn't work and autoclose would be set
to 0xffffffff.

This patch defines a max_autoclose (in seconds) for limiting the value
and exposes it through sysctl, with the following intentions.

1) Avoid overflowing autoclose * HZ.

2) Keep the default autoclose bound consistent across 32- and 64-bit
platforms (INT_MAX / HZ in this patch).

3) Keep the autoclose value consistent between setsockopt() and
getsockopt() calls.

Suggested-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dd51be0f5484b450b8d48c9226ed86ce3dd5102e 26-Apr-2011 Michio Honda <micchie@sfc.wide.ad.jp> sctp: Add sysctl support for Auto-ASCONF.

This patch allows the system administrator to change default
Auto-ASCONF on/off behavior via an sysctl value.

Signed-off-by: Michio Honda <micchie@sfc.wide.ad.jp>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8d987e5c75107ca7515fa19e857cfa24aab6ec8f 10-Nov-2010 Eric Dumazet <eric.dumazet@gmail.com> net: avoid limits overflow

Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use long "instead" of "int", now
atomic_long_t primitives are available for free.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Robin Holt <holt@sgi.com>
Reviewed-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
a252e749f1ae17e43ccc5824f7b1b5854417c98b 08-Dec-2009 Linus Torvalds <torvalds@linux-foundation.org> sctp: fix compile error due to sysctl mismerge

I messed up the merge in d7fc02c7bae7b1cf69269992cf880a43a350cdaa, where
the conflict in question wasn't just about CTL_UNNUMBERED being removed,
but the 'strategy' field is too (sysctl handling is now done through the
/proc interface, with no duplicate protocols for reading the data).

Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
90f2f5318b3a5b0898fef0fec9b91376c7de7a2c 23-Nov-2009 Vlad Yasevich <vladislav.yasevich@hp.com> sctp: Update SWS avaoidance receiver side algorithm

We currently send window update SACKs every time we free up 1 PMTU
worth of data. That a lot more SACKs then necessary. Instead, we'll
now send back the actuall window every time we send a sack, and do
window-update SACKs when a fraction of the receive buffer has been
opened. The fraction is controlled with a sysctl.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
6d4561110a3e9fa742aeec6717248a491dfb1878 16-Nov-2009 Eric W. Biederman <ebiederm@xmission.com> sysctl: Drop & in front of every proc_handler.

For consistency drop & in front of every proc_handler. Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
f8572d8f2a2ba75408b97dc24ef47c83671795d7 05-Nov-2009 Eric W. Biederman <ebiederm@xmission.com> sysctl net: Remove unused binary sysctl code

Now that sys_sysctl is a compatiblity wrapper around /proc/sys
all sysctl strategy routines, and all ctl_name and strategy
entries in the sysctl tables are unused, and can be
revmoed.

In addition neigh_sysctl_register has been modified to no longer
take a strategy argument and it's callers have been modified not
to pass one.

Cc: "David Miller" <davem@davemloft.net>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
723884339f90a9c420783135168cc1045750eb5d 03-Sep-2009 Bhaskar Dutta <bhaskie@gmail.com> sctp: Sysctl configuration for IPv4 Address Scoping

This patch introduces a new sysctl option to make IPv4 Address Scoping
configurable <draft-stewart-tsvwg-sctp-ipv4-00.txt>.

In networking environments where DNAT rules in iptables prerouting
chains convert destination IP's to link-local/private IP addresses,
SCTP connections fail to establish as the INIT chunk is dropped by the
kernel due to address scope match failure.
For example to support overlapping IP addresses (same IP address with
different vlan id) a Layer-5 application listens on link local IP's,
and there is a DNAT rule that maps the destination IP to a link local
IP. Such applications never get the SCTP INIT if the address-scoping
draft is strictly followed.

This sysctl configuration allows SCTP to function in such
unconventional networking environments.

Sysctl options:
0 - Disable IPv4 address scoping draft altogether
1 - Enable IPv4 address scoping (default, current behavior)
2 - Enable address scoping but allow IPv4 private addresses in init/init-ack
3 - Enable address scoping but allow IPv4 link local address in init/init-ack

Signed-off-by: Bhaskar Dutta <bhaskar.dutta@globallogic.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
d48e074dfdada552fa53f5eab807540f352e0d5d 13-May-2009 Jean-Mickael Guerin <jean-mickael.guerin@6wind.com> sctp: fix sack_timeout sysctl min and max types

sctp_sack_timeout is defined as int, but the sysctl's maxsize is set
to sizeof(long) and the min/max are defined as long.

Signed-off-by: jean-mickael.guerin@6wind.com
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
6d9f239a1edb31d6133230f478fd1dc2da338ec5 04-Nov-2008 Alexey Dobriyan <adobriyan@gmail.com> net: '&' redux

I want to compile out proc_* and sysctl_* handlers totally and
stub them to NULL depending on config options, however usage of &
will prevent this, since taking adress of NULL pointer will break
compilation.

So, drop & in front of every ->proc_handler and every ->strategy
handler, it was never needed in fact.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
60c778b25972e095df8981dd41e99d161e8738f9 11-Jan-2008 Vlad Yasevich <vladislav.yasevich@hp.com> [SCTP]: Stop claiming that this is a "reference implementation"

I was notified by Randy Stewart that lksctp claims to be
"the reference implementation". First of all, "the
refrence implementation" was the original implementation
of SCTP in usersapce written ty Randy and a few others.
Second, after looking at the definiton of 'reference implementation',
we don't really meet the requirements.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
b5ccd792fa413f9336273cb8fa3b9dd3a7ec1735 09-Jan-2008 Pavel Emelyanov <xemul@openvz.org> [NET]: Simple ctl_table to ctl_path conversions.

This patch includes many places, that only required
replacing the ctl_table-s with appropriate ctl_paths
and call register_sysctl_paths().

Nothing special was done with them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
73d9c4fd1a6ec4950b2eac8135d35506bf400d6c 24-Oct-2007 Vlad Yasevich <vladislav.yasevich@hp.com> SCTP: Allow ADD_IP to work with AUTH for backward compatibility.

This patch adds a tunable that will allow ADD_IP to work without
AUTH for backward compatibility. The default value is off since
the default value for ADD_IP is off as well. People who need
to use ADD-IP with older implementations take risks of connection
hijacking and should consider upgrading or turning this tunable on.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
a29a5bd4f5c3e8ba2e89688feab8b01c44f1654f 17-Sep-2007 Vlad Yasevich <vladislav.yasevich@hp.com> [SCTP]: Implement SCTP-AUTH initializations.

The patch initializes AUTH related members of the generic SCTP
structures and provides a way to enable/disable auth extension.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
007e3936bdaaa012483c9fe06ca71c272458c710 17-Sep-2007 Vlad Yasevich <vladislav.yasevich@hp.com> [SCTP]: Move sysctl_sctp_[rw]mem definitions to protocol.c

The sctp_[rw]mem definitions should really be in protocol.c
since that is where they are initialized. This also allows
one to build a kernel without sysctl support.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4d93df0abd50b9c9e2d4561439a1a1d21ec5e68f 16-Aug-2007 Neil Horman <nhorman@tuxdriver.com> [SCTP]: Rewrite of sctp buffer management code

This patch introduces autotuning to the sctp buffer management code
similar to the TCP. The buffer space can be grown if the advertised
receive window still has room. This might happen if small message
sizes are used, which is common in telecom environmens.
New tunables are introduced that provide limits to buffer growth
and memory pressure is entered if to much buffer spaces is used.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0b4d414714f0d2f922d39424b0c5c82ad900a381 14-Feb-2007 Eric W. Biederman <ebiederm@xmission.com> [PATCH] sysctl: remove insert_at_head from register_sysctl

The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3fd091e73b81f131e1567c4d4a1ec042940bf2f7 22-Aug-2006 Vladislav Yasevich <vladislav.yasevich@hp.com> [SCTP]: Remove multiple levels of msecs to jiffies conversions.

The SCTP sysctl entries are displayed in milliseconds, but stored
internally in jiffies. This results in multiple levels of msecs to
jiffies conversion and as a result produces a truncation error. This
patch makes things consistent in that we store and display defaults
in milliseconds and only convert once for use by association.
This patch also adds some sane min/max values so that we don't go off
the deep end.

Signed-off-by: Vladislav Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8116ffad4180b39d7a755345c1fde09da83930c0 17-Jan-2006 Vlad Yasevich <vladislav.yasevich@hp.com> [SCTP]: Fix bad sysctl formatting of SCTP timeout values on 64-bit m/cs.

Change all the structure members that hold jiffies to be of type
unsigned long. This also corrects bad sysctl formating on 64 bit
architectures.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
049b3ff5a86d0187184a189d2e31b8654d58fe22 12-Nov-2005 Neil Horman <nhorman@tuxdriver.com> [SCTP]: Include ulpevents in socket receive buffer accounting.

Also introduces a sysctl option to configure the receive buffer
accounting policy to be either at socket or association level.
Default is all the associations on the same socket share the
receive buffer.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8c5955d83ed26455a49d12e783cc2258d11279a9 06-Sep-2005 Adrian Bunk <bunk@stusta.de> [SCTP]: net/sctp/sysctl.c should #include <net/sctp/sctp.h>

Every file should #include the header files containing the prototypes of
it's global functions.

sctp.h contains the prototypes of sctp_sysctl_{,un}register().

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2f85a42964dd43fed3a339701db046bee5a8b903 28-Jun-2005 Vlad Yasevich <vladislav.yasevich@hp.com> [SCTP] Make init & delayed sack timeouts configurable by user.

Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4eb701dfc618491c9b97377df6e61de36dfc39ce 28-Apr-2005 Neil Horman <nhorman@redhat.com> [SCTP] Fix SCTP sendbuffer accouting.

- Include chunk and skb sizes in sendbuffer accounting.
- 2 policies are supported. 0: per socket accouting, 1: per association
accounting

DaveM: I've made the default per-socket.

Signed-off-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!