History log of /include/net/protocol.h
Revision Date Author Comments
b26ba202e0500eb852e89499ece1b2deaa64c3a7 23-May-2014 Tom Herbert <therbert@google.com> net: Eliminate no_check from protosw

It doesn't seem like an protocols are setting anything other
than the default, and allowing to arbitrarily disable checksums
for a whole protocol seems dangerous. This can be done on a per
socket basis.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
b582ef0990d457f7ce8ccf827af51a575ca0b4a6 20-Jan-2014 Or Gerlitz <ogerlitz@mellanox.com> net: Add GRO support for UDP encapsulating protocols

Add GRO handlers for protocols that do UDP encapsulation, with the intent of
being able to coalesce packets which encapsulate packets belonging to
the same TCP session.

For GRO purposes, the destination UDP port takes the role of the ether type
field in the ethernet header or the next protocol in the IP header.

The UDP GRO handler will only attempt to coalesce packets whose destination
port is registered to have gro handler.

Use a mark on the skb GRO CB data to disallow (flush) running the udp gro receive
code twice on a packet. This solves the problem of udp encapsulated packets whose
inner VM packet is udp and happen to carry a port which has registered offloads.

Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8ed1dc44d3e9e8387a104b1ae8f92e9a3fbf1b1e 09-Jan-2014 Hannes Frederic Sowa <hannes@stressinduktion.org> ipv4: introduce hardened ip_no_pmtu_disc mode

This new ip_no_pmtu_disc mode only allowes fragmentation-needed errors
to be honored by protocols which do more stringent validation on the
ICMP's packet payload. This knob is useful for people who e.g. want to
run an unmodified DNS server in a namespace where they need to use pmtu
for TCP connections (as they are used for zone transfers or fallback
for requests) but don't want to use possibly spoofed UDP pmtu information.

Currently the whitelisted protocols are TCP, SCTP and DCCP as they check
if the returned packet is in the window or if the association is valid.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Cc: John Heffner <johnwheffner@gmail.com>
Suggested-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
f307c6365648e919d7ac733f0b3a7e9ee3e31eb2 22-Sep-2013 Joe Perches <joe@perches.com> protocol.h: Remove extern from function prototypes

There are a mix of function prototypes with and without extern
in the kernel sources. Standardize on not using extern for
function prototypes.

Function prototypes don't need to be written with extern.
extern is assumed by the compiler. Its use is as unnecessary as
using auto to declare automatic/local variables in a block.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
f191a1d17f227032c159e5499809f545402b6dc6 15-Nov-2012 Vlad Yasevich <vyasevic@redhat.com> net: Remove code duplication between offload structures

Move the offload callbacks into its own structure.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c6b641a4c6b32f39db678c2441cb1ef824110d74 15-Nov-2012 Vlad Yasevich <vyasevic@redhat.com> ipv6: Pull IPv6 GSO registration out of the module

Sing GSO support is now separate, pull it out of the module
and make it its own init call.
Remove the cleanup functions as they are no longer called.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
3336288a9feaa809839adbaf05778dc2f16665dc 15-Nov-2012 Vlad Yasevich <vyasevic@redhat.com> ipv6: Switch to using new offload infrastructure.

Switch IPv6 protocol to using the new GRO/GSO calls and data.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bca49f843eac59cbb1ddd1f4a5d65fcc23b62efd 15-Nov-2012 Vlad Yasevich <vyasevic@redhat.com> ipv4: Switch to using the new offload infrastructure.

Switch IPv4 code base to using the new GRO/GSO calls and data.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8ca896cfdd17f32f5aa2747644733ebf3725360d 15-Nov-2012 Vlad Yasevich <vyasevic@redhat.com> ipv6: Add new offload registration infrastructure.

Create a new data structure for IPv6 protocols that holds GRO/GSO
callbacks and a new array to track the protocols that register GRO/GSO.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
de27d001d187721de775ed9e0c485aa6f517c745 15-Nov-2012 Vlad Yasevich <vyasevic@redhat.com> net: Add net protocol offload registration infrustructure

Create a new data structure for IPv4 protocols that holds GRO/GSO
callbacks and a new array to track the protocols that register GRO/GSO.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c7109986db3c945f50ceed884a30e0fd8af3b89b 26-Jul-2012 Eric Dumazet <edumazet@google.com> ipv6: Early TCP socket demux

This is the IPv6 missing bits for infrastructure added in commit
41063e9dd1195 (ipv4: Early TCP socket demux.)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
160eb5a6b14ca2eab5c598bdbbb24c24624bad34 28-Jun-2012 David S. Miller <davem@davemloft.net> ipv4: Kill early demux method return value.

It's completely unnecessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
c10237e077cef50e925f052e49f3b4fead9d71f9 28-Jun-2012 David S. Miller <davem@davemloft.net> Revert "ipv4: tcp: dont cache unconfirmed intput dst"

This reverts commit c074da2810c118b3812f32d6754bd9ead2f169e7.

This change has several unwanted side effects:

1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll
thus never create a real cached route.

2) All TCP traffic will use DST_NOCACHE and never use the routing
cache at all.

Signed-off-by: David S. Miller <davem@davemloft.net>
c074da2810c118b3812f32d6754bd9ead2f169e7 27-Jun-2012 Eric Dumazet <edumazet@google.com> ipv4: tcp: dont cache unconfirmed intput dst

DDOS synflood attacks hit badly IP route cache.

On typical machines, this cache is allowed to hold up to 8 Millions dst
entries, 256 bytes for each, for a total of 2GB of memory.

rt_garbage_collect() triggers and tries to cleanup things.

Eventually route cache is disabled but machine is under fire and might
OOM and crash.

This patch exploits the new TCP early demux, to set a nocache
boolean in case incoming TCP frame is for a not yet ESTABLISHED or
TIMEWAIT socket.

This 'nocache' boolean is then used in case dst entry is not found in
route cache, to create an unhashed dst entry (DST_NOCACHE)

SYN-cookie-ACK sent use a similar mechanism (ipv4: tcp: dont cache
output dst for syncookies), so after this patch, a machine is able to
absorb a DDOS synflood attack without polluting its IP route cache.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
41063e9dd11956f2d285e12e4342e1d232ba0ea2 20-Jun-2012 David S. Miller <davem@davemloft.net> ipv4: Early TCP socket demux.

Input packet processing for local sockets involves two major demuxes.
One for the route and one for the socket.

But we can optimize this down to one demux for certain kinds of local
sockets.

Currently we only do this for established TCP sockets, but it could
at least in theory be expanded to other kinds of connections.

If a TCP socket is established then it's identity is fully specified.

This means that whatever input route was used during the three-way
handshake must work equally well for the rest of the connection since
the keys will not change.

Once we move to established state, we cache the receive packet's input
route to use later.

Like the existing cached route in sk->sk_dst_cache used for output
packets, we have to check for route invalidations using dst->obsolete
and dst->ops->check().

Early demux occurs outside of a socket locked section, so when a route
invalidation occurs we defer the fixup of sk->sk_rx_dst until we are
actually inside of established state packet processing and thus have
the socket locked.

Signed-off-by: David S. Miller <davem@davemloft.net>
f9242b6b28d61295f2bf7e8adfb1060b382e5381 20-Jun-2012 David S. Miller <davem@davemloft.net> inet: Sanitize inet{,6} protocol demux.

Don't pretend that inet_protos[] and inet6_protos[] are hashes, thay
are just a straight arrays. Remove all unnecessary hash masking.

Document MAX_INET_PROTOS.

Use RAW_HTABLE_SIZE when appropriate.

Reported-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dfd56b8b38fff3586f36232db58e1e9f7885a605 10-Dec-2011 Eric Dumazet <eric.dumazet@gmail.com> net: use IS_ENABLED(CONFIG_IPV6)

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c8f44affb7244f2ac3e703cab13d55ede27621bb 15-Nov-2011 Michał Mirosław <mirq-linux@rere.qmqm.pl> net: introduce and use netdev_features_t for device features sets

v2: add couple missing conversions in drivers
split unexporting netdev_fix_features()
implemented %pNF
convert sock::sk_route_(no?)caps

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
04ed3e741d0f133e02bed7fa5c98edba128f90e7 25-Jan-2011 Michał Mirosław <mirq-linux@rere.qmqm.pl> net: change netdev->features to u32

Quoting Ben Hutchings: we presumably won't be defining features that
can only be enabled on 64-bit architectures.

Occurences found by `grep -r` on net/, drivers/net, include/

[ Move features and vlan_features next to each other in
struct netdev, as per Eric Dumazet's suggestion -DaveM ]

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
e0ad61ec867fdd262804afa7a68e11fc9930c2b9 25-Oct-2010 Eric Dumazet <eric.dumazet@gmail.com> net: add __rcu annotations to protocol

Add __rcu annotations to :
struct net_protocol *inet_protos
struct net_protocol *inet6_protos

And use appropriate casts to reduce sparse warnings if
CONFIG_SPARSE_RCU_POINTER=y

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
13f18aa05f5abe135f47b6417537ae2b2fedc18c 06-Nov-2009 Eric Paris <eparis@redhat.com> net: drop capability from protocol definitions

struct can_proto had a capability field which wasn't ever used. It is
dropped entirely.

struct inet_protosw had a capability field which can be more clearly
expressed in the code by just checking if sock->type = SOCK_RAW.

Signed-off-by: Eric Paris <eparis@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fd2c3ef761fbc5e6c27fa7d40b30cda06bfcd7d8 03-Nov-2009 Eric Dumazet <eric.dumazet@gmail.com> net: cleanup include/net

This cleanup patch puts struct/union/enum opening braces,
in first line to ease grep games.

struct something
{

becomes :

struct something {

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
41135cc836a1abeb12ca1416bdb29e87ad021153 14-Sep-2009 Alexey Dobriyan <adobriyan@gmail.com> net: constify struct inet6_protocol

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
32613090a96dba2ca2cc524c8d4749d3126fdde5 14-Sep-2009 Alexey Dobriyan <adobriyan@gmail.com> net: constify struct net_protocol

Remove long removed "inet_protocol_base" declaration.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d5fdd6babcfc2b0e6a8da1acf492a69fb54b4c47 23-Jun-2009 Brian Haley <brian.haley@hp.com> ipv6: Use correct data types for ICMPv6 type and code

Change all the code that deals directly with ICMPv6 type and code
values to use u8 instead of a signed int as that's the actual data
type.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
787e9208360117835101f513f7db593dc2525cf8 08-Jan-2009 Herbert Xu <herbert@gondor.apana.org.au> ipv6: Add GRO support

This patch adds GRO support for IPv6. IPv6 GRO supports extension
headers in the same way as GSO (by using the same infrastructure).
It's also simpler compared to IPv4 since we no longer have to worry
about fragmentation attributes or header checksums.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
73cc19f1556b95976934de236fd9043f7208844f 16-Dec-2008 Herbert Xu <herbert@gondor.apana.org.au> ipv4: Add GRO infrastructure

This patch adds GRO support for IPv4.

The criteria for merging is more stringent than LRO, in particular,
we require all fields in the IP header to be identical except for
the length, ID and checksum. In addition, the ID must form an
arithmetic sequence with a difference of one.

The ID requirement might seem overly strict, however, most hardware
TSO solutions already obey this rule. Linux itself also obeys this
whether GSO is in use or not.

In future we could relax this rule by storing the IDs (or rather
making sure that we don't drop them when pulling the aggregate
skb's tail).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
f145049a06f470d0489f47cb83ff3ccb2a0de622 24-Mar-2008 Denis V. Lunev <den@openvz.org> [NETNS]: Drop packets in the non-initial namespace on the per/protocol basis.

IP layer now can handle multiple namespaces normally. So, process such
packets normally and drop them only if the transport layer is not
aware about namespaces.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
87c3efbfdd1f98af14a1f60ff19f73d9a8d8da98 11-Dec-2007 Daniel Lezcano <dlezcano@fr.ibm.com> [IPV6]: make inet6_register_protosw to return an error code

This patch makes the inet6_register_protosw to return an error code.
The different protocols can be aware the registration was successful or
not and can pass the error to the initial caller, af_inet6.

Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
e5bbef20e017efcb10700398cc048c49b98628e0 15-Oct-2007 Herbert Xu <herbert@gondor.apana.org.au> [IPV6]: Replace sk_buff ** with sk_buff * in input handlers

With all the users of the double pointers removed from the IPv6 input path,
this patch converts all occurances of sk_buff ** to sk_buff * in IPv6 input
handlers.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
ee41e2dff1a0ac548f12871f4bb23fe9e69e13eb 28-Nov-2006 Arnaldo Carvalho de Melo <acme@mandriva.com> [INET]: Change protocol field in struct inet_protosw to u16

[acme@newtoy net-2.6.20]$ pahole /tmp/tcp_ipv6.o inet_protosw
/* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/protocol.h:69 */
struct inet_protosw {
struct list_head list; /* 0 8 */
short unsigned int type; /* 8 2 */

/* XXX 2 bytes hole, try to pack */

int protocol; /* 12 4 */
struct proto * prot; /* 16 4 */
const struct proto_ops * ops; /* 20 4 */
int capability; /* 24 4 */
char no_check; /* 28 1 */
unsigned char flags; /* 29 1 */
}; /* size: 32, sum members: 28, holes: 1, sum holes: 2, padding: 2 */

So that we can kill that hole, protocol can only go all the way to 255 (RAW).

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
04ce69093f91547d3a7c4fc815d2868195591340 08-Nov-2006 Al Viro <viro@zeniv.linux.org.uk> [IPV6]: 'info' argument of ipv6 ->err_handler() is net-endian

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
a430a43d087545c96542ee64573237919109d370 08-Jul-2006 Herbert Xu <herbert@gondor.apana.org.au> [NET] gso: Fix up GSO packets with broken checksums

Certain subsystems in the stack (e.g., netfilter) can break the partial
checksum on GSO packets. Until they're fixed, this patch allows this to
work by recomputing the partial checksums through the GSO mechanism.

Once they've all been converted to update the partial checksum instead of
clearing it, this workaround can be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
adcfc7d0b4d7bc3c7edac6fdde9f3ae510bd6054 30-Jun-2006 Herbert Xu <herbert@gondor.apana.org.au> [IPV6]: Added GSO support for TCPv6

This patch adds GSO support for IPv6 and TCPv6.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
576a30eb6453439b3c37ba24455ac7090c247b5a 27-Jun-2006 Herbert Xu <herbert@gondor.apana.org.au> [NET]: Added GSO header verification

When GSO packets come from an untrusted source (e.g., a Xen guest domain),
we need to verify the header integrity before passing it to the hardware.

Since the first step in GSO is to verify the header, we can reuse that
code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this
bit set can only be fed directly to devices with the corresponding bit
NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb
is fed to the GSO engine which will allow the packet to be sent to the
hardware if it passes the header check.

This patch changes the sg flag to a full features flag. The same method
can be used to implement TSO ECN support. We simply have to mark packets
with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment
the packet, or segment the first MTU and pass the rest to the hardware for
further segmentation.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
f4c50d990dcf11a296679dc05de3873783236711 22-Jun-2006 Herbert Xu <herbert@gondor.apana.org.au> [NET]: Add software TSOv4

This patch adds the GSO implementation for IPv4 TCP.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
62c4f0a2d5a188f73a94f2cb8ea0dba3e7cf0a7f 26-Apr-2006 David Woodhouse <dwmw2@infradead.org> Don't include linux/config.h from anywhere else in include/

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
951dbc8ac714b04c36296b8b5c36c8e036ce433f 07-Jan-2006 Patrick McHardy <kaber@trash.net> [IPV6]: Move nextheader offset to the IP6CB

Move nextheader offset to the IP6CB to make it possible to pass a
packet to ip6_input_finish multiple times and have it skip already
parsed headers. As a nice side effect this gets rid of the manual
hopopts skipping in ip6_input_finish.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
90ddc4f0470427df306f308ad03db6b6b21644b8 22-Dec-2005 Eric Dumazet <dada1@cosmosbay.com> [NET]: move struct proto_ops to const

I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)

This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.

This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)

I made a global stubstitute to change all existing occurences to make
them const.

This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
d83d8461f902c672bc1bd8fbc6a94e19f092da97 14-Dec-2005 Arnaldo Carvalho de Melo <acme@mandriva.com> [IP_SOCKGLUE]: Remove most of the tcp specific calls

As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 17-Apr-2005 Linus Torvalds <torvalds@ppc970.osdl.org> Linux-2.6.12-rc2

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!