0287587884b15041203b3a362d485e1ab1f24445 |
|
06-Oct-2014 |
Eric Dumazet <edumazet@google.com> |
net: better IFF_XMIT_DST_RELEASE support Testing xmit_more support with netperf and connected UDP sockets, I found strange dst refcount false sharing. Current handling of IFF_XMIT_DST_RELEASE is not optimal. Dropping dst in validate_xmit_skb() is certainly too late in case packet was queued by cpu X but dequeued by cpu Y The logical point to take care of drop/force is in __dev_queue_xmit() before even taking qdisc lock. As Julian Anastasov pointed out, need for skb_dst() might come from some packet schedulers or classifiers. This patch adds new helper to cleanly express needs of various drivers or qdiscs/classifiers. Drivers that need skb_dst() in their ndo_start_xmit() should call following helper in their setup instead of the prior : dev->priv_flags &= ~IFF_XMIT_DST_RELEASE; -> netif_keep_dst(dev); Instead of using a single bit, we use two bits, one being eventually rebuilt in bonding/team drivers. The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being rebuilt in bonding/team. Eventually, we could add something smarter later. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
|
077c5a0948cc7b75032288bd37bd6641ef05da76 |
|
30-Sep-2014 |
Tom Herbert <therbert@google.com> |
ipip: Set inner IP protocol in ipip Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV4 before tunnel_xmit. This is needed if UDP encapsulation (fou) is being done. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
473ab820dd4af588785a8e10b9c1547aadb4fd72 |
|
17-Sep-2014 |
Tom Herbert <therbert@google.com> |
ipip: Setup and TX path for ipip/UDP foo-over-udp encapsulation Add netlink handling for IP tunnel encapsulation parameters and and adjustment of MTU for encapsulation. ip_tunnel_encap is called from ip_tunnel_xmit to actually perform FOU encapsulation. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
2346829e641b804ece9ac9298136b56d9567c278 |
|
06-Jun-2014 |
Dmitry Popov <ixaphire@qrator.net> |
ipip, sit: fix ipv4_{update_pmtu,redirect} calls ipv4_{update_pmtu,redirect} were called with tunnel's ifindex (t->dev is a tunnel netdevice). It caused wrong route lookup and failure of pmtu update or redirect. We should use the same ifindex that we use in ip_route_output_* in *tunnel_xmit code. It is t->parms.link . Signed-off-by: Dmitry Popov <ixaphire@qrator.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
f98f89a0104454f35a62d681683c844f6dbf4043 |
|
15-May-2014 |
Tom Gundersen <teg@jklm.no> |
net: tunnels - enable module autoloading Enable the module alias hookup to allow tunnel modules to be autoloaded on demand. This is in line with how most other netdev kinds work, and will allow userspace to create tunnels without having CAP_SYS_MODULE. Signed-off-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David S. Miller <davem@davemloft.net>
|
3acfa1e73c2a2cbf1fda7aef0c6c2c9281ce9db2 |
|
19-Jan-2014 |
Eric Dumazet <edumazet@google.com> |
ipv4: be friend with drop monitor Replace some dev_kfree_skb() with kfree_skb() calls when we drop one skb, this might help bug tracking. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
cb32f511a70be8967ac9025cf49c44324ced9a39 |
|
19-Oct-2013 |
Eric Dumazet <edumazet@google.com> |
ipip: add GSO/TSO support Now inet_gso_segment() is stackable, its relatively easy to implement GSO/TSO support for IPIP Performance results, when segmentation is done after tunnel device (as no NIC is yet enabled for TSO IPIP support) : Before patch : lpq83:~# ./netperf -H 7.7.9.84 -Cc MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 3357.88 5.09 3.70 2.983 2.167 After patch : lpq83:~# ./netperf -H 7.7.9.84 -Cc MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.9.84 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 16384 16384 10.00 7710.19 4.52 6.62 1.152 1.687 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
737e828bdbdaf2f9d7de07f20a0308ac46ce5178 |
|
28-Aug-2013 |
Li Hongjun <hongjun.li@6wind.com> |
ipv4 tunnels: fix an oops when using ipip/sit with IPsec Since commit 3d7b46cd20e3 (ip_tunnel: push generic protocol handling to ip_tunnel module.), an Oops is triggered when an xfrm policy is configured on an IPv4 over IPv4 tunnel. xfrm4_policy_check() calls __xfrm_policy_check2(), which uses skb_dst(skb). But this field is NULL because iptunnel_pull_header() calls skb_dst_drop(skb). Signed-off-by: Li Hongjun <hongjun.li@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
6c742e714d8c282fd8f8b22d3e20b5141738c1ee |
|
13-Aug-2013 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ipip: add x-netns support This patch allows to switch the netns when packet is encapsulated or decapsulated. In other word, the encapsulated packet is received in a netns, where the lookup is done to find the tunnel. Once the tunnel is found, the packet is decapsulated and injecting into the corresponding interface which stands to another netns. When one of the two netns is removed, the tunnel is destroyed. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
3b7b514f44bff05d26a6499c4d4fac2a83938e6e |
|
02-Jul-2013 |
Cong Wang <amwang@redhat.com> |
ipip: fix a regression in ioctl This is a regression introduced by commit fd58156e456d9f68fe0448 (IPIP: Use ip-tunneling code.) Similar to GRE tunnel, previously we only check the parameters for SIOCADDTUNNEL and SIOCCHGTUNNEL, after that commit, the check is moved for all commands. So, just check for SIOCADDTUNNEL and SIOCCHGTUNNEL. Also, the check for i_key, o_key etc. is suspicious too, which did not exist before, reset them before passing to ip_tunnel_ioctl(). Cc: Pravin B Shelar <pshelar@nicira.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
3d7b46cd20e300bd6989fb1f43d46f1b9645816e |
|
18-Jun-2013 |
Pravin B Shelar <pshelar@nicira.com> |
ip_tunnel: push generic protocol handling to ip_tunnel module. Process skb tunnel header before sending packet to protocol handler. this allows code sharing between gre and ovs gre modules. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
bf3d6a8f791b2a81279b9ce3201b4970f6fbe51a |
|
28-May-2013 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
iptunnel: specify protocol outside IP header Before this patch, ip_tunnel_xmit() was using the field protocol from the IP header passed into argument. There is no functional change, this patch prepares the support of IPv4 over IPv4 for module sit. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
fd58156e456d9f68fe04486be378d0bc93641532 |
|
25-Mar-2013 |
Pravin B Shelar <pshelar@nicira.com> |
IPIP: Use ip-tunneling code. Reuse common ip-tunneling code which is re-factored from GRE module. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
c54419321455631079c7d6e60bc732dd0c5914c5 |
|
25-Mar-2013 |
Pravin B Shelar <pshelar@nicira.com> |
GRE: Refactor GRE tunneling code. Following patch refactors GRE code into ip tunneling code and GRE specific code. Common tunneling code is moved to ip_tunnel module. ip_tunnel module is written as generic library which can be used by different tunneling implementations. ip_tunnel module contains following components: - packet xmit and rcv generic code. xmit flow looks like (gre_xmit/ipip_xmit)->ip_tunnel_xmit->ip_local_out. - hash table of all devices. - lookup for tunnel devices. - control plane operations like device create, destroy, ioctl, netlink operations code. - registration for tunneling modules, like gre, ipip etc. - define single pcpu_tstats dev->tstats. - struct tnl_ptk_info added to pass parsed tunnel packet parameters. ipip.h header is renamed to ip_tunnel.h Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
6aed0c8bf7d2f389b472834053eb6e3bd6926999 |
|
09-Mar-2013 |
Cong Wang <xiyou.wangcong@gmail.com> |
tunnel: use iptunnel_xmit() again With recent patches from Pravin, most tunnels can't use iptunnel_xmit() any more, due to ip_select_ident() and skb->ip_summed. But we can just move these operations out of iptunnel_xmit(), so that tunnels can use it again. This by the way fixes a bug in vxlan (missing nf_reset()) for net-next. Cc: Pravin B Shelar <pshelar@nicira.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
4f3ed9209f7f75ff0ee21bc5052d76542dd75b5f |
|
08-Mar-2013 |
Pravin B Shelar <pshelar@nicira.com> |
ipip: capture inner headers during encapsulation Allow IPIP to make use of tx-checksum offloading. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
8344bfc6008d1c7b8b541bb25de7dfacb2188b95 |
|
08-Mar-2013 |
Pravin B Shelar <pshelar@nicira.com> |
ipip: Use tunnel_ip_select_ident() for tunnel IP-Identification. tunnel_ip_select_ident() is more efficient when generating ip-header id given inner packet is of ipv4 type. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
cef401de7be8c4e155c6746bfccf721a4fa5fab9 |
|
25-Jan-2013 |
Eric Dumazet <edumazet@google.com> |
net: fix possible wrong checksum generation Pravin Shelar mentioned that GSO could potentially generate wrong TX checksum if skb has fragments that are overwritten by the user between the checksum computation and transmit. He suggested to linearize skbs but this extra copy can be avoided for normal tcp skbs cooked by tcp_sendmsg(). This patch introduces a new SKB_GSO_SHARED_FRAG flag, set in skb_shinfo(skb)->gso_type if at least one frag can be modified by the user. Typical sources of such possible overwrites are {vm}splice(), sendfile(), and macvtap/tun/virtio_net drivers. Tested: $ netperf -H 7.7.8.84 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 3959.52 $ netperf -H 7.7.8.84 -t TCP_SENDFILE TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 3216.80 Performance of the SENDFILE is impacted by the extra allocation and copy, and because we use order-0 pages, while the TCP_STREAM uses bigger pages. Reported-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
52e804c6dfaa5df1e4b0e290357b82ad4e4cda2c |
|
16-Nov-2012 |
Eric W. Biederman <ebiederm@xmission.com> |
net: Allow userns root to control ipv4 Allow an unpriviled user who has created a user namespace, and then created a network namespace to effectively use the new network namespace, by reducing capable(CAP_NET_ADMIN) and capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns, CAP_NET_ADMIN), or capable(net->user_ns, CAP_NET_RAW) calls. Settings that merely control a single network device are allowed. Either the network device is a logical network device where restrictions make no difference or the network device is hardware NIC that has been explicity moved from the initial network namespace. In general policy and network stack state changes are allowed while resource control is left unchanged. Allow creating raw sockets. Allow the SIOCSARP ioctl to control the arp cache. Allow the SIOCSIFFLAG ioctl to allow setting network device flags. Allow the SIOCSIFADDR ioctl to allow setting a netdevice ipv4 address. Allow the SIOCSIFBRDADDR ioctl to allow setting a netdevice ipv4 broadcast address. Allow the SIOCSIFDSTADDR ioctl to allow setting a netdevice ipv4 destination address. Allow the SIOCSIFNETMASK ioctl to allow setting a netdevice ipv4 netmask. Allow the SIOCADDRT and SIOCDELRT ioctls to allow adding and deleting ipv4 routes. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting gre tunnels. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting ipip tunnels. Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL and SIOCDELTUNNEL ioctls for adding, changing and deleting ipsec virtual tunnel interfaces. Allow setting the MRT_INIT, MRT_DONE, MRT_ADD_VIF, MRT_DEL_VIF, MRT_ADD_MFC, MRT_DEL_MFC, MRT_ASSERT, MRT_PIM, MRT_TABLE socket options on multicast routing sockets. Allow setting and receiving IPOPT_CIPSO, IP_OPT_SEC, IP_OPT_SID and arbitrary ip options. Allow setting IP_SEC_POLICY/IP_XFRM_POLICY ipv4 socket option. Allow setting the IP_TRANSPARENT ipv4 socket option. Allow setting the TCP_REPAIR socket option. Allow setting the TCP_CONGESTION socket option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
fea379b2db31f5c44f2a24645de5ea29721f22aa |
|
15-Nov-2012 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ipip: fix sparse warnings in ipip_netlink_parms() This change fixes two sparse warnings triggered by casting the ip addresses from netlink messages in an u32 instead of be32. This change corrects that in order to resolve the sparse warnings. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
be42da0e1012bf67d8f6899b7d9162e35527da4b |
|
14-Nov-2012 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ipip: add support of link creation via rtnl This patch add the support of 'ip link .. type ipip'. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
befe2aa1b2c7b9b7e20e97906f99b58475608867 |
|
14-Nov-2012 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ipip/rtnl: add IFLA_IPTUN_PMTUDISC on dump This parameter was missing in the dump. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
c38cc4b599c2fe5815af4b0c6acac48e904977b4 |
|
14-Nov-2012 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ipip: always notify change when params are updated netdev_state_change() was called only when end points or link was updated. Now that all parameters are advertised via netlink, we must advertise any change. This patch also prepares the support of ipip tunnels management via rtnl. The code which update tunnels will be put in a new function. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
e086cadc08e259150b2ab8f7f4a16dbf9e3c2f22 |
|
11-Nov-2012 |
Amerigo Wang <amwang@redhat.com> |
net: unify for_each_ip_tunnel_rcu() The defitions of for_each_ip_tunnel_rcu() are same, so unify it. Also, don't hide the parameter 't'. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
aa0010f880ab542da3ad0e72992f2dc518ac68a0 |
|
11-Nov-2012 |
Amerigo Wang <amwang@redhat.com> |
net: convert __IPTUNNEL_XMIT() to an inline function __IPTUNNEL_XMIT() is an ugly macro, convert it to a static inline function, so make it more readable. IPTUNNEL_XMIT() is unused, just remove it. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0974658da47cb399b76794057823bf3cd22acf37 |
|
09-Nov-2012 |
Nicolas Dichtel <nicolas.dichtel@6wind.com> |
ipip: advertise tunnel param via rtnl It is usefull for daemons that monitor link event to have the full parameters of these interfaces when a rtnl message is sent. It allows also to dump them via rtnetlink. It is based on what is done for GRE tunnels. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
c3b89fbba339aae533e380839fa078787635356e |
|
08-Nov-2012 |
Eric Dumazet <edumazet@google.com> |
ipip: add GSO support In commit 6b78f16e4b (gre: add GSO support) we added GSO support to GRE tunnels. This patch does the same for IPIP tunnels. Performance of single TCP flow over an IPIP tunnel is increased by 40% Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
eccc1bb8d4b4cf68d3c9becb083fa94ada7d495c |
|
25-Sep-2012 |
stephen hemminger <shemminger@vyatta.com> |
tunnel: drop packet if ECN present with not-ECT Linux tunnels were written before RFC6040 and therefore never implemented the corner case of ECN getting set in the outer header and the inner header not being ready for it. Section 4.2. Default Tunnel Egress Behaviour. o If the inner ECN field is Not-ECT, the decapsulator MUST NOT propagate any other ECN codepoint onwards. This is because the inner Not-ECT marking is set by transports that rely on dropped packets as an indication of congestion and would not understand or respond to any other ECN codepoint [RFC4774]. Specifically: * If the inner ECN field is Not-ECT and the outer ECN field is CE, the decapsulator MUST drop the packet. * If the inner ECN field is Not-ECT and the outer ECN field is Not-ECT, ECT(0), or ECT(1), the decapsulator MUST forward the outgoing packet with the ECN field cleared to Not-ECT. This patch moves the ECN decap logic out of the individual tunnels into a common place. It also adds logging to allow detecting broken systems that set ECN bits incorrectly when tunneling (or an intermediate router might be changing the header). Overloads rx_frame_error to keep track of ECN related error. Thanks to Chris Wright who caught this while reviewing the new VXLAN tunnel. This code was tested by injecting faulty logic in other end GRE to send incorrectly encapsulated packets. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
b0558ef24a792906914fcad277f3befe2420e618 |
|
24-Sep-2012 |
stephen hemminger <shemminger@vyatta.com> |
xfrm: remove extranous rcu_read_lock The handlers for xfrm_tunnel are always invoked with rcu read lock already. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
f8126f1d5136be1ca1a3536d43ad7a710b5620f8 |
|
13-Jul-2012 |
David S. Miller <davem@davemloft.net> |
ipv4: Adjust semantics of rt->rt_gateway. In order to allow prefixed routes, we have to adjust how rt_gateway is set and interpreted. The new interpretation is: 1) rt_gateway == 0, destination is on-link, nexthop is iph->daddr 2) rt_gateway != 0, destination requires a nexthop gateway Abstract the fetching of the proper nexthop value using a new inline helper, rt_nexthop(), as suggested by Joe Perches. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
|
6700c2709c08d74ae2c3c29b84a30da012dbc7f1 |
|
17-Jul-2012 |
David S. Miller <davem@davemloft.net> |
net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by: David S. Miller <davem@davemloft.net>
|
55be7a9c6074f749d617a7fc1914c9a23505438c |
|
12-Jul-2012 |
David S. Miller <davem@davemloft.net> |
ipv4: Add redirect support to all protocol icmp error handlers. Signed-off-by: David S. Miller <davem@davemloft.net>
|
36393395536064e483b73d173f6afc103eadfbc4 |
|
15-Jun-2012 |
David S. Miller <davem@davemloft.net> |
ipv4: Handle PMTU in all ICMP error handlers. With ip_rt_frag_needed() removed, we have to explicitly update PMTU information in every ICMP error handler. Create two helper functions to facilitate this. 1) ipv4_sk_update_pmtu() This updates the PMTU when we have a socket context to work with. 2) ipv4_update_pmtu() Raw version, used when no socket context is available. For this interface, we essentially just pass in explicit arguments for the flow identity information we would have extracted from the socket. And you'll notice that ipv4_sk_update_pmtu() is simply implemented in terms of ipv4_update_pmtu() Note that __ip_route_output_key() is used, rather than something like ip_route_output_flow() or ip_route_output_key(). This is because we absolutely do not want to end up with a route that does IPSEC encapsulation and the like. Instead, we only want the route that would get us to the node described by the outermost IP header. Reported-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5e73ea1a31c3612aa6dfe44f864ca5b7b6a4cff9 |
|
15-Apr-2012 |
Daniel Baluta <dbaluta@ixiacom.com> |
ipv4: fix checkpatch errors Fix checkpatch errors of the following type: * ERROR: "foo * bar" should be "foo *bar" * ERROR: "(foo*)" should be "(foo *)" Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
87b6d218f3adb00e6b58c7f96f8b5a74ff91abb4 |
|
12-Apr-2012 |
stephen hemminger <shemminger@vyatta.com> |
tunnel: implement 64 bits statistics Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels to use 64 bit statistics. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
058bd4d2a4ff0aaa4a5381c67e776729d840c785 |
|
11-Mar-2012 |
Joe Perches <joe@perches.com> |
net: Convert printks to pr_<level> Use a more current kernel messaging style. Convert a printk block to print_hex_dump. Coalesce formats, align arguments. Use %s, __func__ instead of embedding function names. Some messages that were prefixed with <foo>_close are now prefixed with <foo>_fini. Some ah4 and esp messages are now not prefixed with "ip ". The intent of this patch is to later add something like #define pr_fmt(fmt) "IPv4: " fmt. to standardize the output messages. Text size is trivially reduced. (x86-32 allyesconfig) $ size net/ipv4/built-in.o* text data bss dec hex filename 887888 31558 249696 1169142 11d6f6 net/ipv4/built-in.o.new 887934 31558 249800 1169292 11d78c net/ipv4/built-in.o.old Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
658c8d964eb3cdb7e4230a59ba09c75a3359ee4a |
|
25-Jan-2012 |
David S. Miller <davem@davemloft.net> |
ipip: Fix bug added to ipip_tunnel_xmit(). We can remove the rt_gateway == 0 check but we shouldn't remove the 'dst' initialization too. Noticed by Eric Dumazet. Signed-off-by: David S. Miller <davem@davemloft.net>
|
496053f488fc2d859e41574f3421993826d2d0eb |
|
12-Jan-2012 |
David S. Miller <davem@davemloft.net> |
ipv4: Remove bogus checks of rt_gateway being zero. It can never actually happen. rt_gateway is either the fully resolved flow lookup key's destination address, or the non-zero FIB entry gateway address. Signed-off-by: David S. Miller <davem@davemloft.net>
|
cf778b00e96df6d64f8e21b8395d1f8a859ecdc7 |
|
12-Jan-2012 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: reintroduce missing rcu_assign_pointer() calls commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER) did a lot of incorrect changes, since it did a complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x, y). We miss needed barriers, even on x86, when y is not NULL. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Stephen Hemminger <shemminger@vyatta.com> CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
72b36015ba43a3cca5303f5534d2c3e1899eae29 |
|
08-Dec-2011 |
Ted Feng <artisdom@gmail.com> |
ipip, sit: copy parms.name after register_netdevice Same fix as 731abb9cb2 for ipip and sit tunnel. Commit 1c5cae815d removed an explicit call to dev_alloc_name in ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice will now create a valid name, however the tunnel keeps a copy of the name in the private parms structure. Fix this by copying the name back after register_netdevice has successfully returned. This shows up if you do a simple tunnel add, followed by a tunnel show: $ sudo ip tunnel add mode ipip remote 10.2.20.211 $ ip tunnel tunl0: ip/ip remote any local any ttl inherit nopmtudisc tunl%d: ip/ip remote 10.2.20.211 local any ttl inherit $ sudo ip tunnel add mode sit remote 10.2.20.212 $ ip tunnel sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16 sit%d: ioctl 89f8 failed: No such device sit%d: ipv6/ip remote 10.2.20.212 local any ttl inherit Cc: stable@vger.kernel.org Signed-off-by: Ted Feng <artisdom@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
8ce120f11898c921329a5f618d01dcc1e8e69cac |
|
05-Nov-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: better pcpu data alignment Tunnels can force an alignment of their percpu data to reduce number of cache lines used in fast path, or read in .ndo_get_stats() percpu_alloc() is a very fine grained allocator, so any small hole will be used anyway. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
a9b3cd7f323b2e57593e7215362a7b02fc933e3a |
|
01-Aug-2011 |
Stephen Hemminger <shemminger@vyatta.com> |
rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER When assigning a NULL value to an RCU protected pointer, no barrier is needed. The rcu_assign_pointer, used to handle that but will soon change to not handle the special case. Convert all rcu_assign_pointer of NULL value. //smpl @@ expression P; @@ - rcu_assign_pointer(P, NULL) + RCU_INIT_POINTER(P, NULL) // </smpl> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
1c5cae815d19ffe02bdfda1260949ef2b1806171 |
|
30-Apr-2011 |
Jiri Pirko <jpirko@redhat.com> |
net: call dev_alloc_name from register_netdevice Force dev_alloc_name() to be called from register_netdevice() by dev_get_valid_name(). That allows to remove multiple explicit dev_alloc_name() calls. The possibility to call dev_alloc_name in advance remains. This also fixes veth creation regresion caused by 84c49d8c3e4abefb0a41a77b25aa37ebe8d6b743 Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
69458cb194e82972347a004054e0baed719ed008 |
|
04-May-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Use flowi4->{daddr,saddr} in ipip_tunnel_xmit(). Instead of rt->rt_{dst,src} Signed-off-by: David S. Miller <davem@davemloft.net>
|
31e4543db29fb85496a122b965d6482c8d1a2bfe |
|
04-May-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Make caller provide on-stack flow key to ip_route_output_ports(). Signed-off-by: David S. Miller <davem@davemloft.net>
|
b71d1d426d263b0b6cb5760322efebbfc89d4463 |
|
22-Apr-2011 |
Eric Dumazet <eric.dumazet@gmail.com> |
inet: constify ip headers and in6_addr Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
78fbfd8a653ca972afe479517a40661bfff6d8c3 |
|
12-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Create and use route lookup helpers. The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>
|
8909c9ad8ff03611c9c96c9a92656213e4bb495b |
|
01-Mar-2011 |
Vasiliy Kulikov <segoon@openwall.com> |
net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod | grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod | grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203 Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>
|
b23dd4fe42b455af5c6e20966b7d6959fa8352ea |
|
02-Mar-2011 |
David S. Miller <davem@davemloft.net> |
ipv4: Make output route lookup return rtable directly. Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
|
8afe7c8acd33bc52c56546e73e46e9d546269e2c |
|
29-Nov-2010 |
stephen hemminger <shemminger@vyatta.com> |
ipip: add module alias for tunl0 tunnel device If ipip is built as a module the 'ip tunnel add' command would fail because the ipip module was not being autoloaded. Adding an alias for the tunl0 device name cause dev_load() to autoload it when needed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5811662b15db018c740c57d037523683fd3e6123 |
|
12-Nov-2010 |
Changli Gao <xiaosuo@gmail.com> |
net: use the macros defined for the members of flowi Use the macros defined for the members of flowi to clean the code up. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
74b0b85b88aaa952023762e0280799aaae849841 |
|
27-Oct-2010 |
Pavel Emelyanov <xemul@parallels.com> |
tunnels: Fix tunnels change rcu protection After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug was introduced into the SIOCCHGTUNNEL code. The tunnel is first unlinked, then addresses change, then it is linked back probably into another bucket. But while changing the parms, the hash table is unlocked to readers and they can lookup the improper tunnel. Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b (gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and 94767632 (ip6tnl: get rid of ip6_tnl_lock). The quick fix is to wait for quiescent state to pass after unlinking, but if it is inappropriate I can invent something better, just let me know. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
caf586e5f23cebb2a68cbaf288d59dbbf2d74052 |
|
30-Sep-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: add a core netdev->rx_dropped counter In various situations, a device provides a packet to our stack and we drop it before it enters protocol stack : - softnet backlog full (accounted in /proc/net/softnet_stat) - bad vlan tag (not accounted) - unknown/unregistered protocol (not accounted) We can handle a per-device counter of such dropped frames at core level, and automatically adds it to the device provided stats (rx_dropped), so that standard tools can be used (ifconfig, ip link, cat /proc/net/dev) This is a generalization of commit 8990f468a (net: rx_dropped accounting), thus reverting it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
153f0943382e9ae0bff7caa110a1a4656088d0d4 |
|
28-Sep-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
ipip: enable lockless xmits IPIP tunnels can benefit from lockless xmits, using NETIF_F_LLTX Bench on a 16 cpus machine (dual E5540 cpus), 16 threads sending 10000000 UDP frames via one ipip tunnel (size:200 bytes per frame) Before patch : real 2m53.321s user 0m10.277s sys 46m0.597s After patch: real 0m32.063s user 0m9.237s sys 8m16.255s Last problem to solve is the contention on dst : 16118.00 28.3% __ip_route_output_key vmlinux 6135.00 10.8% dst_release vmlinux 3220.00 5.6% ip_finish_output vmlinux 2149.00 3.8% ip_route_output_flow vmlinux 1575.00 2.8% ip_append_data vmlinux 1481.00 2.6% ip_push_pending_frames vmlinux 1349.00 2.4% __xfrm_lookup vmlinux 1216.00 2.1% csum_partial_copy_generic vmlinux 1208.00 2.1% udp_sendmsg vmlinux Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
fada5636fe41fd1423fe4e6af7b9f609378acde6 |
|
28-Sep-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
ipip: fix percpu stats accounting commit 3c97af99a5aa1 (ipip: percpu stats accounting) forgot the fallback tunnel case (tunl0), and can crash pretty fast. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
3c97af99a5aa17feaebb4eb0f85f51ab6c055797 |
|
27-Sep-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
ipip: percpu stats accounting Maintain per_cpu tx_bytes, tx_packets, rx_bytes, rx_packets. Other seldom used fields are kept in netdev->stats structure, possibly unsafe. This is a preliminary work to support lockless transmit path, and correct RX stats, that are already unsafe. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
8990f468ae9010ab0af4be8f51bf7ab833a67202 |
|
20-Sep-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: rx_dropped accounting Under load, netif_rx() can drop incoming packets but administrators dont have a chance to spot which device needs some tuning (RPS activation for example) This patch adds rx_dropped accounting in vlans and tunnels. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
b7285b7912776a4492744949c747c88d539006fa |
|
15-Sep-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
ipip: get rid of ipip_lock As RTNL is held while doing tunnels inserts and deletes, we can remove ipip_lock spinlock. My initial RCU conversion was conservative and converted the rwlock to spinlock, with no RTNL requirement. Use appropriate rcu annotations and modern lockdep checks as well. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
6dcd814bd08bc7989f7f3eac9bbe8b20aec0182a |
|
30-Aug-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: struct xfrm_tunnel in read_mostly section tunnel4_handlers chain being scanned for each incoming packet, make sure it doesnt share an often dirtied cache line. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
d8d1f30b95a635dbd610dcc5eb641aca8f4768cf |
|
11-Jun-2010 |
Changli Gao <xiaosuo@gmail.com> |
net-next: remove useless union keyword remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
d19d56ddc88e7895429ef118db9c83c7bbe3ce6a |
|
18-May-2010 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: Introduce skb_tunnel_rx() helper skb rxhash should be cleared when a skb is handled by a tunnel before being delivered again, so that correct packet steering can take place. There are other cleanups and accounting that we can factorize in a new helper, skb_tunnel_rx() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5a0e3ad6af8660be21ca98a971cd00f331318c05 |
|
24-Mar-2010 |
Tejun Heo <tj@kernel.org> |
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
d5aa407f59f5b83d2c50ec88f5bf56d40f1f8978 |
|
16-Feb-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
tunnels: fix netns vs proto registration ordering Same stuff as in ip_gre patch: receive hook can be called before netns setup is done, oopsing in net_generic(). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
2c8c1e7297e19bdef3c178c3ea41d898a7716e3e |
|
17-Jan-2010 |
Alexey Dobriyan <adobriyan@gmail.com> |
net: spread __net_init, __net_exit __net_init/__net_exit are apparently not going away, so use them to full extent. In some cases __net_init was removed, because it was called from __net_exit code. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
86de8a631e90a96d136ffd877719471a0b8d8b6d |
|
29-Nov-2009 |
Eric W. Biederman <ebiederm@xmission.com> |
net: Simplify ipip pernet operations. Take advantage of the new pernet automatic storage management, and stop using compatibility network namespace functions. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
f99189b186f3922ede4fa33c02f6edc735b8c981 |
|
17-Nov-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
netns: net_identifiers should be read_mostly Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
23ca0c989e46924393f1d54bec84801d035dd28e |
|
06-Nov-2009 |
Herbert Xu <herbert@gondor.apana.org.au> |
ipip: Fix handling of DF packets when pmtudisc is OFF RFC 2003 requires the outer header to have DF set if DF is set on the inner header, even when PMTU discovery is off for the tunnel. Our implementation does exactly that. For this to work properly the IPIP gateway also needs to engate in PMTU when the inner DF bit is set. As otherwise the original host would not be able to carry out its PMTU successfully since part of the path is only visible to the gateway. Unfortunately when the tunnel PMTU discovery setting is off, we do not collect the necessary soft state, resulting in blackholes when the original host tries to perform PMTU discovery. This problem is not reproducible on the IPIP gateway itself as the inner packet usually has skb->local_df set. This is not correctly cleared (an unrelated bug) when the packet passes through the tunnel, which allows fragmentation to occur. For hosts behind the IPIP gateway it is readily visible with a simple ping. This patch fixes the problem by performing PMTU discovery for all packets with the inner DF bit set, regardless of the PMTU discovery setting on the tunnel itself. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0694c4c016df34c718b9f9ef6ba5aca2e178163a |
|
27-Oct-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
ipip: Optimize multiple unregistration Speedup module unloading by factorizing synchronize_rcu() calls Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
8f95dd63a2ab6fe7243c4f0bd2c3266e3a5525ab |
|
23-Oct-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
ipip: convert hash tables locking to RCU IPIP tunnels use one rwlock to protect their hash tables. This locking scheme can be converted to RCU for free, since netdevice already must wait for a RCU grace period at dismantle time. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0bfbedb14a8a96c529341bec88991a92b41fac72 |
|
05-Oct-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
tunnels: Optimize tx path We currently dirty a cache line to update tunnel device stats (tx_packets/tx_bytes). We better use the txq->tx_bytes/tx_packets counters that already are present in cpu cache, in the cache line shared with txq->_xmit_lock This patch extends IPTUNNEL_XMIT() macro to use txq pointer provided by the caller. Also &tunnel->dev->stats can be replaced by &dev->stats Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
a43912ab1925788765208da5cd664b6f8e011d08 |
|
23-Sep-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
tunnel: eliminate recursion field It seems recursion field from "struct ip_tunnel" is not anymore needed. recursion prevention is done at the upper level (in dev_queue_xmit()), since we use HARD_TX_LOCK protection for tunnels. This avoids a cache line ping pong on "struct ip_tunnel" : This structure should be now mostly read on xmit and receive paths. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
6fef4c0c8eeff7de13007a5f56113475444a253d |
|
31-Aug-2009 |
Stephen Hemminger <shemminger@vyatta.com> |
netdev: convert pseudo-devices to netdev_tx_t Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
6ed106549d17474ca17a16057f4c0ed4eba5a7ca |
|
23-Jun-2009 |
Patrick McHardy <kaber@trash.net> |
net: use NETDEV_TX_OK instead of 0 in ndo_start_xmit() functions This patch is the result of an automatic spatch transformation to convert all ndo_start_xmit() return values of 0 to NETDEV_TX_OK. Some occurences are missed by the automatic conversion, those will be handled in a seperate patch. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
adf30907d63893e4208dfe3f5c88ae12bc2f25d5 |
|
02-Jun-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry *skb_dst(const struct sk_buff *skb) void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
511c3f92ad5b6d9f8f6464be1b4f85f0422be91a |
|
02-Jun-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: skb->rtable accessor Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb Delete skb->rtable field Setting rtable is not allowed, just set dst instead as rtable is an alias. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
28e72216d7e7af7050f171a87c1eecba93d01ea6 |
|
28-May-2009 |
Eric Dumazet <eric.dumazet@gmail.com> |
net: unset IFF_XMIT_DST_RELEASE in ipip_tunnel_setup() ipip_tunnel_xmit() might need skb->dst, so tell dev_hard_start_xmit() to no release it. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
26d94b46d09c97adb3c78c744c195e74ede699b2 |
|
25-Feb-2009 |
Wei Yongjun <yjwei@cn.fujitsu.com> |
ipip: used time_before for comparing jiffies The functions time_before is more robust for comparing jiffies against other values. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5747a1aacde268017784a6a56df06c3b40194381 |
|
22-Feb-2009 |
Stephen Hemminger <shemminger@vyatta.com> |
ip: ipip compile warning Get rid of compile warning about non-const format Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
be77e5930725c3e77bcc0fb1def28e016080d0a1 |
|
24-Nov-2008 |
Alexey Dobriyan <adobriyan@gmail.com> |
net: fix tunnels in netns after ndo_ changes dev_net_set() should be the very first thing after alloc_netdev(). "ndo_" changes turned simple assignment (which is OK to do before netns assignment) into quite non-trivial operation (which is not OK, init_net was used). This leads to incomplete initialisation of tunnel device in netns. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f *pde = 00000000 Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC last sysfs file: /sys/class/net/lo/operstate Pid: 10, comm: netns Not tainted (2.6.28-rc6 #1) EIP: 0060:[<c02efdb5>] EFLAGS: 00010246 CPU: 0 EIP is at ip6_tnl_exit_net+0x37/0x4f EAX: 00000000 EBX: 00000020 ECX: 00000000 EDX: 00000003 ESI: c5caef30 EDI: c782bbe8 EBP: c7909f50 ESP: c7909f48 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process netns (pid: 10, ti=c7908000 task=c7905780 task.ti=c7908000) Stack: c03e75e0 c7390bc8 c7909f60 c0245448 c7390bd8 c7390bf0 c7909fa8 c012577a 00000000 00000002 00000000 c0125736 c782bbe8 c7909f90 c0308fe3 c782bc04 c7390bd4 c0245406 c084b718 c04f0770 c03ad785 c782bbe8 c782bc04 c782bc0c Call Trace: [<c0245448>] ? cleanup_net+0x42/0x82 [<c012577a>] ? run_workqueue+0xd6/0x1ae [<c0125736>] ? run_workqueue+0x92/0x1ae [<c0308fe3>] ? schedule+0x275/0x285 [<c0245406>] ? cleanup_net+0x0/0x82 [<c0125ae1>] ? worker_thread+0x81/0x8d [<c0128344>] ? autoremove_wake_function+0x0/0x33 [<c0125a60>] ? worker_thread+0x0/0x8d [<c012815c>] ? kthread+0x39/0x5e [<c0128123>] ? kthread+0x0/0x5e [<c0103b9f>] ? kernel_thread_helper+0x7/0x10 Code: db e8 05 ff ff ff 89 c6 e8 dc 04 f6 ff eb 08 8b 40 04 e8 38 89 f5 ff 8b 44 9e 04 85 c0 75 f0 43 83 fb 20 75 f2 8b 86 84 00 00 00 <8b> 40 04 e8 1c 89 f5 ff e8 98 04 f6 ff 89 f0 e8 f8 63 e6 ff 5b EIP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f SS:ESP 0068:c7909f48 ---[ end trace 6c2f2328fccd3e0c ]--- Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
23a12b14715e2dcd34dc8002927263ad3437344c |
|
21-Nov-2008 |
Stephen Hemminger <shemminger@vyatta.com> |
ipip: convert to net_device_ops Convert to network device ops. Needed to change to directly call the init routine since two sides share same ops. In the process found by inspection a device ref count leak if register_netdevice failed. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5a5f3a8db9d70c90e9d55b46e02b2d8deb1c2c2e |
|
03-Nov-2008 |
Jianjun Kong <jianjun@zeuux.org> |
net: clean up net/ipv4/ipip.c raw.c tcp.c tcp_minisocks.c tcp_yeah.c xfrm4_policy.c Signed-off-by: Jianjun Kong <jianjun@zeuux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
113aa838ec3a235d883f8357d31d90e16c47fc89 |
|
14-Oct-2008 |
Alan Cox <alan@redhat.com> |
net: Rationalise email address: Network Specific Parts Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0b040829952d84bf2a62526f0e24b624e0699447 |
|
11-Jun-2008 |
Adrian Bunk <bunk@kernel.org> |
net: remove CVS keywords This patch removes CVS keywords that weren't updated for a long time from comments. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
071f92d05967a0c8422f1c8587ce0b4d90a8b447 |
|
22-May-2008 |
Rami Rosen <ramirose@gmail.com> |
net: The world is not perfect patch. Unless there will be any objection here, I suggest consider the following patch which simply removes the code for the -DI_WISH_WORLD_WERE_PERFECT in the three methods which use it. The compilation errors we get when using -DI_WISH_WORLD_WERE_PERFECT show that this code was not built and not used for really a long time. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
50f59cea075875d84018a5fc62cf2f5e6173a919 |
|
21-May-2008 |
Pavel Emelyanov <xemul@openvz.org> |
ipip: Use on-device stats instead of private ones. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0a826406d4adf0c4b7cd47f116cb8e8ef65b92a3 |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Allow to create IPIP tunnels in net namespaces. Set the proper net before calling register_netdev and disable the tunnel device netns changing. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
b99f0152e5f96dde31d2b9060b4f1029abc11078 |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Use proper net in (mostly) routing calls. There are some ip_route_output_key() calls in there that require a proper net so give one to them. Besides - give a proper net to a single __get_dev_by_index call in ipip_tunnel_bind_dev(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
44d3c299dcfee094f10e0c686ad6588fd36d4f8f |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Make tunnels hashes per net. Either net or ipip_net already exists in all the required places, so just use one. Besides, tune net_init and net_exit calls to respectively initialize the hashes and destroy devices. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
cec3ffae1a019f02cd6b5fa291f279c8e9f86157 |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Use proper net in hash-lookup functions. This is the part#2 of the previous patch - get the proper net for these functions. I make it in a separate patch, so that this change does not get lost in a large previous patch. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
b9fae5c9138086d27715a8a0f29d5b55239db35c |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Add net/ipip_net argument to some functions. The hashes of tunnels will be per-net too, so prepare all the functions that uses them for this change by adding an argument. Use init_net temporarily in places, where the net does not exist explicitly yet. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
b9855c54dadc0768dcc3804df1972488783d2267 |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Make the fallback tunnel device per-net. Create on in ipip_init_net(), use it all over the code (the proper place to get the net from already exists) and destroy in ipip_net_exit(). Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
10dc4c7bb70533d16184aaaa69e137a7d2b9a3a8 |
|
16-Apr-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IPIP]: Introduce empty ipip_net structure and net init/exit ops. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
ee6b967301b4aa5d4a4b61e2f682f086266db9fb |
|
06-Mar-2008 |
Eric Dumazet <dada1@cosmosbay.com> |
[IPV4]: Add 'rtable' field in struct sk_buff to alias 'dst' and avoid casts (Anonymous) unions can help us to avoid ugly casts. A common cast it the (struct rtable *)skb->dst one. Defining an union like : union { struct dst_entry *dst; struct rtable *rtable; }; permits to use skb->rtable in place. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
b37d428b24ad38034f56b614de05686ba151b614 |
|
27-Feb-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[INET]: Don't create tunnels with '%' in name. Four tunnel drivers (ip_gre, ipip, ip6_tunnel and sit) can receive a pre-defined name for a device from the userspace. Since these drivers call the register_netdevice() (rtnl_lock, is held), which does _not_ generate the device's name, this name may contain a '%' character. Not sure how bad is this to have a device with a '%' in its name, but all the other places either use the register_netdev(), which call the dev_alloc_name(), or explicitly call the dev_alloc_name() before registering, i.e. do not allow for such names. This had to be prior to the commit 34cc7b, but I forgot to number the patches and this one got lost, sorry. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
34cc7ba6398203aab4056917fa1e2aa5988487aa |
|
24-Feb-2008 |
Pavel Emelyanov <xemul@openvz.org> |
[IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly. Use the added dev_alloc_name() call to create tunnel device name, rather than iterate in a hand-made loop with an artificial limit. Thanks Patrick for noticing this. [ The way this works is, when the device is actually registered, the generic code noticed the '%' in the name and invokes dev_alloc_name() to fully resolve the name. -DaveM ] Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
f206351a50ea86250fabea96b9af8d8f8fc02603 |
|
23-Jan-2008 |
Denis V. Lunev <den@openvz.org> |
[NETNS]: Add namespace parameter to ip_route_output_key. Needed to propagate it down to the ip_route_output_flow. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5533995b62d02dbbf930f2e59221c2d5ea05aab7 |
|
12-Dec-2007 |
Michal Schmidt <mschmidt@redhat.com> |
[IPIP]: Allow rebinding the tunnel to another interface Once created, an IP tunnel can't be bound to another device. (reported as https://bugzilla.redhat.com/show_bug.cgi?id=419671) To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode ipip remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: ip/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. If the change is acceptable, I'll do the same for GRE and SIT tunnels. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
c2636b4d9e8ab8d16b9e2bf0f0744bb8418d4026 |
|
24-Oct-2007 |
Chuck Lever <chuck.lever@oracle.com> |
[NET]: Treat the sign of the result of skb_headroom() consistently In some places, the result of skb_headroom() is compared to an unsigned integer, and in others, the result is compared to a signed integer. Make the comparisons consistent and correct. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
10d024c1b2fd58af8362670d7d6e5ae52fc33353 |
|
17-Sep-2007 |
Ralf Baechle <ralf@linux-mips.org> |
[NET]: Nuke SET_MODULE_OWNER macro. It's been a useless no-op for long enough in 2.6 so I figured it's time to remove it. The number of people that could object because they're maintaining unified 2.4 and 2.6 drivers is probably rather small. [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
881d966b48b035ab3f3aeaae0f3d3f9b584f45b2 |
|
17-Sep-2007 |
Eric W. Biederman <ebiederm@xmission.com> |
[NET]: Make the device list and device lookups per namespace. This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
cfbba49d80be6cf8d3872b66fc5421f119843b36 |
|
10-Jul-2007 |
Patrick McHardy <kaber@trash.net> |
[NET]: Avoid copying writable clones in tunnel drivers Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
87d1a164df0b5e297cda698724ea7984d8392b06 |
|
24-Apr-2007 |
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> |
[IPV4] IPIP: Unify code path to get hash array index. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
|
b0e380b1d8a8e0aca215df97702f99815f05c094 |
|
11-Apr-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[SK_BUFF]: unions of just one member don't get anything done, kill them Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and skb->mac to skb->mac_header, to match the names of the associated helpers (skb[_[re]set]_{transport,network,mac}_header). Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
88c7664f13bd1a36acb8566b93892a4c58759ac6 |
|
13-Mar-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[SK_BUFF]: Introduce icmp_hdr(), remove skb->h.icmph Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
eddc9ec53be2ecdbf4efe0efd4a83052594f0ac0 |
|
21-Apr-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[SK_BUFF]: Introduce ip_hdr(), remove skb->nh.iph Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
e2d1bca7e6134671bcb19810d004a252aa6a644d |
|
11-Apr-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[SK_BUFF]: Use skb_reset_network_header in skb_push cases skb_push updates and returns skb->data, so we can just call skb_reset_network_header after the call to skb_push. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
c1d2bbe1cd6c7bbdc6d532cefebb66c7efb789ce |
|
11-Apr-2007 |
Arnaldo Carvalho de Melo <acme@redhat.com> |
[SK_BUFF]: Introduce skb_reset_network_header(skb) For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in 64bit land while possibly keeping it as a pointer on 32bit. This one touches just the most simple case, next will handle the slightly more "complex" cases. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
cd354f1ae75e6466a7e31b727faede57a1f89ca5 |
|
14-Feb-2007 |
Tim Schmielau <tim@physik3.uni-rostock.de> |
[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
c0d56408e3ff52d635441e0f08d12164a63728cf |
|
13-Feb-2007 |
Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> |
[IPSEC]: Changing API of xfrm4_tunnel_register. This patch changes xfrm4_tunnel register and deregister interface to prepare for solving the conflict of device tunnels with inter address family IPsec tunnel. Signed-off-by: Kazunori MIYAZAWA <miyazawa@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
e905a9edab7f4f14f9213b52234e4a346c690911 |
|
09-Feb-2007 |
YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> |
[NET] IPV4: Fix whitespace errors. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
22f8cde5bc336fd19603bb8c4572b33d14f14f87 |
|
07-Feb-2007 |
Stephen Hemminger <shemminger@osdl.org> |
[NET]: unregister_netdevice as void There was no real useful information from the unregister_netdevice() return code, the only error occurred in a situation that was a driver bug. So change it to a void function. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
|
d5a0a1e3109339090769e40fdaa62482fcf2a717 |
|
08-Nov-2006 |
Al Viro <viro@zeniv.linux.org.uk> |
[IPV4]: encapsulation annotations Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
|
c55e2f4997a104d66b59bdf1aa8ab125d09ae00a |
|
19-Sep-2006 |
Al Viro <viro@zeniv.linux.org.uk> |
[IPV4]: ipip and ip_gre encapsulation bugs Handling of ipip and ip_gre ICMP error relaying is b0rken; it accesses 8bit field + 3 reserved octets as host-endian 32bit, does comparison, subtraction and stuffs the result back. That breaks on big-endian. Fixed, made endian-clean. [ Note that this effected code is permanently commented out with and ifdef, so this error couldn't actually cause problems for anyone. -DaveM ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
|
5d9c5a32920c5c0e6716b0f6ed16157783dc56a4 |
|
21-Jul-2006 |
Herbert Xu <herbert@gondor.apana.org.au> |
[IPV4]: Get rid of redundant IPCB->opts initialisation Now that we always zero the IPCB->opts in ip_rcv, it is no longer necessary to do so before calling netif_rx for tunneled packets. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
6ab3d5624e172c553004ecc862bfeac16d9d68b7 |
|
30-Jun-2006 |
Jörn Engel <joern@wohnheim.fh-wedel.de> |
Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
50fba2aa7cefa6b0e1768cb350c9e69042320c03 |
|
04-Apr-2006 |
Herbert Xu <herbert@gondor.apana.org.au> |
[INET]: Move no-tunnel ICMP error to tunnel4/tunnel6 This patch moves the sending of ICMP messages when there are no IPv4/IPv6 tunnels present to tunnel4/tunnel6 respectively. Please note that for now if xfrm4_tunnel/xfrm6_tunnel is loaded then no ICMP messages will ever be sent. This is similar to how we handle AH/ESP/IPCOMP. This move fixes the bug where we always send an ICMP message when there is no ip6_tunnel device present for a given packet even if it is later handled by IPsec. It also causes ICMP messages to be sent when no IPIP tunnel is present. I've decided to use the "port unreachable" ICMP message over the current value of "address unreachable" (and "protocol unreachable" by GRE) because it is not ambiguous unlike the other ones which can be triggered by other conditions. There seems to be no standard specifying what value must be used so this change should be OK. In fact we should change GRE to use this value as well. Incidentally, this patch also fixes a fairly serious bug in xfrm6_tunnel where we don't check whether the embedded IPv6 header is present before dereferencing it for the inside source address. This patch is inspired by a previous patch by Hugo Santos <hsantos@av.it.pt>. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
d2acc3479cbccd5cfbca6c787be713ef1de12ec6 |
|
28-Mar-2006 |
Herbert Xu <herbert@gondor.apana.org.au> |
[INET]: Introduce tunnel4/tunnel6 Basically this patch moves the generic tunnel protocol stuff out of xfrm4_tunnel/xfrm6_tunnel and moves it into the new files of tunnel4.c and tunnel6 respectively. The reason for this is that the problem that Hugo uncovered is only the tip of the iceberg. The real problem is that when we removed the dependency of ipip on xfrm4_tunnel we didn't really consider the module case at all. For instance, as it is it's possible to build both ipip and xfrm4_tunnel as modules and if the latter is loaded then ipip simply won't load. After considering the alternatives I've decided that the best way out of this is to restore the dependency of ipip on the non-xfrm-specific part of xfrm4_tunnel. This is acceptable IMHO because the intention of the removal was really to be able to use ipip without the xfrm subsystem. This is still preserved by this patch. So now both ipip/xfrm4_tunnel depend on the new tunnel4.c which handles the arbitration between the two. The order of processing is determined by a simple integer which ensures that ipip gets processed before xfrm4_tunnel. The situation for ICMP handling is a little bit more complicated since we may not have enough information to determine who it's for. It's not a big deal at the moment since the xfrm ICMP handlers are basically no-ops. In future we can deal with this when we look at ICMP caching in general. The user-visible change to this is the removal of the TUNNEL Kconfig prompts. This makes sense because it can only be used through IPCOMP as it stands. The addition of the new modules shouldn't introduce any problems since module dependency will cause them to be loaded. Oh and I also turned some unnecessary pskb's in IPv6 related to this patch to skb's. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
|
48d5cad87c3a4998d0bda16ccfb5c60dfe4de5fb |
|
16-Feb-2006 |
Patrick McHardy <kaber@trash.net> |
[XFRM]: Fix SNAT-related crash in xfrm4_output_finish When a packet matching an IPsec policy is SNATed so it doesn't match any policy anymore it looses its xfrm bundle, which makes xfrm4_output_finish crash because of a NULL pointer dereference. This patch directs these packets to the original output path instead. Since the packets have already passed the POST_ROUTING hook, but need to start at the beginning of the original output path which includes another POST_ROUTING invocation, a flag is added to the IPCB to indicate that the packet was rerouted and doesn't need to pass the POST_ROUTING hook again. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
4fc268d24ceb9f4150777c1b5b2b8e6214e56b2b |
|
11-Jan-2006 |
Randy Dunlap <rdunlap@xenotime.net> |
[PATCH] capable/capability.h (net/) net: Use <linux/capability.h> where capable() is used. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
2941a4863154982918d39a639632c76eeacfa884 |
|
09-Jan-2006 |
Patrick McHardy <kaber@trash.net> |
[NET]: Convert net/{ipv4,ipv6,sched} to netdev_priv Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
3e3850e989c5d2eb1aab6f0fd9257759f0f4cbc6 |
|
07-Jan-2006 |
Patrick McHardy <kaber@trash.net> |
[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder ip_route_me_harder doesn't use the port numbers of the xfrm lookup and uses ip_route_input for non-local addresses which doesn't do a xfrm lookup, ip6_route_me_harder doesn't do a xfrm lookup at all. Use xfrm_decode_session and do the lookup manually, make sure both only do the lookup if the packet hasn't been transformed already. Makeing sure the lookup only happens once needs a new field in the IP6CB, which exceeds the size of skb->cb. The size of skb->cb is increased to 48b. Apparently the IPv6 mobile extensions need some more room anyway. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
8cdfab8a43bb4b3da686ea503a702cb6f9f6a803 |
|
07-Jan-2006 |
Patrick McHardy <kaber@trash.net> |
[IPV4]: reset IPCB flags when neccessary Reset IPSKB_XFRM_TUNNEL_SIZE flags in ipip and ip_gre hard_start_xmit function before the packet reenters IP. This is neccessary so the encapsulated packets are checked not to be oversized in xfrm4_output.c again. Reset all flags in sit when a packet changes its address family. Also remove some obsolete IPSKB flags. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
46f25dffbaba48c571d75f5f574f31978287b8d2 |
|
06-Jan-2006 |
Kris Katterjohn <kjak@users.sourceforge.net> |
[NET]: Change 1500 to ETH_DATA_LEN in some files These patches add the header linux/if_ether.h and change 1500 to ETH_DATA_LEN in some files. Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
db44575f6fd55df6ff67ddd21f7ad5be5a741136 |
|
31-Jul-2005 |
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> |
[NET]: fix oops after tunnel module unload Tunnel modules used to obtain module refcount each time when some tunnel was created, which meaned that tunnel could be unloaded only after all the tunnels are deleted. Since killing old MOD_*_USE_COUNT macros this protection has gone. It is possible to return it back as module_get/put, but it looks more natural and practically useful to force destruction of all the child tunnels on module unload. Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
|
0303770deb834c15ca664a9d741d40f893c92f4e |
|
19-Jul-2005 |
Patrick McHardy <kaber@trash.net> |
[NET]: Make ipip/ip6_tunnel independant of XFRM Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
|
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 |
|
17-Apr-2005 |
Linus Torvalds <torvalds@ppc970.osdl.org> |
Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
|