- Sep 19, 2012
-
-
Eric Dumazet authored
Add GSO support to GRE tunnels. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 20, 2012
-
-
David S. Miller authored
In order to allow prefixed routes, we have to adjust how rt_gateway is set and interpreted. The new interpretation is: 1) rt_gateway == 0, destination is on-link, nexthop is iph->daddr 2) rt_gateway != 0, destination requires a nexthop gateway Abstract the fetching of the proper nexthop value using a new inline helper, rt_nexthop(), as suggested by Joe Perches. Signed-off-by:
David S. Miller <davem@davemloft.net> Tested-by:
Vijay Subramanian <subramanian.vijay@gmail.com>
-
- Jul 17, 2012
-
-
David S. Miller authored
This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc. In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 12, 2012
-
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 15, 2012
-
-
David S. Miller authored
With ip_rt_frag_needed() removed, we have to explicitly update PMTU information in every ICMP error handler. Create two helper functions to facilitate this. 1) ipv4_sk_update_pmtu() This updates the PMTU when we have a socket context to work with. 2) ipv4_update_pmtu() Raw version, used when no socket context is available. For this interface, we essentially just pass in explicit arguments for the flow identity information we would have extracted from the socket. And you'll notice that ipv4_sk_update_pmtu() is simply implemented in terms of ipv4_update_pmtu() Note that __ip_route_output_key() is used, rather than something like ip_route_output_flow() or ip_route_output_key(). This is because we absolutely do not want to end up with a route that does IPSEC encapsulation and the like. Instead, we only want the route that would get us to the node described by the outermost IP header. Reported-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 15, 2012
-
-
Daniel Baluta authored
Fix checkpatch errors of the following type: * ERROR: "foo * bar" should be "foo *bar" * ERROR: "(foo*)" should be "(foo *)" Signed-off-by:
Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 14, 2012
-
-
stephen hemminger authored
Convert the per-cpu statistics kept for GRE, IPIP, and SIT tunnels to use 64 bit statistics. Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 02, 2012
-
-
David S. Miller authored
These macros contain a hidden goto, and are thus extremely error prone and make code hard to audit. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 13, 2012
-
-
Joe Perches authored
Add #define pr_fmt(fmt) as appropriate. Add "IPv4: ", "TCP: ", and "IPsec: " to appropriate files. Standardize on "UDPLite: " for appropriate uses. Some prefixes were previously "UDPLITE: " and "UDP-Lite: ". Add KBUILD_MODNAME ": " to icmp and gre. Remove embedded prefixes as appropriate. Add missing "\n" to pr_info in gre.c. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 12, 2012
-
-
Joe Perches authored
Use a more current kernel messaging style. Convert a printk block to print_hex_dump. Coalesce formats, align arguments. Use %s, __func__ instead of embedding function names. Some messages that were prefixed with <foo>_close are now prefixed with <foo>_fini. Some ah4 and esp messages are now not prefixed with "ip ". The intent of this patch is to later add something like #define pr_fmt(fmt) "IPv4: " fmt. to standardize the output messages. Text size is trivially reduced. (x86-32 allyesconfig) $ size net/ipv4/built-in.o* text data bss dec hex filename 887888 31558 249696 1169142 11d6f6 net/ipv4/built-in.o.new 887934 31558 249800 1169292 11d78c net/ipv4/built-in.o.old Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Feb 24, 2012
-
-
stephen hemminger authored
The original spelling and bad word choice makes these comments hard to read. Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Feb 15, 2012
-
-
Danny Kukawka authored
Replace usage of random_ether_addr() with eth_hw_addr_random() to set addr_assign_type correctly to NET_ADDR_RANDOM. Change the trivial cases. v2: adapt to renamed eth_hw_addr_random() Signed-off-by:
Danny Kukawka <danny.kukawka@bisect.de> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jan 28, 2012
-
-
David S. Miller authored
The conversion is very similar to that made to ipv6's SIT code. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jan 26, 2012
-
-
Willem de Bruijn authored
Tunnel devices set NETIF_F_LLTX to bypass HARD_TX_LOCK. Sit and ipip set this unconditionally in ops->setup, but gre enables it conditionally after parameter passing in ops->newlink. This is not called during tunnel setup as below, however, so GRE tunnels are still taking the lock. modprobe ip_gre ip tunnel add test0 mode gre remote 10.5.1.1 dev lo ip link set test0 up ip addr add 10.6.0.1 dev test0 # cat /sys/class/net/test0/features # $DIR/test_tunnel_xmit 10 10.5.2.1 ip route add 10.5.2.0/24 dev test0 ip tunnel del test0 The newlink callback is only called in rtnl_netlink, and only if the device is new, as it calls register_netdevice internally. Gre tunnels are created at 'ip tunnel add' with ioctl SIOCADDTUNNEL, which calls ipgre_tunnel_locate, which calls register_netdev. rtnl_newlink is called at 'ip link set', but skips ops->newlink and the device is up with locking still enabled. The equivalent ipip tunnel works fine, btw (just substitute 'method gre' for 'method ipip'). On kernels before /sys/class/net/*/features was removed [1], the first commented out line returns 0x6000 with method gre, which indicates that NETIF_F_LLTX (0x1000) is not set. With ipip, it reports 0x7000. This test cannot be used on recent kernels where the sysfs file is removed (and ETHTOOL_GFEATURES does not currently work for tunnel devices, because they lack dev->ethtool_ops). The second commented out line calls a simple transmission test [2] that sends on 24 cores at maximum rate. Results of a single run: ipip: 19,372,306 gre before patch: 4,839,753 gre after patch: 19,133,873 This patch replicates the condition check in ipgre_newlink to ipgre_tunnel_locate. It works for me, both with oseq on and off. This is the first time I looked at rtnetlink and iproute2 code, though, so someone more knowledgeable should probably check the patch. Thanks. The tail of both functions is now identical, by the way. To avoid code duplication, I'll be happy to rework this and merge the two. [1] http://patchwork.ozlabs.org/patch/104610/ [2] http://kernel.googlecode.com/files/xmit_udp_parallel.c Signed-off-by:
Willem de Bruijn <willemb@google.com> Acked-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jan 25, 2012
-
-
David S. Miller authored
We can remove the rt_gateway == 0 check but we shouldn't remove the 'dst' initialization too. Noticed by Eric Dumazet. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
It can never actually happen. rt_gateway is either the fully resolved flow lookup key's destination address, or the non-zero FIB entry gateway address. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Dec 12, 2011
-
-
Eric Dumazet authored
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Dec 05, 2011
-
-
David Miller authored
To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by:
David S. Miller <davem@davemloft.net> Acked-by:
Roland Dreier <roland@purestorage.com>
-
- Nov 18, 2011
-
-
Herbert Xu authored
ip_gre: Set needed_headroom dynamically again Now that all needed_headroom users have been fixed up so that we can safely increase needed_headroom, this patch restore the dynamic update of needed_headroom. Signed-off-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Nov 08, 2011
-
-
Eric Dumazet authored
Tunnels can force an alignment of their percpu data to reduce number of cache lines used in fast path, or read in .ndo_get_stats() percpu_alloc() is a very fine grained allocator, so any small hole will be used anyway. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 20, 2011
-
-
Eric Dumazet authored
It seems ip_gre is able to change dev->needed_headroom on the fly. Its is not legal unfortunately and triggers a BUG in raw_sendmsg() skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev) < another cpu change dev->needed_headromm (making it bigger) ... skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev)); We end with LL_RESERVED_SPACE() being bigger than LL_ALLOCATED_SPACE() -> we crash later because skb head is exhausted. Bug introduced in commit 243aad83 in 2.6.34 (ip_gre: include route header_len in max_headroom calculation) Reported-by:
Elmar Vonlanthen <evonlanthen@gmail.com> Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> CC: Timo Teräs <timo.teras@iki.fi> CC: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jul 18, 2011
-
-
David S. Miller authored
dst_{get,set}_neighbour() Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 05, 2011
-
-
Jiri Pirko authored
Force dev_alloc_name() to be called from register_netdevice() by dev_get_valid_name(). That allows to remove multiple explicit dev_alloc_name() calls. The possibility to call dev_alloc_name in advance remains. This also fixes veth creation regresion caused by 84c49d8c Signed-off-by:
Jiri Pirko <jpirko@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 04, 2011
-
-
David S. Miller authored
First, make callers pass on-stack flowi4 to ip_route_output_gre() so they can get at the fully resolved flow key. Next, use that in ipgre_tunnel_xmit() to avoid the need to use rt->rt_{dst,src}. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 22, 2011
-
-
Eric Dumazet authored
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 13, 2011
-
-
David S. Miller authored
The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Mar 10, 2011
-
-
Vasiliy Kulikov authored
Since a8f80e8f any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod | grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod | grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod | grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod | grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203 Signed-off-by:
Vasiliy Kulikov <segoon@openwall.com> Signed-off-by:
Michael Tokarev <mjt@tls.msk.ru> Acked-by:
David S. Miller <davem@davemloft.net> Acked-by:
Kees Cook <kees.cook@canonical.com> Signed-off-by:
James Morris <jmorris@namei.org>
-
- Mar 02, 2011
-
-
David S. Miller authored
Instead of on the stack. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Feb 11, 2011
-
-
Steffen Klassert authored
Commit 5811662b ("net: use the macros defined for the members of flowi") accidentally removed the setting of IPPROTO_GRE from the struct flowi in ipgre_tunnel_xmit. This patch restores it. Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Acked-by:
Changli Gao <xiaosuo@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Dec 13, 2010
-
-
David S. Miller authored
Always go through a new ip4_dst_hoplimit() helper, just like ipv6. This allowed several simplifications: 1) The interim dst_metric_hoplimit() can go as it's no longer userd. 2) The sysctl_ip_default_ttl entry no longer needs to use ipv4_doint_and_flush, since the sysctl is not cached in routing cache metrics any longer. 3) ipv4_doint_and_flush no longer needs to be exported and therefore can be marked static. When ipv4_doint_and_flush_strategy was removed some time ago, the external declaration in ip.h was mistakenly left around so kill that off too. We have to move the sysctl_ip_default_ttl declaration into ipv4's route cache definition header net/route.h, because currently net/ip.h (where the declaration lives now) has a back dependency on net/route.h Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Dec 09, 2010
-
-
David S. Miller authored
Use helper functions to hide all direct accesses, especially writes, to dst_entry metrics values. This will allow us to: 1) More easily change how the metrics are stored. 2) Implement COW for metrics. In particular this will help us put metrics into the inetpeer cache if that is what we end up doing. We can make the _metrics member a pointer instead of an array, initially have it point at the read-only metrics in the FIB, and then on the first set grab an inetpeer entry and point the _metrics member there. Signed-off-by:
David S. Miller <davem@davemloft.net> Acked-by:
Eric Dumazet <eric.dumazet@gmail.com>
-
- Dec 01, 2010
-
-
stephen hemminger authored
If gre is built as a module the 'ip tunnel add' command would fail because the ip_gre module was not being autoloaded. Adding an alias for the gre0 device name cause dev_load() to autoload it when needed. Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
stephen hemminger authored
Use strcpy() rather the sprintf() for the case where name is getting generated. Fix indentation. Signed-off-by:
Stephen Hemminger <shemminger@vyatta.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Nov 17, 2010
-
-
Changli Gao authored
Use the macros defined for the members of flowi to clean the code up. Signed-off-by:
Changli Gao <xiaosuo@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Nov 15, 2010
-
-
Timo Teräs authored
The GRE Key field is intended to be used for identifying an individual traffic flow within a tunnel. It is useful to be able to have XFRM policy selector matches to have different policies for different GRE tunnels. Signed-off-by:
Timo Teräs <timo.teras@iki.fi> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Nov 12, 2010
-
-
David S. Miller authored
When we test rt->fl.iif against zero, we're seeing if it's an output or an input route. Make that explicit with some helper functions. Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 31, 2010
-
-
Eric Dumazet authored
Before making the fallback tunnel visible to lookups, we should make sure it is completely setup, once ipgre_tunnel_init() had been called and tstats per_cpu pointer allocated. move rcu_assign_pointer(ign->tunnels_wc[0], tunnel); from ipgre_fb_tunnel_init() to ipgre_init_net() Based on a patch from Pavel Emelyanov Reported-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Acked-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 27, 2010
-
-
Pavel Emelyanov authored
After making rcu protection for tunnels (ipip, gre, sit and ip6) a bug was introduced into the SIOCCHGTUNNEL code. The tunnel is first unlinked, then addresses change, then it is linked back probably into another bucket. But while changing the parms, the hash table is unlocked to readers and they can lookup the improper tunnel. Respective commits are b7285b79 (ipip: get rid of ipip_lock), 1507850b (gre: get rid of ipgre_lock), 3a43be3c (sit: get rid of ipip6_lock) and 94767632 (ip6tnl: get rid of ip6_tnl_lock). The quick fix is to wait for quiescent state to pass after unlinking, but if it is inappropriate I can invent something better, just let me know. Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Acked-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 19, 2010
-
-
Eric Dumazet authored
Convert inetdev_by_index() to not increment in_dev refcount. Callers hold RCU or RTNL, and should not decrement in_dev refcount. Signed-off-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-