- Jun 20, 2013
-
-
Pravin B Shelar authored
This is required for OVS GRE offloading. Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pravin B Shelar authored
This is required for ovs gre module. Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pravin B Shelar authored
Currently there is only one user is allowed to register for gre protocol. Following patch adds de-multiplexer. So that multiple modules can listen on gre protocol e.g. kernel gre devices and ovs. Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Pravin B Shelar authored
Use cmpxchg() for atomic protocol registration which saves code and data space. Signed-off-by:
Pravin B Shelar <pshelar@nicira.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 13, 2013
-
-
Eric Dumazet authored
If CONFIG_NET_NS is not set then __net_init is the same as __init and __net_exit is the same as __exit. These functions will be removed from memory after the module loads or is removed. Functions that are exported for use by other functions should never be labeled for removal. Bug introduced by commit c5441932 ("GRE: Refactor GRE tunneling code.") Reported-by:
Steinar H. Gunderson <sgunderson@bigfoot.com> Signed-off-by:
Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Saurabh Mohan authored
If users apply shaper to vti tunnel then it will cause a kernel crash. The problem seems to be due to the vti_tunnel_xmit function not clearing skb->opt field before passing the packet to xfrm tunneling code. Signed-off-by:
Saurabh Mohan <saurabh@vyatta.com> Acked-by:
Stephen Hemminger <stephen@networkplumber.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
Linux sends new unset data during disorder and recovery state if all (suspected) lost packets have been retransmitted ( RFC5681, section 3.2 step 1 & 2, RFC3517 section 4, NexSeg() Rule 2). One requirement is to keep the receive window about twice the estimated sender's congestion window (tcp_rcv_space_adjust()), assuming the fast retransmits repair the losses in the next round trip. But currently it's not the case on the first round trip in either normal or Fast Open connection, beucase the initial receive window is identical to (expected) sender's initial congestion window. The fix is to double it. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Acked-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Reduce the uses of this unnecessary typedef. Done via perl script: $ git grep --name-only -w ctl_table net | \ xargs perl -p -i -e '\ sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \ s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge' Reflow the modified lines that now exceed 80 columns. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Wu Fengguang authored
net/ipv4/ping.c:286:5: sparse: symbol 'ping_check_bind_addr' was not declared. Should it be static? net/ipv4/ping.c:355:6: sparse: symbol 'ping_set_saddr' was not declared. Should it be static? net/ipv4/ping.c:370:6: sparse: symbol 'ping_clear_saddr' was not declared. Should it be static? net/ipv6/ping.c:60:5: sparse: symbol 'dummy_ipv6_recv_error' was not declared. Should it be static? net/ipv6/ping.c:64:5: sparse: symbol 'dummy_ip6_datagram_recv_ctl' was not declared. Should it be static? net/ipv6/ping.c:69:5: sparse: symbol 'dummy_icmpv6_err_convert' was not declared. Should it be static? net/ipv6/ping.c:73:6: sparse: symbol 'dummy_ipv6_icmp_error' was not declared. Should it be static? net/ipv6/ping.c:75:5: sparse: symbol 'dummy_ipv6_chk_addr' was not declared. Should it be static? net/ipv6/ping.c:201:5: sparse: symbol 'ping_v6_seq_show' was not declared. Should it be static? Signed-off-by:
Fengguang Wu <fengguang.wu@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
commit ba418fa3 ("soreuseport: UDP/IPv4 implementation") added following sparse errors : net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:433:60: warning: incorrect type in argument 1 (different base types) net/ipv4/udp.c:433:60: expected unsigned short [unsigned] [usertype] val net/ipv4/udp.c:433:60: got restricted __be16 [usertype] sport net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:433:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: incorrect type in argument 1 (different base types) net/ipv4/udp.c:514:60: expected unsigned short [unsigned] [usertype] val net/ipv4/udp.c:514:60: got restricted __be16 [usertype] sport net/ipv4/udp.c:514:60: warning: cast from restricted __be16 net/ipv4/udp.c:514:60: warning: cast from restricted __be16 Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Fix following sparse error : net/ipv4/af_inet.c:1410:59: warning: restricted __be16 degrades to integer added in commit db8caf3d ("gro: should aggregate frames without DF") Reported-by:
kbuild test robot <fengguang.wu@intel.com> From: Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 12, 2013
-
-
Eric Dumazet authored
Fix following sparse errors : net/ipv4/igmp.c:1222:25: warning: cast from restricted __be32 net/ipv4/igmp.c:1234:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1234:31: expected struct ip_mc_list [noderef] <asn:4>*next_hash net/ipv4/igmp.c:1234:31: got struct ip_mc_list *<noident> net/ipv4/igmp.c:1250:31: warning: incorrect type in assignment (different address spaces) net/ipv4/igmp.c:1250:31: expected struct ip_mc_list [noderef] <asn:4>*next_hash net/ipv4/igmp.c:1250:31: got struct ip_mc_list *<noident> net/ipv4/igmp.c:2380:37: warning: cast from restricted __be32 These were added by commit e9897071 ("igmp: hash a hash table to speedup ip_check_mc_rcu()") Reported-by:
kbuild test robot <fengguang.wu@intel.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Similarly to TCP offloading and UDPv6 offloading, move all related UDPv4 functions to udp_offload.c to make things more explicit. Also, by this, we can make those functions static. Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Shawn Bohrer authored
ip_mc_init_dev() is passed a freshly kzalloc'd in_device so it is unnecessary to explicitly zero out the members. Signed-off-by:
Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
After IP route cache removal, multicast applications using a lot of multicast addresses hit a O(N) behavior in ip_check_mc_rcu() Add a per in_device hash table to get faster lookup. This hash table is created only if the number of items in mc_list is above 4. Reported-by:
Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Tested-by:
Shawn Bohrer <sbohrer@rgmadvisors.com> Reviewed-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 11, 2013
-
-
Cong Wang authored
Similar to the following commits: commit 00f97da1 (netpoll: fix position of network header) commit 525cebed (pktgen: Fix position of ip and udp header) using skb_tail_offset() seems not correct since the offset is based on head pointer. With the last caller removed, skb_tail_offset() can be killed finally. Cc: Thomas Graf <tgraf@suug.ch> Cc: Daniel Borkmann <dborkmann@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
Cong Wang <amwang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eliezer Tamir authored
Adds low latency socket poll support for TCP. In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id from the skb to the sk. In tcp_recvmsg(), when there is no data in the socket we busy-poll. This is a good example of how to add busy-poll support to more protocols. Signed-off-by:
Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by:
Eric Dumazet <edumazet@google.com> Tested-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eliezer Tamir authored
Add upport for busy-polling on UDP sockets. In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id from the skb into the sk. This is done at the earliest possible moment, right after we identify which socket this skb is for. In __skb_recv_datagram When there is no data and the user tries to read we busy poll. Signed-off-by:
Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by:
Eric Dumazet <edumazet@google.com> Tested-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eliezer Tamir authored
Adds an ndo_ll_poll method and the code that supports it. This method can be used by low latency applications to busy-poll Ethernet device queues directly from the socket code. sysctl_net_ll_poll controls how many microseconds to poll. Default is zero (disabled). Individual protocol support will be added by subsequent patches. Signed-off-by:
Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by:
Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by:
Eliezer Tamir <eliezer.tamir@linux.intel.com> Acked-by:
Eric Dumazet <edumazet@google.com> Tested-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 07, 2013
-
-
Daniel Borkmann authored
Would be good to make things explicit and move those functions to a new file called tcp_offload.c, thus make this similar to tcpv6_offload.c. While moving all related functions into tcp_offload.c, we can also make some of them static, since they are only used there. Also, add an explicit registration function. Suggested-by:
Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
We have the minimal inline helper tcp_skb_mss to access skb_shinfo(skb)->gso_size, so also use it here to get mss. Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 05, 2013
-
-
Cong Wang authored
If we don't need scope id, we should initialize it to zero. Same for ->sin6_flowinfo. Cc: Lorenzo Colitti <lorenzo@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
Cong Wang <amwang@redhat.com> Acked-by:
Lorenzo Colitti <lorenzo@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jean Sacren authored
Commit 202dc3fc (Documentation: remove obsolete networking/multicast.txt file) deleted the obsolete file. After the file has been removed, clean up a couple of places where references to the deleted file were made so that users wouldn't be confused when they consult the Help menu. Signed-off-by:
Jean Sacren <sakiwit@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 04, 2013
-
-
Lorenzo Colitti authored
The format is based on /proc/net/icmp and /proc/net/{udp,raw}6. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Couldn't figure out how to test without CONFIG_PROC_FS enabled. Signed-off-by:
Lorenzo Colitti <lorenzo@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Lorenzo Colitti authored
Introduce a ping_seq_afinfo structure (similar to its UDP equivalent) and use it to make some of the ping /proc functions address-family independent. Rename the remaining ping /proc functions from ping_* to ping_v4_*. Compiles and displays reasonable results with CONFIG_IPV6={n,m,y} Signed-off-by:
Lorenzo Colitti <lorenzo@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 03, 2013
-
-
Cong Wang authored
struct icmp_bxm is a large struct, reduce stack usage by allocating it on heap. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Joe Perches <joe@perches.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
Cong Wang <amwang@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Rami Rosen authored
ICMP_PARAMETERPROB is handled by icmp_unreach(); This patch adds ICMP_PARAMETERPROB to the list of ICMP message types handled by icmp_unreach(). Signed-off-by:
Rami Rosen <ramirose@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Timo Teräs authored
commit 13d82bf5 (ipv4: Fix flushing of cached routing informations) added the support to flush learned pmtu information. However, using rt_genid is quite heavy as it is bumped on route add/change and multicast events amongst other places. These can happen quite often, especially if using dynamic routing protocols. While this is ok with routes (as they are just recreated locally), the pmtu information is learned from remote systems and the icmp notification can come with long delays. It is worthy to have separate genid to avoid excessive pmtu resets. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Timo Teräs <timo.teras@iki.fi> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Timo Teräs authored
The tunnel devices call update_pmtu for each packet sent, this causes contention on the fnhe_lock. Ignore the pmtu update if pmtu is not actually changed, and there is still plenty of time before the entry expires. Signed-off-by:
Timo Teräs <timo.teras@iki.fi> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Timo Teräs authored
This reverts commit 05ab86c5 (xfrm4: Invalidate all ipv4 routes on IPsec pmtu events). Flushing all cached entries is not needed. Instead, invalidate only the related next hop dsts to recheck for the added next hop exception where needed. This also fixes a subtle race due to bumping generation id's before updating the pmtu. Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Timo Teräs <timo.teras@iki.fi> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Jun 01, 2013
-
-
Nicolas Dichtel authored
This patch adds the support of IPv4 over Ipv4 for the module sit. The gain of this feature is to be able to have 4in4 and 6in4 over the same interface instead of having one interface for 6in4 and another for 4in4 even if encapsulation addresses are the same. To avoid conflicting with ipip module, sit IPv4 over IPv4 protocol is registered with a smaller priority. Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
Before this patch, ip_tunnel_xmit() was using the field protocol from the IP header passed into argument. There is no functional change, this patch prepares the support of IPv4 over IPv4 for module sit. Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
GRO on IPv4 doesn't aggregate frames if they don't have DF bit set. Some servers use IP_MTU_DISCOVER/IP_PMTUDISC_PROBE, so linux receivers are unable to aggregate this kind of traffic. The right thing to do is to allow aggregation as long as the DF bit has same value on all segments. bnx2x LRO does this correctly. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Jerry Chu <hkchu@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ben Hutchings <bhutchings@solarflare.com> Reviewed-by:
Ben Hutchings <bhutchings@solarflare.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
David Majnemer authored
The current state of affairs is that read()/write() will setup RFS (Receive Flow Steering) for internet protocol sockets while poll()/epoll() does not. When poll() gets called with a TCP or UDP socket, we should update the flow target. This permits to RFS (if enabled) to select the appropriate CPU for following incoming packets. Note: Only connected UDP sockets can benefit from RFS. Signed-off-by:
David Majnemer <majnemer@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Paul Turner <pjt@google.com> Cc: Tom Herbert <therbert@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 31, 2013
-
-
Yuchung Cheng authored
If the receiver supports DSACK, sender can detect false recoveries and revert cwnd reductions triggered by either severe network reordering or concurrent reordering and loss event. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
Upon detecting spurious fast retransmit via timestamps during recovery, use PRR to clock out new data packet instead of retransmission. Once all retransmission are proven spurious, the sender then reverts the cwnd reduction and congestion state to open or disorder. The current code does the opposite: it undoes cwnd as soon as any retransmission is spurious and continues to retransmit until all data are acked. This nullifies the point to undo the cwnd because the sender is still retransmistting spuriously. This patch fixes it. The undo_ssthresh argument of tcp_undo_cwnd_reductiuon() is no longer needed and is removed. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
Refactor and relocate various functions or variables to prepare the undo fix. Remove some unused function arguments. Rename tcp_undo_cwr to tcp_undo_cwnd_reduction to be consistent with the rest of CWR related function names. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
This patch series fixes an undo bug in fast recovery: the sender mistakenly undos the cwnd too early but continues fast retransmits until all pending data are acked. This also multiplies the SNMP stat PARTIALUNDO events by the degree of the network reordering. The first patch prepares the fix by consolidating the accounting of newly_acked_sacked in tcp_cwnd_reduction(), instead of updating newly_acked_sacked everytime sacked_out is adjusted. Also pass acked and prior_unsacked as const type because they are readonly in the rest of recovery processing. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 29, 2013
-
-
Simon Horman authored
This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer however skb->network_header is now an offset. This patch corrects the problem by adding a wrapper to return skb tail as an offset regardless of the value of NET_SKBUFF_DATA_USES_OFFSET. It seems that skb->tail that this offset may be more than 64k and some care has been taken to treat such cases as an error. Signed-off-by:
Simon Horman <horms@verge.net.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Simon Horman authored
This corrects an regression introduced by "net: Use 16bits for *_headers fields of struct skbuff" when NET_SKBUFF_DATA_USES_OFFSET is not set. In that case skb->tail will be a pointer whereas skb->transport_header will be an offset from head. This is corrected by using wrappers that ensure that comparisons and calculations are always made using pointers. Signed-off-by:
Simon Horman <horms@verge.net.au> Signed-off-by:
David S. Miller <davem@davemloft.net>
-