Skip to content
Snippets Groups Projects
  1. Sep 19, 2012
  2. Sep 10, 2012
  3. Sep 09, 2012
  4. Sep 07, 2012
  5. Sep 06, 2012
    • Eric Dumazet's avatar
      tcp: fix TFO regression · 7ab4551f
      Eric Dumazet authored
      
      Fengguang Wu reported various panics and bisected to commit
      8336886f (tcp: TCP Fast Open Server - support TFO listeners)
      
      Fix this by making sure socket is a TCP socket before accessing TFO data
      structures.
      
      [  233.046014] kfree_debugcheck: out of range ptr ea6000000bb8h.
      [  233.047399] ------------[ cut here ]------------
      [  233.048393] kernel BUG at /c/kernel-tests/src/stable/mm/slab.c:3074!
      [  233.048393] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [  233.048393] Modules linked in:
      [  233.048393] CPU 0
      [  233.048393] Pid: 3929, comm: trinity-watchdo Not tainted 3.6.0-rc3+
      #4192 Bochs Bochs
      [  233.048393] RIP: 0010:[<ffffffff81169653>]  [<ffffffff81169653>]
      kfree_debugcheck+0x27/0x2d
      [  233.048393] RSP: 0018:ffff88000facbca8  EFLAGS: 00010092
      [  233.048393] RAX: 0000000000000031 RBX: 0000ea6000000bb8 RCX:
      00000000a189a188
      [  233.048393] RDX: 000000000000a189 RSI: ffffffff8108ad32 RDI:
      ffffffff810d30f9
      [  233.048393] RBP: ffff88000facbcb8 R08: 0000000000000002 R09:
      ffffffff843846f0
      [  233.048393] R10: ffffffff810ae37c R11: 0000000000000908 R12:
      0000000000000202
      [  233.048393] R13: ffffffff823dbd5a R14: ffff88000ec5bea8 R15:
      ffffffff8363c780
      [  233.048393] FS:  00007faa6899c700(0000) GS:ffff88001f200000(0000)
      knlGS:0000000000000000
      [  233.048393] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  233.048393] CR2: 00007faa6841019c CR3: 0000000012c82000 CR4:
      00000000000006f0
      [  233.048393] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [  233.048393] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
      0000000000000400
      [  233.048393] Process trinity-watchdo (pid: 3929, threadinfo
      ffff88000faca000, task ffff88000faec600)
      [  233.048393] Stack:
      [  233.048393]  0000000000000000 0000ea6000000bb8 ffff88000facbce8
      ffffffff8116ad81
      [  233.048393]  ffff88000ff588a0 ffff88000ff58850 ffff88000ff588a0
      0000000000000000
      [  233.048393]  ffff88000facbd08 ffffffff823dbd5a ffffffff823dbcb0
      ffff88000ff58850
      [  233.048393] Call Trace:
      [  233.048393]  [<ffffffff8116ad81>] kfree+0x5f/0xca
      [  233.048393]  [<ffffffff823dbd5a>] inet_sock_destruct+0xaa/0x13c
      [  233.048393]  [<ffffffff823dbcb0>] ? inet_sk_rebuild_header
      +0x319/0x319
      [  233.048393]  [<ffffffff8231c307>] __sk_free+0x21/0x14b
      [  233.048393]  [<ffffffff8231c4bd>] sk_free+0x26/0x2a
      [  233.048393]  [<ffffffff825372db>] sctp_close+0x215/0x224
      [  233.048393]  [<ffffffff810d6835>] ? lock_release+0x16f/0x1b9
      [  233.048393]  [<ffffffff823daf12>] inet_release+0x7e/0x85
      [  233.048393]  [<ffffffff82317d15>] sock_release+0x1f/0x77
      [  233.048393]  [<ffffffff82317d94>] sock_close+0x27/0x2b
      [  233.048393]  [<ffffffff81173bbe>] __fput+0x101/0x20a
      [  233.048393]  [<ffffffff81173cd5>] ____fput+0xe/0x10
      [  233.048393]  [<ffffffff810a3794>] task_work_run+0x5d/0x75
      [  233.048393]  [<ffffffff8108da70>] do_exit+0x290/0x7f5
      [  233.048393]  [<ffffffff82707415>] ? retint_swapgs+0x13/0x1b
      [  233.048393]  [<ffffffff8108e23f>] do_group_exit+0x7b/0xba
      [  233.048393]  [<ffffffff8108e295>] sys_exit_group+0x17/0x17
      [  233.048393]  [<ffffffff8270de10>] tracesys+0xdd/0xe2
      [  233.048393] Code: 59 01 5d c3 55 48 89 e5 53 41 50 0f 1f 44 00 00 48
      89 fb e8 d4 b0 f0 ff 84 c0 75 11 48 89 de 48 c7 c7 fc fa f7 82 e8 0d 0f
      57 01 <0f> 0b 5f 5b 5d c3 55 48 89 e5 0f 1f 44 00 00 48 63 87 d8 00 00
      [  233.048393] RIP  [<ffffffff81169653>] kfree_debugcheck+0x27/0x2d
      [  233.048393]  RSP <ffff88000facbca8>
      
      Reported-by: default avatarFengguang Wu <wfg@linux.intel.com>
      Tested-by: default avatarFengguang Wu <wfg@linux.intel.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: "H.K. Jerry Chu" <hkchu@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7ab4551f
  6. Sep 05, 2012
    • Julian Anastasov's avatar
      tcp: add generic netlink support for tcp_metrics · d23ff701
      Julian Anastasov authored
      
      Add support for genl "tcp_metrics". No locking
      is changed, only that now we can unlink and delete
      entries after grace period. We implement get/del for
      single entry and dump to support show/flush filtering
      in user space. Del without address attribute causes
      flush for all addresses, sadly under genl_mutex.
      
      v2:
      - remove rcu_assign_pointer as suggested by Eric Dumazet,
      it is not needed because there are no other writes under lock
      - move the flushing code in tcp_metrics_flush_all
      
      v3:
      - remove synchronize_rcu on flush as suggested by Eric Dumazet
      
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d23ff701
  7. Sep 03, 2012
  8. Sep 01, 2012
    • Jerry Chu's avatar
      tcp: TCP Fast Open Server - main code path · 168a8f58
      Jerry Chu authored
      
      This patch adds the main processing path to complete the TFO server
      patches.
      
      A TFO request (i.e., SYN+data packet with a TFO cookie option) first
      gets processed in tcp_v4_conn_request(). If it passes the various TFO
      checks by tcp_fastopen_check(), a child socket will be created right
      away to be accepted by applications, rather than waiting for the 3WHS
      to finish.
      
      In additon to the use of TFO cookie, a simple max_qlen based scheme
      is put in place to fend off spoofed TFO attack.
      
      When a valid ACK comes back to tcp_rcv_state_process(), it will cause
      the state of the child socket to switch from either TCP_SYN_RECV to
      TCP_ESTABLISHED, or TCP_FIN_WAIT1 to TCP_FIN_WAIT2. At this time
      retransmission will resume for any unack'ed (data, FIN,...) segments.
      
      Signed-off-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      168a8f58
    • Jerry Chu's avatar
      tcp: TCP Fast Open Server - support TFO listeners · 8336886f
      Jerry Chu authored
      
      This patch builds on top of the previous patch to add the support
      for TFO listeners. This includes -
      
      1. allocating, properly initializing, and managing the per listener
      fastopen_queue structure when TFO is enabled
      
      2. changes to the inet_csk_accept code to support TFO. E.g., the
      request_sock can no longer be freed upon accept(), not until 3WHS
      finishes
      
      3. allowing a TCP_SYN_RECV socket to properly poll() and sendmsg()
      if it's a TFO socket
      
      4. properly closing a TFO listener, and a TFO socket before 3WHS
      finishes
      
      5. supporting TCP_FASTOPEN socket option
      
      6. modifying tcp_check_req() to use to check a TFO socket as well
      as request_sock
      
      7. supporting TCP's TFO cookie option
      
      8. adding a new SYN-ACK retransmit handler to use the timer directly
      off the TFO socket rather than the listener socket. Note that TFO
      server side will not retransmit anything other than SYN-ACK until
      the 3WHS is completed.
      
      The patch also contains an important function
      "reqsk_fastopen_remove()" to manage the somewhat complex relation
      between a listener, its request_sock, and the corresponding child
      socket. See the comment above the function for the detail.
      
      Signed-off-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8336886f
    • Jerry Chu's avatar
      tcp: TCP Fast Open Server - header & support functions · 10467163
      Jerry Chu authored
      
      This patch adds all the necessary data structure and support
      functions to implement TFO server side. It also documents a number
      of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
      extension MIBs.
      
      In addition, it includes the following:
      
      1. a new TCP_FASTOPEN socket option an application must call to
      supply a max backlog allowed in order to enable TFO on its listener.
      
      2. A number of key data structures:
      "fastopen_rsk" in tcp_sock - for a big socket to access its
      request_sock for retransmission and ack processing purpose. It is
      non-NULL iff 3WHS not completed.
      
      "fastopenq" in request_sock_queue - points to a per Fast Open
      listener data structure "fastopen_queue" to keep track of qlen (# of
      outstanding Fast Open requests) and max_qlen, among other things.
      
      "listener" in tcp_request_sock - to point to the original listener
      for book-keeping purpose, i.e., to maintain qlen against max_qlen
      as part of defense against IP spoofing attack.
      
      3. various data structure and functions, many in tcp_fastopen.c, to
      support server side Fast Open cookie operations, including
      /proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
      
      Signed-off-by: default avatarH.K. Jerry Chu <hkchu@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10467163
  9. Aug 31, 2012
    • Alexander Duyck's avatar
      ipv4: Minor logic clean-up in ipv4_mtu · 98d75c37
      Alexander Duyck authored
      
      In ipv4_mtu there is some logic where we are testing for a non-zero value
      and a timer expiration, then setting the value to zero, and then testing if
      the value is zero we set it to a value based on the dst.  Instead of
      bothering with the extra steps it is easier to just cleanup the logic so
      that we set it to the dst based value if it is zero or if the timer has
      expired.
      
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@intel.com>
      98d75c37
  10. Aug 30, 2012
  11. Aug 26, 2012
  12. Aug 24, 2012
  13. Aug 23, 2012
    • Eric Dumazet's avatar
      net: reinstate rtnl in call_netdevice_notifiers() · 748e2d93
      Eric Dumazet authored
      
      Eric Biederman pointed out that not holding RTNL while calling
      call_netdevice_notifiers() was racy.
      
      This patch is a direct transcription his feedback
      against commit 0115e8e3 (net: remove delay at device dismantle)
      
      Thanks Eric !
      
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Gao feng <gaofeng@cn.fujitsu.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      748e2d93
    • Eric Dumazet's avatar
      net: remove delay at device dismantle · 0115e8e3
      Eric Dumazet authored
      
      I noticed extra one second delay in device dismantle, tracked down to
      a call to dst_dev_event() while some call_rcu() are still in RCU queues.
      
      These call_rcu() were posted by rt_free(struct rtable *rt) calls.
      
      We then wait a little (but one second) in netdev_wait_allrefs() before
      kicking again NETDEV_UNREGISTER.
      
      As the call_rcu() are now completed, dst_dev_event() can do the needed
      device swap on busy dst.
      
      To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called
      after a rcu_barrier(), but outside of RTNL lock.
      
      Use NETDEV_UNREGISTER_FINAL with care !
      
      Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL
      
      Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after
      IP cache removal.
      
      With help from Gao feng
      
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0115e8e3
    • Eric Dumazet's avatar
      ipv4: properly update pmtu · 9b04f350
      Eric Dumazet authored
      
      Sylvain Munault reported following info :
      
       - TCP connection get "stuck" with data in send queue when doing
         "large" transfers ( like typing 'ps ax' on a ssh connection )
       - Only happens on path where the PMTU is lower than the MTU of
         the interface
       - Is not present right after boot, it only appears 10-20min after
         boot or so. (and that's inside the _same_ TCP connection, it works
         fine at first and then in the same ssh session, it'll get stuck)
       - Definitely seems related to fragments somehow since I see a router
         sending ICMP message saying fragmentation is needed.
       - Exact same setup works fine with kernel 3.5.1
      
      Problem happens when the 10 minutes (ip_rt_mtu_expires) expiration
      period is over.
      
      ip_rt_update_pmtu() calls dst_set_expires() to rearm a new expiration,
      but dst_set_expires() does nothing because dst.expires is already set.
      
      It seems we want to set the expires field to a new value, regardless
      of prior one.
      
      With help from Julian Anastasov.
      
      Reported-by: default avatarSylvain Munaut <s.munaut@whatever-company.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      CC: Julian Anastasov <ja@ssi.bg>
      Tested-by: default avatarSylvain Munaut <s.munaut@whatever-company.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b04f350
  14. Aug 22, 2012
  15. Aug 21, 2012
    • Eric Dumazet's avatar
      ipv4: fix ip header ident selection in __ip_make_skb() · a9915a1b
      Eric Dumazet authored
      Christian Casteyde reported a kmemcheck 32-bit read from uninitialized
      memory in __ip_select_ident().
      
      It turns out that __ip_make_skb() called ip_select_ident() before
      properly initializing iph->daddr.
      
      This is a bug uncovered by commit 1d861aa4 (inet: Minimize use of
      cached route inetpeer.)
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=46131
      
      
      
      Reported-by: default avatarChristian Casteyde <casteyde.christian@free.fr>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <shemminger@vyatta.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a9915a1b
    • Christoph Paasch's avatar
      ipv4: Use newinet->inet_opt in inet_csk_route_child_sock() · 1a7b27c9
      Christoph Paasch authored
      
      Since 0e734419 ("ipv4: Use inet_csk_route_child_sock() in DCCP and
      TCP."), inet_csk_route_child_sock() is called instead of
      inet_csk_route_req().
      
      However, after creating the child-sock in tcp/dccp_v4_syn_recv_sock(),
      ireq->opt is set to NULL, before calling inet_csk_route_child_sock().
      Thus, inside inet_csk_route_child_sock() opt is always NULL and the
      SRR-options are not respected anymore.
      Packets sent by the server won't have the correct destination-IP.
      
      This patch fixes it by accessing newinet->inet_opt instead of ireq->opt
      inside inet_csk_route_child_sock().
      
      Reported-by: default avatarLuca Boccassi <luca.boccassi@gmail.com>
      Signed-off-by: default avatarChristoph Paasch <christoph.paasch@uclouvain.be>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1a7b27c9
    • Eric Dumazet's avatar
      tcp: fix possible socket refcount problem · 144d56e9
      Eric Dumazet authored
      
      Commit 6f458dfb (tcp: improve latencies of timer triggered events)
      added bug leading to following trace :
      
      [ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
      [ 2866.131726]
      [ 2866.132188] =========================
      [ 2866.132281] [ BUG: held lock freed! ]
      [ 2866.132281] 3.6.0-rc1+ #622 Not tainted
      [ 2866.132281] -------------------------
      [ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there!
      [ 2866.132281]  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
      [ 2866.132281] 4 locks held by kworker/0:1/652:
      [ 2866.132281]  #0:  (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
      [ 2866.132281]  #1:  ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
      [ 2866.132281]  #2:  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
      [ 2866.132281]  #3:  (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f
      [ 2866.132281]
      [ 2866.132281] stack backtrace:
      [ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622
      [ 2866.132281] Call Trace:
      [ 2866.132281]  <IRQ>  [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159
      [ 2866.132281]  [<ffffffff818a0839>] ? __sk_free+0xfd/0x114
      [ 2866.132281]  [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a
      [ 2866.132281]  [<ffffffff818a0839>] __sk_free+0xfd/0x114
      [ 2866.132281]  [<ffffffff818a08c0>] sk_free+0x1c/0x1e
      [ 2866.132281]  [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56
      [ 2866.132281]  [<ffffffff81078082>] run_timer_softirq+0x218/0x35f
      [ 2866.132281]  [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f
      [ 2866.132281]  [<ffffffff810f5831>] ? rb_commit+0x58/0x85
      [ 2866.132281]  [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148
      [ 2866.132281]  [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9
      [ 2866.132281]  [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e
      [ 2866.132281]  [<ffffffff81a1227c>] call_softirq+0x1c/0x30
      [ 2866.132281]  [<ffffffff81039f38>] do_softirq+0x4a/0xa6
      [ 2866.132281]  [<ffffffff81070f2b>] irq_exit+0x51/0xad
      [ 2866.132281]  [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4
      [ 2866.132281]  [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f
      [ 2866.132281]  <EOI>  [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1
      [ 2866.132281]  [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56
      [ 2866.132281]  [<ffffffff81078692>] mod_timer+0x178/0x1a9
      [ 2866.132281]  [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26
      [ 2866.132281]  [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4
      [ 2866.132281]  [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70
      [ 2866.132281]  [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4
      [ 2866.132281]  [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1
      [ 2866.132281]  [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a
      [ 2866.132281]  [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6
      [ 2866.132281]  [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5
      [ 2866.132281]  [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f
      [ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
      [ 2866.132281]  [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4
      [ 2866.132281]  [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5
      [ 2866.132281]  [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f
      [ 2866.132281]  [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd
      [ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
      [ 2866.132281]  [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43
      [ 2866.132281]  [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80
      [ 2866.132281]  [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0
      [ 2866.132281]  [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61
      [ 2866.132281]  [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1
      [ 2866.132281]  [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db
      [ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
      [ 2866.132281]  [<ffffffff81999d92>] call_transmit+0x1c5/0x20e
      [ 2866.132281]  [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225
      [ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
      [ 2866.132281]  [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34
      [ 2866.132281]  [<ffffffff810835d6>] process_one_work+0x24d/0x47f
      [ 2866.132281]  [<ffffffff81083567>] ? process_one_work+0x1de/0x47f
      [ 2866.132281]  [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225
      [ 2866.132281]  [<ffffffff81083a6d>] worker_thread+0x236/0x317
      [ 2866.132281]  [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f
      [ 2866.132281]  [<ffffffff8108b7b8>] kthread+0x9a/0xa2
      [ 2866.132281]  [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10
      [ 2866.132281]  [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13
      [ 2866.132281]  [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a
      [ 2866.132281]  [<ffffffff81a12180>] ? gs_change+0x13/0x13
      [ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
      [ 2866.309689] =============================================================================
      [ 2866.310254] BUG TCP (Not tainted): Object already free
      [ 2866.310254] -----------------------------------------------------------------------------
      [ 2866.310254]
      
      The bug comes from the fact that timer set in sk_reset_timer() can run
      before we actually do the sock_hold(). socket refcount reaches zero and
      we free the socket too soon.
      
      timer handler is not allowed to reduce socket refcnt if socket is owned
      by the user, or we need to change sk_reset_timer() implementation.
      
      We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED
      or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags
      
      Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED
      was used instead of TCP_DELACK_TIMER_DEFERRED.
      
      For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED,
      even if not fired from a timer.
      
      Reported-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Tested-by: default avatarFengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      144d56e9
  16. Aug 20, 2012
  17. Aug 16, 2012
Loading