  1. Mar 28, 2014
    • openvswitch: fix a possible deadlock and lockdep warning · 4f647e0a
      Flavio Leitner authored
      
      There are two problematic situations.
      
      A deadlock can happen when is_percpu is false because the code can
      be interrupted while holding the spinlock. The interrupt then runs
      ovs_flow_stats_update() in softirq context, which tries to take
      the same lock.
      
      The second situation is that when is_percpu is true, the code
      correctly disables BH but only for the local CPU, so the
      following can happen when locking the remote CPU without
      disabling BH:
      
             CPU#0                            CPU#1
        ovs_flow_stats_get()
         stats_read()
       +->spin_lock remote CPU#1        ovs_flow_stats_get()
       |  <interrupted>                  stats_read()
       |  ...                       +-->  spin_lock remote CPU#0
       |                            |     <interrupted>
       |  ovs_flow_stats_update()   |     ...
       |   spin_lock local CPU#0 <--+     ovs_flow_stats_update()
       +---------------------------------- spin_lock local CPU#1
      
      This patch disables BH for both cases, fixing the deadlocks.
      Acked-by: Jesse Gross <jesse@nicira.com>
      
      =================================
      [ INFO: inconsistent lock state ]
      3.14.0-rc8-00007-g632b06a #1 Tainted: G          I
      ---------------------------------
      inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
      swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
      (&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
      {SOFTIRQ-ON-W} state was registered at:
      [<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40
      [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
      [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
      [<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch]
      [<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch]
      [<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch]
      [<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch]
      [<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0
      [<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0
      [<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0
      [<ffffffff816c0798>] genl_rcv+0x28/0x40
      [<ffffffff816bf830>] netlink_unicast+0x100/0x1e0
      [<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770
      [<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0
      [<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0
      [<ffffffff8166a911>] __sys_sendmsg+0x51/0x90
      [<ffffffff8166a962>] SyS_sendmsg+0x12/0x20
      [<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b
      irq event stamp: 1740726
      hardirqs last  enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840
      hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840
      softirqs last  enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50
      softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&cpu_stats->lock)->rlock);
        <Interrupt>
          lock(&(&cpu_stats->lock)->rlock);
      
       *** DEADLOCK ***
      
      5 locks held by swapper/0/0:
       #0:  (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320
       #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0
       #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840
       #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0
       #4:  (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch]
      
      stack backtrace:
      CPU: 0 PID: 0 Comm: swapper/0 Tainted: G          I  3.14.0-rc8-00007-g632b06a #1
      Hardware name:                  /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012
       0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c
       ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005
       ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0
      Call Trace:
       <IRQ>  [<ffffffff817cfe3c>] dump_stack+0x4d/0x66
       [<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205
       [<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180
       [<ffffffff810f8963>] mark_lock+0x223/0x2b0
       [<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40
       [<ffffffff810f5707>] ? __lock_is_held+0x57/0x80
       [<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch]
       [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0
       [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80
       [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch]
       [<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch]
       [<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40
       [<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch]
       [<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch]
       [<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch]
       [<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0
       [<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0
       [<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0
       [<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840
       [<ffffffff8168c430>] dev_queue_xmit+0x10/0x20
       [<ffffffff8175d641>] ip6_finish_output2+0x551/0x840
       [<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220
       [<ffffffff8176128a>] ip6_finish_output+0x9a/0x220
       [<ffffffff8176145f>] ip6_output+0x4f/0x1f0
       [<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0
       [<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50
       [<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0
       [<ffffffff810a71e9>] call_timer_fn+0x99/0x320
       [<ffffffff810a7155>] ? call_timer_fn+0x5/0x320
       [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220
       [<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0
       [<ffffffff8109d47d>] __do_softirq+0x12d/0x480
      
      Signed-off-by: Flavio Leitner <fbl@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Fix handling stacked vlan tags · 99b192da
      Toshiaki Makita authored
      
      If a bridge with vlan_filtering enabled receives frames with stacked
      vlan tags, i.e., they have two vlan tags, br_vlan_untag() strips not
      only the outer tag but also the inner tag.
      
      br_vlan_untag() is called only from br_handle_vlan(), and in this case,
      it is enough to set skb->vlan_tci to 0 here, because vlan_tci has already
      been set before calling br_handle_vlan().
      
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: Fix inability to retrieve vlan tags when tx offload is disabled · 12464bb8
      Toshiaki Makita authored
      
      Bridge vlan code (br_vlan_get_tag()) assumes that all frames have vlan_tci
      if they are tagged, but if vlan tx offload is manually disabled on the bridge
      device and frames are sent from a vlan device on the bridge device, the tags
      are embedded in skb->data and they break this assumption.
      Extract embedded vlan tags and move them to vlan_tci at ingress.
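
The idea of moving an embedded tag into vlan_tci can be sketched in user space. This is only a toy model, not the bridge code: struct toy_frame, toy_untag_to_tci() and the convention that vlan_tci == 0 means "no tag" are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN    6
#define TOY_ETH_P_8021Q 0x8100  /* TPID of an 802.1Q tag */
#define TOY_VLAN_HLEN   4       /* TPID (2 bytes) + TCI (2 bytes) */

/* Hypothetical stand-in for the two sk_buff fields the commit mentions. */
struct toy_frame {
	uint16_t vlan_tci;  /* out-of-band tag; 0 means "no tag" in this toy */
	size_t len;         /* bytes used in data[] */
	uint8_t data[64];   /* dst MAC, src MAC, ethertype, payload */
};

/*
 * If the frame carries an 802.1Q header in-line, copy its TCI into
 * vlan_tci and close the 4-byte gap in data[].
 * Returns 1 if a tag was moved, 0 otherwise.
 */
int toy_untag_to_tci(struct toy_frame *f)
{
	const size_t off = 2 * TOY_ETH_ALEN;  /* offset of the TPID/ethertype */
	uint16_t tpid;

	assert(f->len <= sizeof(f->data));
	if (f->len < off + TOY_VLAN_HLEN + 2)
		return 0;  /* too short to hold a tag plus an ethertype */

	tpid = (uint16_t)(f->data[off] << 8 | f->data[off + 1]);
	if (tpid != TOY_ETH_P_8021Q)
		return 0;

	f->vlan_tci = (uint16_t)(f->data[off + 2] << 8 | f->data[off + 3]);
	memmove(f->data + off, f->data + off + TOY_VLAN_HLEN,
		f->len - off - TOY_VLAN_HLEN);
	f->len -= TOY_VLAN_HLEN;
	return 1;
}
```

After the call, code that only looks at vlan_tci sees the tag, and the encapsulated ethertype sits where an untagged frame would have it.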
      
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Vlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Mar 27, 2014
  3. Mar 26, 2014
  4. Mar 24, 2014
    • tipc: fix spinlock recursion bug for failed subscriptions · a5d0e7c0
      Erik Hugne authored
      
      If a topology event subscription fails for any reason, such as out
      of memory, max number reached, or because we received an invalid
      request, the correct behavior is to terminate the subscriber's
      connection to the topology server. This is currently broken and
      produces the following oops:
      
      [27.953662] tipc: Subscription rejected, illegal request
      [27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
      [27.957066]  lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
      [27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5
      [27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      [27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
      [27.961430]  ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
      [27.962292]  ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
      [27.963152]  ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
      [27.964023] Call Trace:
      [27.964292]  [<ffffffff815c0207>] dump_stack+0x45/0x56
      [27.964874]  [<ffffffff815beec5>] spin_dump+0x8c/0x91
      [27.965420]  [<ffffffff815beeeb>] spin_bug+0x21/0x26
      [27.965995]  [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140
      [27.966631]  [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20
      [27.967256]  [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
      [27.968051]  [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc]
      [27.968722]  [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc]
      [27.969436]  [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
      [27.970209]  [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc]
      [27.970972]  [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc]
      [27.971633]  [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0
      [27.972267]  [<ffffffff8105e869>] worker_thread+0x119/0x3a0
      [27.972896]  [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0
      [27.973622]  [<ffffffff810648af>] kthread+0xdf/0x100
      [27.974168]  [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
      [27.974893]  [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0
      [27.975466]  [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
      
      The recursion occurs when subscr_terminate tries to grab the
      subscriber lock, which is already taken by subscr_conn_msg_event.
      We fix this by checking if the request to establish a new
      subscription was successful, and if not we initiate termination of
      the subscriber after we have released the subscriber lock.
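
The pattern described above (remember the failure while holding the lock, act on it only after unlocking) can be sketched as follows. This is a single-threaded toy, not the TIPC code: every toy_* name is hypothetical, and the "lock" is a flag whose only job is to trip an assertion on recursion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a subscriber and its non-recursive lock. */
struct toy_subscriber {
	bool locked;
	bool terminated;
};

void toy_lock(struct toy_subscriber *s)
{
	assert(!s->locked);  /* taking it twice models the spinlock recursion */
	s->locked = true;
}

void toy_unlock(struct toy_subscriber *s)
{
	assert(s->locked);
	s->locked = false;
}

/* Termination takes the lock itself, so callers must not hold it. */
void toy_terminate(struct toy_subscriber *s)
{
	toy_lock(s);
	s->terminated = true;
	toy_unlock(s);
}

/* Returns 0 on success, -1 if the request is rejected. */
int toy_subscribe(struct toy_subscriber *s, bool request_valid)
{
	assert(s->locked);  /* runs under the subscriber lock */
	return request_valid ? 0 : -1;
}

void toy_handle_msg(struct toy_subscriber *s, bool request_valid)
{
	int rc;

	toy_lock(s);
	rc = toy_subscribe(s, request_valid);
	toy_unlock(s);

	if (rc)  /* act on the failure only after the lock is released */
		toy_terminate(s);
}
```

Calling toy_terminate() from inside toy_subscribe() would hit the assertion in toy_lock(), which is the toy equivalent of the oops shown above.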
      
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netpoll: fix the skb check in pkt_is_ns · c27f0872
      Li RongQing authored
      
      Neighbor Solicitation is an IPv6 protocol message, so we should
      check skb->protocol against ETH_P_IPV6.
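
A minimal user-space sketch of such a check, assuming a simplified packet descriptor: struct toy_pkt and toy_pkt_is_ns() are hypothetical, and the next-header and ICMPv6 type tests are added here only to make the example complete. The numeric constants are the standard EtherType, protocol number and ICMPv6 type.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ETH_P_IPV6          0x86DD  /* EtherType for IPv6 */
#define TOY_IPPROTO_ICMPV6      58
#define TOY_ND_NEIGHBOR_SOLICIT 135     /* ICMPv6 type of a Neighbor Solicitation */

/* Hypothetical, flattened view of the fields the check needs. */
struct toy_pkt {
	uint16_t protocol;  /* EtherType in network byte order, like skb->protocol */
	uint8_t nexthdr;    /* IPv6 next header */
	uint8_t icmp6_type;
};

bool toy_pkt_is_ns(const struct toy_pkt *p)
{
	assert(p != NULL);
	/* The first gate is the one this commit is about: it must be IPv6. */
	if (p->protocol != htons(TOY_ETH_P_IPV6))
		return false;
	if (p->nexthdr != TOY_IPPROTO_ICMPV6)
		return false;
	return p->icmp6_type == TOY_ND_NEIGHBOR_SOLICIT;
}
```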
      
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Cc: WANG Cong <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. Mar 20, 2014
  6. Mar 18, 2014
    • ipv6: ip6_append_data_mtu does not handle the mtu of the second fragment properly · e367c2d0
      lucien authored
      
      In ip6_append_data_mtu(), when the xfrm mode is not tunnel (such as
      transport), the ipsec header needs to be added in the first fragment, so
      the mtu is decreased to reserve space for it. When the second fragment
      comes, the mtu should be restored, as commit 0c183379 said. However,
      commit 75a493e6 uses *mtu = min(*mtu, ...) to change the mtu, so the new
      mtu is always equal to the first fragment's and can never be restored.
      
      When I test with ping6 -c1 -s5000 $ip (mtu=1280), I get:
      ...frag (0|1232) ESP(spi=0x00002000,seq=0xb), length 1232
      ...frag (1232|1216)
      ...frag (2448|1216)
      ...frag (3664|1216)
      ...frag (4880|164)
      
      which should be:
      ...frag (0|1232) ESP(spi=0x00001000,seq=0x1), length 1232
      ...frag (1232|1232)
      ...frag (2464|1232)
      ...frag (3696|1232)
      ...frag (4928|116)
      
      So remove the min() when restoring the mtu.
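
The two traces above can be reproduced with simple arithmetic. The sketch below is not kernel code: toy_layout() is a hypothetical helper, and the total of 5044 bytes is read off the traces (4928 + 116), not taken from the source tree. With a per-fragment capacity of 1216 after the first fragment it yields the broken layout, and with 1232 the expected one.

```c
#include <assert.h>
#include <stddef.h>

struct toy_frag {
	unsigned int offset;
	unsigned int len;
};

/*
 * Split `total` bytes into fragments: the first carries up to `first_len`
 * bytes, every later one up to `later_len`. Returns the fragment count,
 * or 0 if `out` has fewer than the required number of slots.
 */
size_t toy_layout(unsigned int total, unsigned int first_len,
		  unsigned int later_len, struct toy_frag *out, size_t max)
{
	unsigned int off = 0;
	size_t n = 0;

	assert(first_len > 0 && later_len > 0);
	while (off < total) {
		unsigned int cap = (n == 0) ? first_len : later_len;
		unsigned int left = total - off;

		if (n == max)
			return 0;
		out[n].offset = off;
		out[n].len = left < cap ? left : cap;
		off += out[n].len;
		n++;
	}
	return n;
}
```

For example, toy_layout(5044, 1232, 1216, ...) ends with a fragment at offset 4880 of length 164, while toy_layout(5044, 1232, 1232, ...) ends at offset 4928 with length 116.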
      
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Fixes: 75a493e6 ("ipv6: ip6_append_data_mtu did not care about pmtudisc and frag_size")
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  7. Mar 13, 2014
  8. Mar 12, 2014
  9. Mar 11, 2014
  10. Mar 10, 2014
  11. Mar 06, 2014
    • ipv6: don't set DST_NOCOUNT for remotely added routes · c88507fb
      Sabrina Dubroca authored
      
      DST_NOCOUNT should only be used if an authorized user adds routes
      locally. For routes added on behalf of router advertisements, this
      flag must not be used, as it allows an unlimited number of routes
      to be added remotely.
      
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix exthdrs offload registration. · d2d273ff
      Anton Nayshtut authored
      
      Without this fix, ipv6_exthdrs_offload_init doesn't register IPPROTO_DSTOPTS
      offload, but returns 0 (as the IPPROTO_ROUTING registration actually succeeds).
      
      This then causes ipv6_gso_segment to drop IPv6 packets with an
      IPPROTO_DSTOPTS header.
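
A two-step registration that cannot silently skip its second step might look like the sketch below. It does not reproduce the kernel's init function: the registry and all toy_* names are hypothetical, and only the two protocol numbers are real. Each step is checked for failure, and a failure in the second step undoes the first.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PROTO_ROUTING 43   /* value of IPPROTO_ROUTING */
#define TOY_PROTO_DSTOPTS 60   /* value of IPPROTO_DSTOPTS */
#define TOY_MAX_PROTO     256

static bool toy_registered[TOY_MAX_PROTO];

/* Hypothetical registry: 0 on success, -1 if the slot is already taken. */
int toy_add_offload(int proto)
{
	assert(proto >= 0 && proto < TOY_MAX_PROTO);
	if (toy_registered[proto])
		return -1;
	toy_registered[proto] = true;
	return 0;
}

void toy_del_offload(int proto)
{
	assert(proto >= 0 && proto < TOY_MAX_PROTO);
	toy_registered[proto] = false;
}

bool toy_has_offload(int proto)
{
	assert(proto >= 0 && proto < TOY_MAX_PROTO);
	return toy_registered[proto];
}

/* Register both handlers; on failure leave nothing behind. */
int toy_exthdrs_offload_init(void)
{
	int ret;

	ret = toy_add_offload(TOY_PROTO_ROUTING);
	if (ret)          /* non-zero means failure, so bail out */
		goto out;

	ret = toy_add_offload(TOY_PROTO_DSTOPTS);
	if (ret)
		goto out_rt;  /* undo the first registration */

out:
	return ret;

out_rt:
	toy_del_offload(TOY_PROTO_ROUTING);
	goto out;
}
```

The point of the sketch is the direction of the tests: returning early when the first call succeeds would report 0 while leaving the DSTOPTS handler unregistered, which matches the symptom described above.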
      
      The issue was detected and the fix verified by running the MS HCK Offload LSO test on
      top of QEMU Windows guests, as this test sends IPv6 packets with
      IPPROTO_DSTOPTS.
      
      Signed-off-by: Anton Nayshtut <anton@swortex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: unix socket code abuses csum_partial · 0a13404d
      Anton Blanchard authored
      
      The unix socket code is using the result of csum_partial to
      hash into a lookup table:
      
      	unix_hash_fold(csum_partial(sunaddr, len, 0));
      
      csum_partial is only guaranteed to produce something that can be
      folded into a checksum, as its prototype explains:
      
       * returns a 32-bit number suitable for feeding into itself
       * or csum_tcpudp_magic
      
      The 32bit value should not be used directly.
      
      Depending on the alignment, the ppc64 csum_partial will return
      different 32bit partial checksums that will fold into the same
      16bit checksum.
      
      This difference causes the following testcase (courtesy of
      Gustavo) to sometimes fail:
      
      #include <sys/socket.h>
      #include <stdio.h>
      
      int main()
      {
      	int fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);
      
      	int i = 1;
      	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &i, 4);
      
      	struct sockaddr addr;
      	addr.sa_family = AF_LOCAL;
      	bind(fd, &addr, 2);
      
      	listen(fd, 128);
      
      	struct sockaddr_storage ss;
      	socklen_t sslen = (socklen_t)sizeof(ss);
      	getsockname(fd, (struct sockaddr*)&ss, &sslen);
      
      	fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);
      
      	if (connect(fd, (struct sockaddr*)&ss, sslen) == -1){
      		perror(NULL);
      		return 1;
      	}
      	printf("OK\n");
      	return 0;
      }
      
      As suggested by davem, fix this by using csum_fold to fold the
      partial 32bit checksum into a 16bit checksum before using it.
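
To see why folding matters, here is a small user-space sketch of a one's-complement fold. toy_csum_fold() is a stand-in written for this example, not the kernel's csum_fold(), and the input values are made up. Two different 32-bit partial sums map to the same 16-bit result, so only the folded value is safe to hash on.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement partial sum into a 16-bit checksum. */
uint16_t toy_csum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);  /* absorb a possible carry */
	return (uint16_t)~sum;
}

/* Self-check: distinct partial sums, identical folded checksum. */
void toy_csum_demo(void)
{
	uint32_t a = 0x00010002u;
	uint32_t b = 0x00020001u;

	assert(a != b);
	assert(toy_csum_fold(a) == toy_csum_fold(b));
}
```

Hashing a and b directly would place the same address in different buckets, while hashing the folded value would not.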
      
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • inet: frag: make sure forced eviction removes all frags · e588e2f2
      Florian Westphal authored
      
      Quoting Alexander Aring:
        While fragmentation and unloading of 6lowpan module I got this kernel Oops
        after few seconds:
      
        BUG: unable to handle kernel paging request at f88bbc30
        [..]
        Modules linked in: ipv6 [last unloaded: 6lowpan]
        Call Trace:
         [<c012af4c>] ? call_timer_fn+0x54/0xb3
         [<c012aef8>] ? process_timeout+0xa/0xa
         [<c012b66b>] run_timer_softirq+0x140/0x15f
      
      The problem is that incomplete frags are still around after unload; when
      their frag expire timer fires, we get a crash.
      
      When a netns is removed (also done when unloading the module), inet_frag
      calls the evictor with the 'force' argument to purge remaining frags.
      
      The evictor loop terminates when accounted memory ('work') drops to 0
      or the lru-list becomes empty.  However, the mem accounting is done
      via percpu counters and may not be accurate, i.e. loop may terminate
      prematurely.
      
      Alter evictor to only stop once the lru list is empty when force is
      requested.
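
The changed termination condition can be modelled in a few lines. This is a toy, not the inet_frag evictor: toy_evictor() and its parameters are hypothetical, with `work` standing for the possibly inaccurate accounted memory and `queued` for the number of entries on the LRU list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Evict queues from a toy LRU list. Without `force` the loop stops once
 * the accounted memory `work` is used up; with `force` only an empty
 * list stops it. Returns the number of queues evicted.
 */
size_t toy_evictor(size_t *queued, long work, long cost_per_queue, bool force)
{
	size_t evicted = 0;

	assert(cost_per_queue > 0);
	while (work > 0 || force) {
		if (*queued == 0)  /* LRU list empty: nothing left to do */
			break;
		(*queued)--;
		work -= cost_per_queue;
		evicted++;
	}
	return evicted;
}
```

If the counters under-report (for example, `work` covers one queue while five are still queued), the unforced call leaves four behind and the forced call removes them all.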
      
      Reported-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de>
      Reported-by: Alexander Aring <alex.aring@gmail.com>
      Tested-by: Alexander Aring <alex.aring@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: don't log disabled tasklet handler errors · 2892505e
      Erik Hugne authored
      
      Failure to schedule a TIPC tasklet with tipc_k_signal because the
      tasklet handler is disabled is not an error. It means TIPC is
      currently in the process of shutting down. We remove the error
      logging in this case.
      
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix memory leak during module removal · 1bb8dce5
      Erik Hugne authored
      
      When the TIPC module is removed, the tasklet handler is disabled
      before all other subsystems. This will cause lingering publications
      in the name table because the node_down tasklets responsible for
      cleaning up publications from an unreachable node will never run.
      When the name table is shut down, these publications are detected
      and an error message is logged:
      tipc: nametbl_stop(): orphaned hash chain detected
      This is actually a memory leak, introduced with commit
      993b858e ("tipc: correct the order
      of stopping services at rmmod")
      
      Instead of just logging an error and leaking memory, we free
      the orphaned entries during nametable shutdown.
      
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: drop subscriber connection id invalidation · edcc0511
      Erik Hugne authored
      
      When a topology server subscriber is disconnected, the associated
      connection id is set to zero. A check vs zero is then done in the
      subscription timeout function to see if the subscriber has been
      shut down. This is unnecessary, because all subscription timers
      will be cancelled when a subscriber terminates. Setting the
      connection id to zero is actually harmful because id zero is the
      identity of the topology server listening socket, and can cause a
      race that leads to this socket being closed instead.
      
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: avoid unnecessary process switch in non-blocking mode · fe8e4649
      Ying Xue authored
      
      When messages are received via a TIPC socket in non-blocking mode,
      schedule_timeout() is called in tipc_wait_for_rcvmsg(); that is,
      the receiving process is rescheduled once even though the
      timeout value passed to schedule_timeout() is 0. The same issue
      exists in accept()/wait_for_accept(). To avoid this unnecessary
      process switch, we only call schedule_timeout() if the timeout
      value is non-zero.
      
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix connection refcount leak · 4652edb7
      Ying Xue authored
      
      When tipc_conn_sendmsg() calls tipc_conn_lookup() to query a
      connection instance, its reference count value is increased if
      it's found. But subsequently if it's found that the connection is
      closed, the work of sending message is not queued into its server
      send workqueue, and the connection reference count is not decreased.
      This will cause a reference count leak. To reproduce this problem,
      an application would need to open and close topology server
      connections with high intensity.
      
      We fix this by immediately decrementing the connection reference
      count if a send fails due to the connection being closed.
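
The reference-count rule described above can be sketched with a toy connection object. None of this is the TIPC implementation: struct toy_conn and the toy_conn_* helpers are hypothetical and only model "lookup takes a reference, so every exit path must account for it".

```c
#include <assert.h>
#include <stdbool.h>

struct toy_conn {
	int refcnt;
	bool connected;
	int queued;  /* messages handed to the send path */
};

/* Lookup takes a reference on behalf of the caller. */
struct toy_conn *toy_conn_lookup(struct toy_conn *c)
{
	c->refcnt++;
	return c;
}

void toy_conn_put(struct toy_conn *c)
{
	assert(c->refcnt > 0);
	c->refcnt--;
}

/* Returns 0 if the message was queued, -1 if the connection is closed. */
int toy_conn_sendmsg(struct toy_conn *table_entry)
{
	struct toy_conn *c = toy_conn_lookup(table_entry);

	if (!c->connected) {
		toy_conn_put(c);  /* nothing was queued, so drop the reference now */
		return -1;
	}
	c->queued++;  /* in this toy, the send worker would drop the reference later */
	return 0;
}
```

Without the toy_conn_put() on the error path, each failed send would leave refcnt one higher than before, which is the leak in miniature.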
      
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: allow connection shutdown callback to be invoked in advance · 6d4ebeb4
      Ying Xue authored
      
      Currently the connection shutdown callback is called when the
      connection instance is released in tipc_conn_kref_release(), and
      packets are received and sent in different threads. Even if the
      connection is closed by the receiving thread, its shutdown
      callback may not be called immediately, because the connection
      reference count is non-zero at that moment. So although the
      connection has been shut down by the receiving thread, the
      sending thread does not know it. Before the shutdown callback is
      invoked to tell the sending thread that its connection has been
      closed, the sending thread may still deliver messages with
      tipc_conn_sendmsg(), which is why the following error message
      appears:
      
      "Sending subscription event failed, no memory"
      
      To eliminate it, allow the connection shutdown callback to be
      called before the connection id is removed in tipc_close_conn(),
      so that the sending thread learns in time that its socket is
      closed and stops sending messages to it. We also remove the
      "Sending XXX failed..." error reporting for the topology and
      config services.
      
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>