  1. May 01, 2014
  2. Apr 29, 2014
  3. Apr 15, 2014
  4. Apr 08, 2014
    • dm thin: fix rcu_read_lock being held in code that can sleep · b10ebd34
      Joe Thornber authored
      
      Commit c140e1c4 ("dm thin: use per thin device deferred bio lists")
      introduced the use of an rculist for all active thin devices.  The use
      of rcu_read_lock() in process_deferred_bios() can result in a BUG if a
      dm_bio_prison_cell must be allocated as a side-effect of bio_detain():
      
       BUG: sleeping function called from invalid context at mm/mempool.c:203
       in_atomic(): 1, irqs_disabled(): 0, pid: 6, name: kworker/u8:0
       3 locks held by kworker/u8:0/6:
         #0:  ("dm-" "thin"){.+.+..}, at: [<ffffffff8106be42>] process_one_work+0x192/0x550
         #1:  ((&pool->worker)){+.+...}, at: [<ffffffff8106be42>] process_one_work+0x192/0x550
         #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff816360b5>] do_worker+0x5/0x4d0
      
      We can't process deferred bios with the rcu lock held, since
      dm_bio_prison_cell allocation may block if the bio-prison's cell mempool
      is exhausted.
      
      To fix:
      
      - Introduce a refcount and completion field to each thin_c
      
      - Add thin_get/put methods for adjusting the refcount.  If the refcount
        hits zero then the completion is triggered.
      
      - Initialise refcount to 1 when creating thin_c
      
      - When iterating the active_thins list we thin_get() whilst the rcu
        lock is held.
      
      - After the rcu lock is dropped we process the deferred bios for that
        thin.
      
      - When destroying a thin_c we thin_put() and then wait for the
        completion -- to avoid a race between the worker thread iterating
        from that thin_c and destroying the thin_c.
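
The get/put scheme in the list above can be sketched in userspace C. This is a minimal model, not the kernel patch: `struct thin_dev`, its field names, and `thin_can_destroy()` are illustrative assumptions, and an atomic flag stands in for the kernel completion that the destroy path would wait on.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the per-device state; names are illustrative. */
struct thin_dev {
	atomic_int refcount;      /* starts at 1: the creator's reference */
	atomic_bool can_destroy;  /* stands in for a kernel completion */
};

void thin_init(struct thin_dev *tc)
{
	atomic_init(&tc->refcount, 1);
	atomic_init(&tc->can_destroy, false);
}

/* Taken by the worker while it still holds the read-side lock. */
void thin_get(struct thin_dev *tc)
{
	atomic_fetch_add(&tc->refcount, 1);
}

/* Dropping the last reference signals that teardown may proceed. */
void thin_put(struct thin_dev *tc)
{
	/* fetch_sub returns the previous value; 1 means this was the last ref. */
	if (atomic_fetch_sub(&tc->refcount, 1) == 1)
		atomic_store(&tc->can_destroy, true);
}

/* The destroy path would block until this becomes true. */
bool thin_can_destroy(struct thin_dev *tc)
{
	return atomic_load(&tc->can_destroy);
}
```

In this model the destructor calls thin_put() to drop the initial reference and then waits for thin_can_destroy(); a worker that took its own reference earlier delays that signal until it has finished with the device.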
      
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm thin: irqsave must always be used with the pool->lock spinlock · 5e3283e2
      Joe Thornber authored
      
      Commit c140e1c4 ("dm thin: use per thin device deferred bio lists")
      incorrectly stopped disabling irqs when taking the pool's spinlock.
      
      Irqs must be disabled when taking the pool's spinlock otherwise a thread
      could spin_lock(), then get interrupted to service thin_endio() in
      interrupt context, which would then deadlock in spin_lock_irqsave().
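
The hazard described above can be illustrated with a toy single-CPU model. None of the names below are kernel APIs; `struct cpu_model` and the `model_*` helpers are assumptions made purely to show why the lock must be taken with interrupts disabled.

```c
#include <stdbool.h>

/* Toy single-CPU model; none of these names are kernel APIs. */
struct cpu_model {
	bool irqs_enabled;
	bool lock_held;
};

/* Models a plain spin_lock(): takes the lock, leaves irqs untouched. */
void model_lock_plain(struct cpu_model *cpu)
{
	cpu->lock_held = true;
}

/* Models spin_lock_irqsave(): save irq state, disable irqs, take lock. */
bool model_lock_irqsave(struct cpu_model *cpu)
{
	bool saved = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	return saved;
}

/* Models spin_unlock_irqrestore(): drop lock, restore saved irq state. */
void model_unlock_irqrestore(struct cpu_model *cpu, bool saved)
{
	cpu->lock_held = false;
	cpu->irqs_enabled = saved;
}

/*
 * True if an interrupt handler that needs the same lock could run now
 * on this CPU and spin forever waiting for the interrupted holder.
 */
bool model_irq_would_deadlock(const struct cpu_model *cpu)
{
	return cpu->irqs_enabled && cpu->lock_held;
}
```

With the plain variant the model reports a possible deadlock as soon as the lock is held; with the irqsave variant the handler cannot run until the lock has been released.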
      
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  5. Apr 04, 2014
    • dm cache: fix a lock-inversion · 0596661f
      Joe Thornber authored
      
      When suspending a cache the policy is walked and the individual policy
      hints written to the metadata via sync_metadata().  This led to this
      lock order:
      
            policy->lock
              cache_metadata->root_lock
      
      When loading the cache target the policy is populated while the metadata
      lock is held:
      
            cache_metadata->root_lock
               policy->lock
      
      Fix this potential lock-inversion (ABBA) deadlock in sync_metadata() by
      ensuring the cache_metadata root_lock is held whilst all the hints are
      written, rather than being repeatedly locked while policy->lock is held
      (as was the case with each callout that policy_walk_mappings() made to
      the old save_hint() method).
      
      Found by turning on the CONFIG_PROVE_LOCKING ("Lock debugging: prove
      locking correctness") build option.  However, it is not clear how the
      LOCKDEP reported paths can lead to a deadlock since the two paths,
      suspending a target and loading a target, never occur at the same time.
      But that doesn't mean the same lock-inversion couldn't have occurred
      elsewhere.
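
The kind of check that flags the two lock orders above can be sketched as a tiny order recorder. This is only a rough illustration of the idea, not how lockdep is implemented: it tracks direct pairs only, and `struct lock_order`, `lock_order_record()`, and the lock ids are hypothetical.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* Illustrative lock ids for the two locks named in the message. */
enum { POLICY_LOCK, ROOT_LOCK };

/* order_seen[a][b] is true once lock b was taken while a was held. */
struct lock_order {
	bool order_seen[MAX_LOCKS][MAX_LOCKS];
};

/*
 * Record "outer held, then inner taken". Returns false if the opposite
 * order was recorded earlier, i.e. a potential ABBA inversion.
 */
bool lock_order_record(struct lock_order *lo, int outer, int inner)
{
	if (lo->order_seen[inner][outer])
		return false;
	lo->order_seen[outer][inner] = true;
	return true;
}
```

Recording policy->lock then root_lock on suspend, followed by root_lock then policy->lock on load, makes the second call fail; once both paths take root_lock first, every call succeeds.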
      
      Reported-by: Marian Csontos <mcsontos@redhat.com>
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
    • dm thin: sort the per thin deferred bios using an rb_tree · 67324ea1
      Mike Snitzer authored
      
      A thin-pool will allocate blocks using FIFO order for all thin devices
      which share the thin-pool.  Because of this simplistic allocation the
      thin-pool's space can become fragmented quite easily; especially when
      multiple threads are requesting blocks in parallel.
      
      Sort each thin device's deferred_bio_list based on logical sector to
      help reduce fragmentation of the thin-pool's ondisk layout.
      
      The following tables illustrate the realized gains/potential offered by
      sorting each thin device's deferred_bio_list.  An "io size"-sized random
      read of the device would result in "seeks/io" fragments being read, with
      an average "distance/seek" between each fragment.
      
      Data was written to a single thin device using multiple threads via
      iozone (8 threads, 64K for both the block_size and io_size).
      
      unsorted:
      
           io size   seeks/io distance/seek
        --------------------------------------
                4k    0.000   0b
               16k    0.013   11m
               64k    0.065   11m
              256k    0.274   10m
                1m    1.109   10m
                4m    4.411   10m
               16m    17.097  11m
               64m    60.055  13m
              256m    148.798 25m
                1g    809.929 21m
      
      sorted:
      
           io size   seeks/io distance/seek
        --------------------------------------
                4k    0.000   0b
               16k    0.000   1g
               64k    0.001   1g
              256k    0.003   1g
                1m    0.011   1g
                4m    0.045   1g
               16m    0.181   1g
               64m    0.747   1011m
              256m    3.299   1g
                1g    14.373  1g
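
The ordering step itself can be sketched in userspace C. The commit keeps the bios in an rb_tree; this sketch swaps in `qsort()` over an array for brevity, and `struct fake_bio` and `sort_deferred()` are hypothetical names that carry only the sector being sorted on.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a deferred bio: only the sort key. */
struct fake_bio {
	uint64_t sector;
};

static int cmp_sector(const void *a, const void *b)
{
	const struct fake_bio *x = a, *y = b;

	/* Compare instead of subtracting: the difference may not fit in int. */
	return (x->sector > y->sector) - (x->sector < y->sector);
}

/* Order the pending bios by ascending logical sector. */
void sort_deferred(struct fake_bio *bios, size_t n)
{
	qsort(bios, n, sizeof(*bios), cmp_sector);
}
```

Processing the bios in this order means blocks are requested from the pool in logical order for each device, which is the effect the tables above measure.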
      
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
  6. Mar 31, 2014
  7. Mar 28, 2014
  8. Mar 27, 2014
  9. Mar 24, 2014
    • vxlan: fix nonfunctional neigh_reduce() · 4b29dba9
      David Stevens authored
      
      The VXLAN neigh_reduce() code has been completely non-functional since
      it was checked in. Specific errors:
      
      1) The original code drops all packets with a multicast destination address,
      	even though neighbor solicitations are sent to the solicited-node
      	address, a multicast address. The code after this check was never run.
      2) The neighbor table lookup used the IPv6 header destination, which is the
      	solicited node address, rather than the target address from the
      	neighbor solicitation. So neighbor lookups would always fail if it
      	got this far. Also for L3MISSes.
      3) The code calls ndisc_send_na(), which does a send on the tunnel device.
      	The context for neigh_reduce() is the transmit path, vxlan_xmit(),
      	where the host or a bridge-attached neighbor is trying to transmit
      	a neighbor solicitation. To respond to it, the tunnel endpoint needs
      	to do a *receive* of the appropriate neighbor advertisement. Doing a
	send would only try to send the advertisement, encapsulated, to the
      	remote destinations in the fdb -- hosts that definitely did not do the
      	corresponding solicitation.
      4) The code uses the tunnel endpoint IPv6 forwarding flag to determine the
      	isrouter flag in the advertisement. This has nothing to do with whether
      	or not the target is a router, and generally won't be set since the
      	tunnel endpoint is bridging, not routing, traffic.
      
      The patch below creates a proxy neighbor advertisement to respond to
      neighbor solicitations as intended, providing proper IPv6 support for neighbor
      reduction.
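
Points 1 and 2 above both follow from how the solicited-node address is built. The sketch below derives it from a target address as defined in RFC 4291 (ff02::1:ffXX:XXXX, where XX:XXXX is the low 24 bits of the target); the function names are hypothetical and this is not the vxlan code.

```c
#include <stdint.h>
#include <string.h>

/* Build ff02::1:ffXX:XXXX from the low 24 bits of the target address. */
void solicited_node_addr(const uint8_t target[16], uint8_t out[16])
{
	/* Fixed 104-bit prefix ff02:0:0:0:0:1:ff00::/104 */
	static const uint8_t prefix[13] = {
		0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01, 0xff
	};

	memcpy(out, prefix, sizeof(prefix));
	memcpy(out + 13, target + 13, 3);
}

/* IPv6 multicast addresses start with the byte 0xff. */
int is_multicast(const uint8_t addr[16])
{
	return addr[0] == 0xff;
}
```

The resulting destination is always multicast and never equals the target, so a filter that drops multicast destinations discards every solicitation, and a neighbor lookup keyed on the header destination cannot match an entry stored under the target address.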
      
      Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: davinci_emac: Fix rollback of emac_dev_open() · cd11cf50
      Christian Riesch authored
      
      If an error occurs during the initialization in emac_dev_open() (the
      driver's ndo_open function), interrupts, DMA descriptors etc. must be freed.
      The current rollback code is buggy in several ways.
      
        1) Freeing the interrupts. The current code will not free all interrupts
           that were requested by the driver. Furthermore, the code tries to do a
           platform_get_resource(priv->pdev, IORESOURCE_IRQ, -1) in its last
           iteration.
      
           This patch fixes these bugs.
      
        2) Wrong order of err: and rollback: labels. If the setup of the PHY in
           the code fails, the interrupts that have been requested before are
           not freed:
      
              request irq
                      if requesting irqs fails, goto rollback
              setup phy
                      if phy setup fails, goto err
              return 0
      
           rollback:
              free irqs
           err:
      
           This patch brings the code into the correct order.
      
        3) The code calls napi_enable() and emac_int_enable(), but undoes
           neither in case of an error.
      
           This patch adds calls of emac_int_disable() and napi_disable() to the
           rollback code.
      
        4) RX DMA descriptors are not freed in case of an error: Right before
           requesting the irqs, the function creates DMA descriptors for the
           RX channel. These RX descriptors are never freed when we jump to either
           rollback or err.
      
           This patch adds code for freeing the DMA descriptors in the case of
           an initialization error. This required a modification of
           cpdma_ctlr_stop() in davinci_cpdma.c: We must be able to call this
           function to free the DMA descriptors while the DMA channels are
           in IDLE state (before cpdma_ctlr_start() was called).
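
The label ordering described in point 2 can be sketched as follows. This is a generic illustration of goto-based unwinding, not the driver code: `struct dev_state`, `dev_open()`, and the `fail_step` parameter are hypothetical and only track which resources are currently held.

```c
#include <stdbool.h>

/* Hypothetical resource tracker; stands in for real irq/PHY/DMA state. */
struct dev_state {
	bool rx_desc, irqs, phy;
};

/* fail_step: 0 = no failure, 1 = irq request fails, 2 = phy setup fails. */
int dev_open(struct dev_state *s, int fail_step)
{
	s->rx_desc = true;              /* create RX descriptors */

	if (fail_step == 1)
		goto err_free_rx;       /* irqs were never acquired */
	s->irqs = true;

	if (fail_step == 2)
		goto err_free_irqs;     /* must release irqs as well */
	s->phy = true;
	return 0;

	/*
	 * Labels appear in reverse order of acquisition, so a later
	 * failure falls through every earlier cleanup step.
	 */
err_free_irqs:
	s->irqs = false;
err_free_rx:
	s->rx_desc = false;
	return -1;
}
```

Because the cleanup labels mirror the setup sequence in reverse, a PHY failure releases both the interrupts and the descriptors, while an interrupt failure releases only the descriptors.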
      
      Tested on a custom board with the Texas Instruments AM1808.
      
      Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: davinci_emac: Replace devm_request_irq with request_irq · 33b7107f
      Christian Riesch authored
      
      In commit 6892b41d
      
      Author: Lad, Prabhakar <prabhakar.csengg@gmail.com>
      Date:   Tue Jun 25 21:24:51 2013 +0530
      net: davinci: emac: Convert to devm_* api
      
      the call of request_irq is replaced by devm_request_irq and the call
      of free_irq is removed. But since interrupts are requested in
      emac_dev_open, doing ifconfig up/down on the board requests the
      interrupts again each time, causing devm_request_irq to fail. The
      interface is dead until the device is rebooted.
      
      This patch reverts said commit partially: It changes the driver back
      to use request_irq instead of devm_request_irq, puts free_irq back in
      place, but keeps the remaining changes of the original patch.
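
The failure mode can be modelled with a toy interrupt line that may be claimed only once until it is released. Everything below is an assumption for illustration: the `model_*` and `open_*`/`close_*` helpers are not kernel APIs, and returning -EBUSY on a second claim is a property of this model.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy IRQ line: can be claimed once until released. Not a kernel API. */
struct irq_line {
	bool claimed;
};

int model_request_irq(struct irq_line *irq)
{
	if (irq->claimed)
		return -EBUSY;
	irq->claimed = true;
	return 0;
}

void model_free_irq(struct irq_line *irq)
{
	irq->claimed = false;
}

/*
 * Device-managed style: the IRQ is requested on open but released only
 * when the driver detaches, so closing the interface frees nothing.
 */
int open_managed(struct irq_line *irq) { return model_request_irq(irq); }
void close_managed(struct irq_line *irq) { (void)irq; }

/* Explicit style: close releases what open requested. */
int open_explicit(struct irq_line *irq) { return model_request_irq(irq); }
void close_explicit(struct irq_line *irq) { model_free_irq(irq); }
```

In this model an up/down/up cycle fails on the second open with the managed style and succeeds with the explicit style, which mirrors the ifconfig up/down behaviour described above.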
      
      Reported-by: Jon Ringle <jon@ringle.org>
      Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
      Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: micrel : ks8851-ml: add vdd-supply support · ebf4ad95
      Nishanth Menon authored
      
      A few platforms use an external regulator to keep the ethernet MAC supplied.
      So, request and enable the regulator so that the driver can function.
      
      Fixes: 66fda75f (regulator: core: Replace direct ops->disable usage)
      Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Suggested-by: Markus Pargmann <mpa@pengutronix.de>
      Signed-off-by: Nishanth Menon <nm@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  10. Mar 20, 2014
  11. Mar 19, 2014
  12. Mar 18, 2014