  5. Jun 28, 2012
    • x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range · e7b52ffd
      Alex Shi authored
      x86 has no instruction-level support for flush_tlb_range; currently
      flush_tlb_range is implemented by flushing the entire TLB. That is not
      the best solution for all scenarios. If we instead use 'invlpg' to
      flush just a few lines from the TLB, we gain performance from later
      accesses to the TLB lines that remain.
      
      But the 'invlpg' instruction itself is costly. Its execution time is
      comparable to a cr3 rewrite, and even a bit longer on SNB CPUs.
      
      So, on a CPU with 512 4KB TLB entries, the balance point is at:
      	(512 - X) * 100ns(assumed TLB refill cost) =
      		X(TLB flush entries) * 100ns(assumed invlpg cost)
      
      Here, X is 256, that is 1/2 of 512 entries.
      
      But with the CPU's prefetcher and page-miss-handling unit, the actual
      TLB refill cost is far lower than 100ns for sequential accesses. And
      two HT siblings in one core make memory access even faster when they
      are accessing the same memory. So, in this patch, the change only
      applies when the number of target entries is less than 1/16 of the
      active TLB entries. I have no data to support the '1/16' ratio
      specifically, so any suggestions are welcome.
      
      As for hugetlb, presumably due to the smaller page tables and fewer
      active TLB entries, I saw no benefit in my benchmark, so it is not
      optimized for now.
      
      My micro-benchmark shows that in the ideal scenario read performance
      improves by 70 percent, and in the worst scenario read/write
      performance matches the unpatched 3.4-rc4 kernel.
      
      Here is the read data on my 2P * 4-core * HT NHM-EP machine, with THP
      set to 'always':
      
      multi-thread testing; the '-t' parameter is the thread count:
      	       	        with patch   unpatched 3.4-rc4
      ./mprotect -t 1           14ns		24ns
      ./mprotect -t 2           13ns		22ns
      ./mprotect -t 4           12ns		19ns
      ./mprotect -t 8           14ns		16ns
      ./mprotect -t 16          28ns		26ns
      ./mprotect -t 32          54ns		51ns
      ./mprotect -t 128         200ns		199ns
      
      Single process with sequential flushing and memory access:
      
      		       	with patch   unpatched 3.4-rc4
      ./mprotect		    7ns			11ns
      ./mprotect -p 4096  -l 8 -n 10240
      			    21ns		21ns
      
      [ hpa: http://lkml.kernel.org/r/1B4B44D9196EFF41AE41FDA404FC0A100BFF94@SHSMSX101.ccr.corp.intel.com
        has additional performance numbers. ]
      
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Link: http://lkml.kernel.org/r/1340845344-27557-3-git-send-email-alex.shi@intel.com
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      e7b52ffd
  6. Jun 25, 2012
    • x86/uv: Work around UV2 BAU hangs · 8b6e511e
      Cliff Wickman authored
      
      On SGI's UV2 the BAU (Broadcast Assist Unit) driver can hang
      under a heavy load. To cure this:
      
      - Disable the UV2 extended status mode (see UV2_EXT_SHFT), as
        this mode changes BAU behavior in more ways than just delivering
        an extra bit of status.  Revert status to just two meaningful bits,
        like UV1.

      - Do not use IPI-style resets on UV2.  Just give up on the request,
        whatever the reason it failed, and let it be accomplished with
        the legacy IPI method.
      
      - Do not use an alternate sending descriptor (the former UV2 workaround:
        the bcp->using_desc and handle_uv2_busy() code).  Just disable the
        use of the BAU for a period of time, in favor of the legacy IPI
        method, when the h/w bug leaves a descriptor busy.
      
        -- new tunable: giveup_limit determines the threshold at which a hub is
           so plugged that it should do all requests with the legacy IPI method for a
           period of time
        -- generalize disable_for_congestion() (renamed disable_for_period()) for
           use whenever a hub should avoid using the BAU for a period of time
      
      Also:
      
       - Fix find_another_by_swack(), which is part of the UV2 bug workaround
      
       - Correct and clarify the statistics (new stats s_overipilimit, s_giveuplimit,
         s_enters, s_ipifordisabled, s_plugged, s_congested)
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120622131459.GC31884@sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      8b6e511e
    • x86/uv: Implement UV BAU runtime enable and disable control via /proc/sgi_uv/ · 26ef8577
      Cliff Wickman authored
      
      This patch enables the BAU to be turned on or off dynamically.
      
        echo "on"  > /proc/sgi_uv/ptc_statistics
        echo "off" > /proc/sgi_uv/ptc_statistics
      
      The system may be booted with or without the nobau option.
      
      Whether the BAU is currently off can be seen in the /proc file,
      normally via the baustats script. Each CPU shows a 1 in the bauoff
      field if the BAU was turned off, so baustats gives a count of CPUs
      that have it off.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120622131330.GB31884@sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      26ef8577
    • x86/uv: Fix the UV BAU destination timeout period · 11cab711
      Cliff Wickman authored
      
      Correct the calculation of a destination timeout period, which
      is used to distinguish between a destination timeout and the
      situation where all the target software ack resources are full
      and a request is returned immediately.
      
      The problem is that integer arithmetic was overflowing, yielding
      a very large result.
      
      Without this fix, destination timeouts are misidentified as resource
      'plugged' events, and an IPI method of resource releasing is
      unnecessarily employed.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Link: http://lkml.kernel.org/r/20120622131212.GA31884@sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      11cab711
  9. Jun 08, 2012
    • x86/uv: Fix UV2 BAU legacy mode · d5d2d2ee
      Cliff Wickman authored
      
      The SGI Altix UV2 BAU (Broadcast Assist Unit), as used for
      TLB shootdown (selective broadcast mode), always uses the UV2
      broadcast descriptor format. There is no need to clear the
      'legacy' (UV1) mode bit, because the hardware always uses UV2
      mode for selective broadcast.

      But the BIOS uses general broadcast and legacy mode, and the
      hardware pays attention to the legacy mode bit for general
      broadcast. So the kernel must not clear that mode bit.
      
      Signed-off-by: Cliff Wickman <cpw@sgi.com>
      Cc: <stable@kernel.org>
      Link: http://lkml.kernel.org/r/E1SccoO-0002Lh-Cb@eag09.americas.sgi.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d5d2d2ee
    • x86/apic: Make cpu_mask_to_apicid() operations return error code · ff164324
      Alexander Gordeev authored
      
      The current cpu_mask_to_apicid() and cpu_mask_to_apicid_and()
      implementations have a few shortcomings:
      
      1. A value returned by cpu_mask_to_apicid() is written to
      hardware registers unconditionally. Should BAD_APICID ever get
      returned, it would be written to the hardware too. But the value
      of BAD_APICID is not universal across all hardware in all modes
      and might cause unexpected results, i.e. interrupts might get
      routed to CPUs that are not configured to receive them.

      2. Because the value of BAD_APICID is not universal, it is
      counterintuitive to return it for hardware where it does not
      make sense (i.e. x2apic).
      
      3. The cpu_mask_to_apicid_and() operation is thought of as a
      complement to cpu_mask_to_apicid() that merely applies an AND mask
      on top of the cpumask being passed. Yet, as a consequence of
      commit 18374d89, the two operations are inconsistent in that:
        cpu_mask_to_apicid() should not be given an offline CPU in the cpumask
        cpu_mask_to_apicid_and() should not fail and return BAD_APICID
      These limitations are impossible to discern just from looking at
      the operations' prototypes.

      Most of these shortcomings are resolved by returning an error
      code instead of BAD_APICID. As a result, faults are reported
      back early, rather than leaving open the possibility of
      unexpected behaviour (as in case 1 above).
      
      The only exception is the setup_timer_IRQ0_pin() routine. Although
      it obviously runs counter to this fix, its existing behaviour is
      preserved so as not to break the fragile check_timer(), and it
      would be better addressed in a separate fix.
      
      Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
      Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      ff164324
  22. Feb 04, 2012
    • gpio: Add a driver for Sodaville GPIO controller · b43ab901
      Sebastian Andrzej Siewior authored
      
      Sodaville has a GPIO controller behind the PCI bus. To my surprise,
      it is not the same as the one on PXA.
      
      The interrupt & gpio chip can be referenced from the device tree like
      from any other driver. Unfortunately, a driver which uses the gpio
      interrupt has to use irq_of_parse_and_map() instead of
      platform_get_irq(). The problem is that the platform device (which is
      created from the device tree) is most likely created before the
      interrupt chip is registered, and therefore irq_of_parse_and_map() fails.
      
      In theory the driver works as a module. In reality, most of the irq
      functions are not exported to modules, and it would be possible for
      _this_ module to be unloaded while the IRQs it provides are still in use.
      
      Signed-off-by: Hans J. Koch <hjk@linutronix.de>
      [torbenh@linutronix.de: make it work after the irq namespace cleanup,
                              add some device tree entries.]
      Signed-off-by: Torben Hohn <torbenh@linutronix.de>
      [bigeasy@linutronix.de: convert to generic irq & gpio chip]
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      [grant.likely@secretlab.ca: depend on x86 to avoid irq_domain breakage]
      Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
      b43ab901