- May 16, 2014
-
-
Matan Barak authored
This patch adds UPDATE_QP SRIOV wrapper support. The mechanism is a general one, but currently only source MAC index changes are allowed for VFs. Signed-off-by:
Matan Barak <matanb@mellanox.com> Signed-off-by:
Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Duan Jiong authored
RFC 4861 states in 7.2.5: The IsRouter flag in the cache entry MUST be set based on the Router flag in the received advertisement. In those cases where the IsRouter flag changes from TRUE to FALSE as a result of this update, the node MUST remove that router from the Default Router List and update the Destination Cache entries for all destinations using that neighbor as a router as specified in Section 7.3.3. This is needed to detect when a node that is used as a router stops forwarding packets due to being configured as a host. Currently, when dealing with NA Message which IsRouter flag changes from TRUE to FALSE, the kernel only removes router from the Default Router List, and don't update the Destination Cache entries. Now in order to update those Destination Cache entries, i introduce function rt6_clean_tohost(). Signed-off-by:
Duan Jiong <duanj.fnst@cn.fujitsu.com> Acked-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 15, 2014
-
-
Cong Wang authored
From: Cong Wang <cwang@twopensource.com> commit 50624c93 (net: Delay default_device_exit_batch until no devices are unregistering) introduced rtnl_lock_unregistering() for default_device_exit_batch(). Same race could happen we when rmmod a driver which calls rtnl_link_unregister() as we call dev->destructor without rtnl lock. For long term, I think we should clean up the mess of netdev_run_todo() and net namespce exit code. Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
Cong Wang <cwang@twopensource.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 14, 2014
-
-
Hannes Frederic Sowa authored
net_get_random_once depends on the static keys infrastructure to patch up the branch to the slow path during boot. This was realized by abusing the static keys api and defining a new initializer to not enable the call site while still indicating that the branch point should get patched up. This was needed to have the fast path considered likely by gcc. The static key initialization during boot up normally walks through all the registered keys and either patches in ideal nops or enables the jump site but omitted that step on x86 if ideal nops where already placed at static_key branch points. Thus net_get_random_once branches not always became active. This patch switches net_get_random_once to the ordinary static_key api and thus places the kernel fast path in the - by gcc considered - unlikely path. Microbenchmarks on Intel and AMD x86-64 showed that the unlikely path actually beats the likely path in terms of cycle cost and that different nop patterns did not make much difference, thus this switch should not be noticeable. Fixes: a48e4292 ("net: introduce new macro net_get_random_once") Reported-by:
Tuomas Räsänen <tuomasjjrasanen@tjjr.fi> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 09, 2014
-
-
Cong Wang authored
Similarly, when CONFIG_SYSCTL is not set, ping_group_range should still work, just that no one can change it. Therefore we should move it out of sysctl_net_ipv4.c. And, it should not share the same seqlock with ip_local_port_range. BTW, rename it to ->ping_group_range instead. Cc: David S. Miller <davem@davemloft.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Reported-by:
Stefan de Konink <stefan@konink.de> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Cong Wang authored
When CONFIG_SYSCTL is not set, ip_local_port_range should still work, just that no one can change it. Therefore we should move it out of sysctl_inet.c. Also, rename it to ->ip_local_ports instead. Cc: David S. Miller <davem@davemloft.net> Cc: Francois Romieu <romieu@fr.zoreil.com> Reported-by:
Stefan de Konink <stefan@konink.de> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 07, 2014
-
-
Daniel Mack authored
If CONFIG_OF is not set, make of_mdiobus_register() call mdiobus_register() instead of returning -ENOSYS. This way, we can just call of_mdiobus_register() from all DT-enabled drivers to handle the compat cases. Signed-off-by:
Daniel Mack <zonque@gmail.com> Suggested-by:
Florian Fainelli <f.fainelli@gmail.com> Acked-by:
Florian Fainelli <f.fainelli@gmail.com> Acked-by:
Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
This reverts commit d2069403, there are no more callers. Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 05, 2014
-
-
Andy King authored
Right now the core vsock module is the owner of the proto family. This means there's nothing preventing the transport module from unloading if there are open sockets, which results in a panic. Fix that by allowing the transport to be the owner, which will refcount it properly. Includes version bump to 1.0.1.0-k Passes checkpatch this time, I swear... Acked-by:
Dmitry Torokhov <dtor@vmware.com> Signed-off-by:
Andy King <acking@vmware.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eliad Peller authored
Add locked-version for cfg80211_sched_scan_stopped. This is used for some users that might want to call it when rtnl is already locked. Fixes: d43c6b6e ("mac80211: reschedule sched scan after HW restart") Cc: stable@vger.kernel.org (3.14+) Signed-off-by:
Eliad Peller <eliadx.peller@intel.com> Signed-off-by:
Johannes Berg <johannes.berg@intel.com>
-
- May 04, 2014
-
-
Peter Hurley authored
This reverts commit 6a20dbd6. Although the commit correctly identifies an unsafe race condition between __tty_buffer_request_room() and flush_to_ldisc(), the commit fixes the race with an unnecessary spinlock in a lockless algorithm. The follow-on commit, "tty: Fix lockless tty buffer race" fixes the race locklessly. Signed-off-by:
Peter Hurley <peter@hurleysoftware.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- May 03, 2014
-
-
Marc Zyngier authored
Commit d57c33c5 (add generic fixmap.h) added (among other similar things) set_fixmap_io to deal with early ioremap of devices. More recently, commit bf4b558e (arm64: add early_ioremap support) converted the arm64 earlyprintk to use set_fixmap_io. A side effect of this conversion is that my virtual machines have stopped booting when I pass "earlyprintk=uart8250-8bit,0x3f8" to the guest kernel. Turns out that the new earlyprintk code doesn't care at all about sub-page offsets, and just assumes that the earlyprintk device will be page-aligned. Obviously, that doesn't play well with the above example. Further investigation shows that set_fixmap_io uses __set_fixmap instead of __set_fixmap_offset. A fix is to introduce a set_fixmap_offset_io that uses the latter, and to remove the superflous call to fix_to_virt (which only returns the value that set_fixmap_io has already given us). With this applied, my VMs are back in business. Tested on a Cortex-A57 platform with kvmtool as platform emulation. Cc: Will Deacon <will.deacon@arm.com> Acked-by:
Mark Salter <msalter@redhat.com> Acked-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Marc Zyngier <marc.zyngier@arm.com> Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com>
-
- May 01, 2014
-
-
H. Peter Anvin authored
This is simpler and cleaner. Depending on architecture, a smart compiler may or may not generate the same code. Acked-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 28, 2014
-
-
Steven Rostedt (Red Hat) authored
A race exists between module loading and enabling of function tracer. CPU 1 CPU 2 ----- ----- load_module() module->state = MODULE_STATE_COMING register_ftrace_function() mutex_lock(&ftrace_lock); ftrace_startup() update_ftrace_function(); ftrace_arch_code_modify_prepare() set_all_module_text_rw(); <enables-ftrace> ftrace_arch_code_modify_post_process() set_all_module_text_ro(); [ here all module text is set to RO, including the module that is loading!! ] blocking_notifier_call_chain(MODULE_STATE_COMING); ftrace_init_module() [ tries to modify code, but it's RO, and fails! ftrace_bug() is called] When this race happens, ftrace_bug() will produces a nasty warning and all of the function tracing features will be disabled until reboot. The simple solution is to treate module load the same way the core kernel is treated at boot. To hardcode the ftrace function modification of converting calls to mcount into nops. This is done in init/main.c there's no reason it could not be done in load_module(). This gives a better control of the changes and doesn't tie the state of the module to its notifiers as much. Ftrace is special, it needs to be treated as such. The reason this would work, is that the ftrace_module_init() would be called while the module is in MODULE_STATE_UNFORMED, which is ignored by the set_all_module_text_ro() call. Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com Reported-by:
Takao Indoh <indou.takao@jp.fujitsu.com> Acked-by:
Rusty Russell <rusty@rustcorp.com.au> Cc: stable@vger.kernel.org # 2.6.38+ Signed-off-by:
Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
On x86 the allocation of irq descriptors may allocate interrupts which are in the range of the GSI interrupts. That's wrong as those interrupts are hardwired and we don't have the irq domain translation like PPC. So one of these interrupts can be hooked up later to one of the devices which are hard wired to it and the io_apic init code for that particular interrupt line happily reuses that descriptor with a completely different configuration so hell breaks lose. Inside x86 we allocate dynamic interrupts from above nr_gsi_irqs, except for a few usage sites which have not yet blown up in our face for whatever reason. But for drivers which need an irq range, like the GPIO drivers, we have no limit in place and we don't want to expose such a detail to a driver. To cure this introduce a function which an architecture can implement to impose a lower bound on the dynamic interrupt allocations. Implement it for x86 and set the lower bound to nr_gsi_irqs, which is the end of the hardwired interrupt space, so all dynamic allocations happen above. That not only allows the GPIO driver to work sanely, it also protects the bogus callsites of create_irq_nr() in hpet, uv, irq_remapping and htirq code. They need to be cleaned up as well, but that's a separate issue. Reported-by:
Jin Yao <yao.jin@linux.intel.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Tested-by:
Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Mathias Nyman <mathias.nyman@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Krogerus Heikki <heikki.krogerus@intel.com> Cc: Linus Walleij <linus.walleij@linaro.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1404241617360.28206@ionos.tec.linutronix.de Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Randy Dunlap authored
Fix new kernel-doc warnings in <linux/interrupt.h>: Warning(include/linux/interrupt.h:219): No description found for parameter 'cpumask' Warning(include/linux/interrupt.h:219): Excess function parameter 'mask' description in 'irq_set_affinity' Warning(include/linux/interrupt.h:236): No description found for parameter 'cpumask' Warning(include/linux/interrupt.h:236): Excess function parameter 'mask' description in 'irq_force_affinity' Signed-off-by:
Randy Dunlap <rdunlap@infradead.org> Link: http://lkml.kernel.org/r/535DD2FD.7030804@infradead.org Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Will Deacon authored
The asm-generic, big-endian version of zero_bytemask creates a mask of bytes preceding the first zero-byte by left shifting ~0ul based on the position of the first zero byte. Unfortunately, if the first (top) byte is zero, the output of prep_zero_mask has only the top bit set, resulting in undefined C behaviour as we shift left by an amount equal to the width of the type. As it happens, GCC doesn't manage to spot this through the call to fls(), but the issue remains if architectures choose to implement their shift instructions differently. An example would be arch/arm/ (AArch32), where LSL Rd, Rn, #32 results in Rd == 0x0, whilst on arch/arm64 (AArch64) LSL Xd, Xn, #64 results in Xd == Xn. Rather than check explicitly for the problematic shift, this patch adds an extra shift by 1, replacing fls with __fls. Since zero_bytemask is never called with a zero argument (has_zero() is used to check the data first), we don't need to worry about calling __fls(0), which is undefined. Cc: <stable@vger.kernel.org> Cc: Victor Kamensky <victor.kamensky@linaro.org> Signed-off-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Apr 25, 2014
-
-
Manfred Schlaegl authored
The race was introduced while development of linux-3.11 by e8437d7e and e9975fde. Originally it was found and reproduced on linux-3.12.15 and linux-3.12.15-rt25, by sending 500 byte blocks with 115kbaud to the target uart in a loop with 100 milliseconds delay. In short: 1. The consumer flush_to_ldisc is on to remove the head tty_buffer. 2. The producer adds a number of bytes, so that a new tty_buffer must be allocated and added by __tty_buffer_request_room. 3. The consumer removes the head tty_buffer element, without handling newly committed data. Detailed example: * Initial buffer: * Head, Tail -> 0: used=250; commit=250; read=240; next=NULL * Consumer: ''flush_to_ldisc'' * consumed 10 Byte * buffer: * Head, Tail -> 0: used=250; commit=250; read=250; next=NULL {{{ count = head->commit - head->read; // count = 0 if (!count) { // enter // INTERRUPTED BY PRODUCER -> if (head->next == NULL) break; buf->head = head->next; tty_buffer_free(port, head); continue; } }}} * Producer: tty_insert_flip_... 10 bytes + tty_flip_buffer_push * buffer: * Head, Tail -> 0: used=250; commit=250; read=250; next=NULL * added 6 bytes: head-element filled to maximum. * buffer: * Head, Tail -> 0: used=256; commit=250; read=250; next=NULL * added 4 bytes: __tty_buffer_request_room is called * buffer: * Head -> 0: used=256; commit=256; read=250; next=1 * Tail -> 1: used=4; commit=0; read=250 next=NULL * push (tty_flip_buffer_push) * buffer: * Head -> 0: used=256; commit=256; read=250; next=1 * Tail -> 1: used=4; commit=4; read=250 next=NULL * Consumer {{{ count = head->commit - head->read; if (!count) { // INTERRUPTED BY PRODUCER <- if (head->next == NULL) // -> no break break; buf->head = head->next; tty_buffer_free(port, head); // ERROR: tty_buffer head freed -> 6 bytes lost continue; } }}} This patch reintroduces a spin_lock to protect this case. Perhaps later a lock-less solution could be found. Signed-off-by:
Manfred Schlaegl <manfred.schlaegl@gmx.at> Cc: stable <stable@vger.kernel.org> # 3.11 Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- Apr 24, 2014
-
-
Rob Herring authored
Currently we get the following kind of errors if we try to use interrupt phandles to irqchips that have not yet initialized: irq: no irq domain found for /ocp/pinmux@48002030 ! ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at drivers/of/platform.c:171 of_device_alloc+0x144/0x184() Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.12.0-00038-g42a9708 #1012 (show_stack+0x14/0x1c) (dump_stack+0x6c/0xa0) (warn_slowpath_common+0x64/0x84) (warn_slowpath_null+0x1c/0x24) (of_device_alloc+0x144/0x184) (of_platform_device_create_pdata+0x44/0x9c) (of_platform_bus_create+0xd0/0x170) (of_platform_bus_create+0x12c/0x170) (of_platform_populate+0x60/0x98) This is because we're wrongly trying to populate resources that are not yet available. It's perfectly valid to create irqchips dynamically, so let's fix up the issue by resolving the interrupt resources when platform_get_irq is called. And then we also need to accept the fact that some irqdomains do not exist that early on, and only get initialized later on. So we can make the current WARN_ON into just into a pr_debug(). We still attempt to populate irq resources when we create the devices. This allows current drivers which don't use platform_get_irq to continue to function. Once all drivers are fixed, this code can be removed. Suggested-by:
Russell King <linux@arm.linux.org.uk> Signed-off-by:
Rob Herring <robh@kernel.org> Signed-off-by:
Tony Lindgren <tony@atomide.com> Tested-by:
Tony Lindgren <tony@atomide.com> Cc: stable@vger.kernel.org # v3.10+ Signed-off-by:
Grant Likely <grant.likely@linaro.org>
-
Grygorii Strashko authored
This fixes a regression on Keystone 2 platforms caused by patch 57303488 "usb: dwc3: adapt dwc3 core to use Generic PHY Framework" which adds optional support of generic phy in DWC3 core. On Keystone 2 platforms the USB is not working now because CONFIG_GENERIC_PHY isn't set and, as result, Generic PHY APIs stubs return -ENOSYS always. The log shows: dwc3 2690000.dwc3: failed to initialize core dwc3: probe of 2690000.dwc3 failed with error -38 Hence, fix it by making NULL a valid phy reference in Generic PHY APIs stubs in the same way as it was done by the patch 04c2faca "drivers: phy: Make NULL a valid phy reference". Acked-by:
Felipe Balbi <balbi@ti.com> Acked-by:
Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by:
Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by:
Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric W. Biederman authored
netlink_net_capable - The common case use, for operations that are safe on a network namespace netlink_capable - For operations that are only known to be safe for the global root netlink_ns_capable - The general case of capable used to handle special cases __netlink_ns_capable - Same as netlink_ns_capable except taking a netlink_skb_parms instead of the skbuff of a netlink message. Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
sk_net_capable - The common case, operations that are safe in a network namespace. sk_capable - Operations that are not known to be safe in a network namespace sk_ns_capable - The general case for special cases. Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Eric W. Biederman authored
The permission check in sock_diag_put_filterinfo is wrong, and it is so removed from it's sources it is not clear why it is wrong. Move the computation into packet_diag_dump and pass a bool of the result into sock_diag_filterinfo. This does not yet correct the capability check but instead simply moves it to make it clear what is going on. Reported-by:
Andy Lutomirski <luto@amacapital.net> Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Stephen Warren authored
The Tegra124 clock DT binding currently provides 3 clocks that don't actually exist; 2 for NAND and one for UART5/UARTE. Delete these. While this is technically an incompatible DT ABI change, nothing could have used these clock IDs for anything practical, since the HW doesn't exist. Cc: <stable@vger.kernel.org> Signed-off-by:
Stephen Warren <swarren@nvidia.com> Signed-off-by:
Arnd Bergmann <arnd@arndb.de>
-
- Apr 23, 2014
-
-
Jeff Layton authored
File-private locks have been re-christened as "open file description" locks. Finish the symbol name cleanup in the internal implementation. Signed-off-by:
Jeff Layton <jlayton@redhat.com>
-
Mathieu Desnoyers authored
In the following commit: commit 57673c2b Author: Rusty Russell <rusty@rustcorp.com.au> Date: Mon Mar 31 14:39:57 2014 +1030 Use 'E' instead of 'X' for unsigned module taint flag. One site has been forgotten in trace events module.h. Signed-off-by:
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Steven Rostedt <rostedt@goodmis.org> CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
- Apr 22, 2014
-
-
Andrew Lutomirski authored
The caller needs capabilities on the namespace being queried, not on their own namespace. This is a security bug, although it likely has only a minor impact. Cc: stable@vger.kernel.org Signed-off-by:
Andy Lutomirski <luto@amacapital.net> Acked-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Jeff Layton authored
File-private locks have been merged into Linux for v3.15, and *now* people are commenting that the name and macro definitions for the new file-private locks suck. ...and I can't even disagree. The names and command macros do suck. We're going to have to live with these for a long time, so it's important that we be happy with the names before we're stuck with them. The consensus on the lists so far is that they should be rechristened as "open file description locks". The name isn't a big deal for the kernel, but the command macros are not visually distinct enough from the traditional POSIX lock macros. The glibc and documentation folks are recommending that we change them to look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a programmer will typo one of the commands wrong, and also makes it easier to spot this difference when reading code. This patch makes the following changes that I think are necessary before v3.15 ships: 1) rename the command macros to their new names. These end up in the uapi headers and so are part of the external-facing API. It turns out that glibc doesn't actually use the fcntl.h uapi header, but it's hard to be sure that something else won't. Changing it now is safest. 2) make the the /proc/locks output display these as type "OFDLCK" Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Carlos O'Donell <carlos@redhat.com> Cc: Stefan Metzmacher <metze@samba.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Frank Filz <ffilzlnx@mindspring.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Jeff Layton <jlayton@redhat.com>
-
- Apr 20, 2014
-
-
Hans de Goede authored
On some newer laptops with a trackpoint the physical buttons for the trackpoint have been removed to allow for a larger touchpad. On these laptops the buttonpad has clearly marked areas on the top which are to be used as trackpad buttons. Users of the event device-node need to know about this, so that they can properly interpret BTN_LEFT events as being a left / right / middle click depending on where on the button pad the clicking finger is. This commits adds a INPUT_PROP_TOPBUTTONPAD device property which drivers for such buttonpads will use to signal to the user that this buttonpad not only has the normal bottom button area, but also a top button area. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com>
-
Hans de Goede authored
serio devices exposed via platform firmware interfaces such as ACPI may provide additional identifying information of use to userspace. We don't associate the serio devices with the firmware device (we don't set it as parent), so there's no way for userspace to make use of this information. We cannot change the parent for serio devices instantiated though a firmware interface as that would break suspend / resume ordering. Therefore this patch adds a new firmware_id sysfs attribute so that userspace can get a string from there with any additional identifying information the firmware interface may provide. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Acked-by:
Peter Hutterer <peter.hutterer@who-t.net> Signed-off-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com>
-
- Apr 19, 2014
-
-
Mel Gorman authored
David Vrabel identified a regression when using automatic NUMA balancing under Xen whereby page table entries were getting corrupted due to the use of native PTE operations. Quoting him Xen PV guest page tables require that their entries use machine addresses if the preset bit (_PAGE_PRESENT) is set, and (for successful migration) non-present PTEs must use pseudo-physical addresses. This is because on migration MFNs in present PTEs are translated to PFNs (canonicalised) so they may be translated back to the new MFN in the destination domain (uncanonicalised). pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma() set and clear the _PAGE_PRESENT bit using pte_set_flags(), pte_clear_flags(), etc. In a Xen PV guest, these functions must translate MFNs to PFNs when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting _PAGE_PRESENT. His suggested fix converted p[te|md]_[set|clear]_flags to using paravirt-friendly ops but this is overkill. He suggested an alternative of using p[te|md]_modify in the NUMA page table operations but this is does more work than necessary and would require looking up a VMA for protections. This patch modifies the NUMA page table operations to use paravirt friendly operations to set/clear the flags of interest. Unfortunately this will take a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not see a way around it that does not break Xen. Signed-off-by:
Mel Gorman <mgorman@suse.de> Acked-by:
David Vrabel <david.vrabel@citrix.com> Tested-by:
David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Peter Zijlstra authored
Stick in a comment before someone else tries to fix the sparse warning this generates. Suggested-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-o2ro6f3vkxklni0bc8f7m68s@git.kernel.org Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Viresh Kumar authored
shiraz.hashim@st.com email-id doesn't exist anymore as he has left the company. Replace ST's id with shiraz.linux.kernel@gmail.com. It also updates .mailmap file to fix address for 'git shortlog'. Signed-off-by:
Viresh Kumar <viresh.kumar@linaro.org> Cc: Shiraz Hashim <shiraz.linux.kernel@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Vlad Yasevich authored
Currently, it is possible to create an SCTP socket, then switch auth_enable via sysctl setting to 1 and crash the system on connect: Oops[#1]: CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1 task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000 [...] Call Trace: [<ffffffff8043c4e8>] sctp_auth_asoc_set_default_hmac+0x68/0x80 [<ffffffff8042b300>] sctp_process_init+0x5e0/0x8a4 [<ffffffff8042188c>] sctp_sf_do_5_1B_init+0x234/0x34c [<ffffffff804228c8>] sctp_do_sm+0xb4/0x1e8 [<ffffffff80425a08>] sctp_endpoint_bh_rcv+0x1c4/0x214 [<ffffffff8043af68>] sctp_rcv+0x588/0x630 [<ffffffff8043e8e8>] sctp6_rcv+0x10/0x24 [<ffffffff803acb50>] ip6_input+0x2c0/0x440 [<ffffffff8030fc00>] __netif_receive_skb_core+0x4a8/0x564 [<ffffffff80310650>] process_backlog+0xb4/0x18c [<ffffffff80313cbc>] net_rx_action+0x12c/0x210 [<ffffffff80034254>] __do_softirq+0x17c/0x2ac [<ffffffff800345e0>] irq_exit+0x54/0xb0 [<ffffffff800075a4>] ret_from_irq+0x0/0x4 [<ffffffff800090ec>] rm7k_wait_irqoff+0x24/0x48 [<ffffffff8005e388>] cpu_startup_entry+0xc0/0x148 [<ffffffff805a88b0>] start_kernel+0x37c/0x398 Code: dd0900b8 000330f8 0126302d <dcc60000> 50c0fff1 0047182a a48306a0 03e00008 00000000 ---[ end trace b530b0551467f2fd ]--- Kernel panic - not syncing: Fatal exception in interrupt What happens while auth_enable=0 in that case is, that ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs() when endpoint is being created. After that point, if an admin switches over to auth_enable=1, the machine can crash due to NULL pointer dereference during reception of an INIT chunk. When we enter sctp_process_init() via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk, the INIT verification succeeds and while we walk and process all INIT params via sctp_process_param() we find that net->sctp.auth_enable is set, therefore do not fall through, but invoke sctp_auth_asoc_set_default_hmac() instead, and thus, dereference what we have set to NULL during endpoint initialization phase. The fix is to make auth_enable immutable by caching its value during endpoint initialization, so that its original value is being carried along until destruction. The bug seems to originate from the very first days. Fix in joint work with Daniel Borkmann. Reported-by:
Joshua Kinard <kumba@gentoo.org> Signed-off-by:
Vlad Yasevich <vyasevic@redhat.com> Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Tested-by:
Joshua Kinard <kumba@gentoo.org> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Apr 18, 2014
-
-
Dan Williams authored
The AHCI spec allows implementations to issue commands in tag order rather than FIFO order: 5.3.2.12 P:SelectCmd HBA sets pSlotLoc = (pSlotLoc + 1) mod (CAP.NCS + 1) or HBA selects the command to issue that has had the PxCI bit set to '1' longer than any other command pending to be issued. The result is that commands posted sequentially (time-wise) may play out of sequence when issued by hardware. This behavior has likely been hidden by drives that arrange for commands to complete in issue order. However, it appears recent drives (two from different vendors that we have found so far) inflict out-of-order completions as a matter of course. So, we need to take care to maintain ordered submission, otherwise we risk triggering a drive to fall out of sequential-io automation and back to random-io processing, which incurs large latency and degrades throughput. This issue was found in simple benchmarks where QD=2 seq-write performance was 30-50% *greater* than QD=32 seq-write performance. Tagging for -stable and making the change globally since it has a low risk-to-reward ratio. Also, word is that recent versions of an unnamed OS also does it this way now. So, drives in the field are already experienced with this tag ordering scheme. Cc: <stable@vger.kernel.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ed Ciechanowski <ed.ciechanowski@intel.com> Reviewed-by:
Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
Tejun Heo <tj@kernel.org>
-
Tim Kryger authored
Drivers that call regulator_get_optional are tolerant to the absence of that regulator. By modifying the value returned from the stub function to match that seen when a regulator isn't present, callers can wrap the regulator logic with an IS_ERR based conditional even if they happen to call regulator_is_supported_voltage. This improves efficiency as well as eliminates the possibility for a very subtle bug. Signed-off-by:
Tim Kryger <tim.kryger@linaro.org> Reviewed-by:
Alex Elder <elder@linaro.org> Signed-off-by:
Mark Brown <broonie@linaro.org>
-
Alexander Shiyan authored
Add an empty version of of_find_node_by_path(). This fixes following build error for asoc tree: sound/soc/fsl/fsl_ssi.c: In function 'fsl_ssi_probe': sound/soc/fsl/fsl_ssi.c:1471:2: error: implicit declaration of function 'of_find_node_by_path' [-Werror=implicit-function-declaration] sprop = of_get_property(of_find_node_by_path("/"), "compatible", NULL); Reported-by:
Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by:
Alexander Shiyan <shc_work@mail.ru> Signed-off-by:
Rob Herring <robh@kernel.org>
-
Daniel Vetter authored
This is leftover stuff from my previous doc round which I kinda wanted to do but didn't yet due to rebase hell. The modeset helpers and the probing helpers a independent and e.g. i915 uses the probing stuff but has its own modeset infrastructure. It hence makes to split this up. While at it add a DOC: comment for the probing libraray. It would be rather neat to pull some of the DocBook documenting these two helpers into in-line DOC: comments. But unfortunately kerneldoc doesn't support markdown or something similar to make nice-looking documentation, so the current state is better. Signed-off-by:
Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by:
Dave Airlie <airlied@redhat.com>
-
- Apr 17, 2014
-
-
Thomas Gleixner authored
The current implementation of irq_set_affinity() refuses rightfully to route an interrupt to an offline cpu. But there is a special case, where this is actually desired. Some of the ARM SoCs have per cpu timers which require setting the affinity during cpu startup where the cpu is not yet in the online mask. If we can't do that, then the local timer interrupt for the about to become online cpu is routed to some random online cpu. The developers of the affected machines tried to work around that issue, but that results in a massive mess in that timer code. We have a yet unused argument in the set_affinity callbacks of the irq chips, which I added back then for a similar reason. It was never required so it got not used. But I'm happy that I never removed it. That allows us to implement a sane handling of the above scenario. So the affected SoC drivers can add the required force handling to their interrupt chip, switch the timer code to irq_force_affinity() and things just work. This does not affect any existing user of irq_set_affinity(). Tagged for stable to allow a simple fix of the affected SoC clock event drivers. Reported-and-tested-by:
Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Tomasz Figa <t.figa@samsung.com>, Cc: Daniel Lezcano <daniel.lezcano@linaro.org>, Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: linux-arm-kernel@lists.infradead.org, Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20140416143315.717251504@linutronix.de Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Corey Minyard authored
Convert some ints to bools. Signed-off-by:
Corey Minyard <cminyard@mvista.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-