aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-06-06x86/platform/uv: Use apic_ack_irq()Thomas Gleixner1-6/+1
To address the EBUSY fail of interrupt affinity settings in case that the previous setting has not been cleaned up yet, use the new apic_ack_irq() function instead of the special uv_ack_apic() implementation which is merily a wrapper around ack_APIC_irq(). Preparatory change for the real fix Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup") Reported-by: Song Liu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tariq Toukan <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/ioapic: Use apic_ack_irq()Thomas Gleixner1-1/+1
To address the EBUSY fail of interrupt affinity settings in case that the previous setting has not been cleaned up yet, use the new apic_ack_irq() function instead of directly invoking ack_APIC_irq(). Preparatory change for the real fix Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup") Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tariq Toukan <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06irq_remapping: Use apic_ack_irq()Thomas Gleixner4-9/+2
To address the EBUSY fail of interrupt affinity settings in case that the previous setting has not been cleaned up yet, use the new apic_ack_irq() function instead of the special ir_ack_apic_edge() implementation which is merily a wrapper around ack_APIC_irq(). Preparatory change for the real fix Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup") Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tariq Toukan <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/apic: Provide apic_ack_irq()Thomas Gleixner2-2/+9
apic_ack_edge() is explicitely for handling interrupt affinity cleanup when interrupt remapping is not available or disable. Remapped interrupts and also some of the platform specific special interrupts, e.g. UV, invoke ack_APIC_irq() directly. To address the issue of failing an affinity update with -EBUSY the delayed affinity mechanism can be reused, but ack_APIC_irq() does not handle that. Adding this to ack_APIC_irq() is not possible, because that function is also used for exceptions and directly handled interrupts like IPIs. Create a new function, which just contains the conditional invocation of irq_move_irq() and the final ack_APIC_irq(). Reuse the new function in apic_ack_edge(). Preparatory change for the real fix. Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup") Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tariq Toukan <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06genirq/migration: Avoid out of line call if pending is not setThomas Gleixner2-5/+7
The upcoming fix for the -EBUSY return from affinity settings requires to use the irq_move_irq() functionality even on irq remapped interrupts. To avoid the out of line call, move the check for the pending bit into an inline helper. Preparatory change for the real fix. No functional change. Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup") Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tariq Toukan <[email protected]> Cc: Dou Liyang <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06genirq/generic_pending: Do not lose pending affinity updateThomas Gleixner1-7/+19
The generic pending interrupt mechanism moves interrupts from the interrupt handler on the original target CPU to the new destination CPU. This is required for x86 and ia64 due to the way the interrupt delivery and acknowledge works if the interrupts are not remapped. However that update can fail for various reasons. Some of them are valid reasons to discard the pending update, but the case, when the previous move has not been fully cleaned up is not a legit reason to fail. Check the return value of irq_do_set_affinity() for -EBUSY, which indicates a pending cleanup, and rearm the pending move in the irq dexcriptor so it's tried again when the next interrupt arrives. Fixes: 996c591227d9 ("x86/irq: Plug vector cleanup race") Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tariq Toukan <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/apic/vector: Prevent hlist corruption and leaksThomas Gleixner1-0/+9
Several people observed the WARN_ON() in irq_matrix_free() which triggers when the caller tries to free an vector which is not in the allocation range. Song provided the trace information which allowed to decode the root cause. The rework of the vector allocation mechanism failed to preserve a sanity check, which prevents setting a new target vector/CPU when the previous affinity change has not fully completed. As a result a half finished affinity change can be overwritten, which can cause the leak of a irq descriptor pointer on the previous target CPU and double enqueue of the hlist head into the cleanup lists of two or more CPUs. After one CPU cleaned up its vector the next CPU will invoke the cleanup handler with vector 0, which triggers the out of range warning in the matrix allocator. Prevent this by checking the apic_data of the interrupt whether the move_in_progress flag is false and the hlist node is not hashed. Return -EBUSY if not. This prevents the damage and restores the behaviour before the vector allocation rework, but due to other changes in that area it also widens the chance that user space can observe -EBUSY. In theory this should be fine, but actually not all user space tools handle -EBUSY correctly. Addressing that is not part of this fix, but will be addressed in follow up patches. Fixes: 69cde0004a4b ("x86/vector: Use matrix allocator for vector assignment") Reported-by: Dmitry Safonov <[email protected]> Reported-by: Tariq Toukan <[email protected]> Reported-by: Song Liu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Song Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: Mike Travis <[email protected]> Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06enic: fix UDP rss bitsGovindarajulu Varadarajan7-21/+67
In commit 48398b6e7065 ("enic: set UDP rss flag") driver needed to set a single bit to enable UDP rss. This is changed to two bit. One for UDP IPv4 and other bit for UDP IPv6. The hardware which supports this is not released yet. When released, driver should set 2 bit to enable UDP rss for both IPv4 and IPv6. Also add spinlock around vnic_dev_capable_rss_hash_type(). Signed-off-by: Govindarajulu Varadarajan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-06objtool: Fix GCC 8 cold subfunction detection for aliased functionsJosh Poimboeuf1-0/+22
The kbuild test robot reported the following issue: kernel/time/posix-stubs.o: warning: objtool: sys_ni_posix_timers.cold.1()+0x0: unreachable instruction This file creates symbol aliases for the sys_ni_posix_timers() function. So there are multiple ELF function symbols for the same function: 23: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 __x64_sys_timer_create 24: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 sys_ni_posix_timers 25: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 __ia32_sys_timer_create 26: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 __x64_sys_timer_gettime Here's the corresponding cold subfunction: 11: 0000000000000000 45 FUNC LOCAL DEFAULT 6 sys_ni_posix_timers.cold.1 When analyzing overlapping functions, objtool only looks at the first one in the symbol list. The rest of the functions are basically ignored because they point to instructions which have already been analyzed. So in this case it analyzes the __x64_sys_timer_create() function, but then it fails to recognize that its cold subfunction is sys_ni_posix_timers.cold.1(), because the names are different. Make the subfunction detection a little smarter by associating each subfunction with the first function which jumps to it, since that's the one which will be analyzed. Unfortunately we still have to leave the original subfunction detection code in place, thanks to GCC switch tables. (See the comment for more details.) Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Reported-by: kbuild test robot <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lkml.kernel.org/r/d3ba52662cbc8e3a64a3b64d44b4efc5674fd9ab.1527855808.git.jpoimboe@redhat.com
2018-06-06x86/bugs: Switch the selection of mitigation from CPU vendor to CPU featuresKonrad Rzeszutek Wilk1-8/+3
Both AMD and Intel can have SPEC_CTRL_MSR for SSBD. However AMD also has two more other ways of doing it - which are !SPEC_CTRL MSR ways. Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Kees Cook <[email protected]> Cc: [email protected] Cc: KarimAllah Ahmed <[email protected]> Cc: [email protected] Cc: "H. Peter Anvin" <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Woodhouse <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/bugs: Add AMD's SPEC_CTRL MSR usageKonrad Rzeszutek Wilk5-10/+27
The AMD document outlining the SSBD handling 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf mentions that if CPUID 8000_0008.EBX[24] is set we should be using the SPEC_CTRL MSR (0x48) over the VIRT SPEC_CTRL MSR (0xC001_011f) for speculative store bypass disable. This in effect means we should clear the X86_FEATURE_VIRT_SSBD flag so that we would prefer the SPEC_CTRL MSR. See the document titled: 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf A copy of this document is available at https://bugzilla.kernel.org/show_bug.cgi?id=199889 Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Janakarajan Natarajan <[email protected]> Cc: [email protected] Cc: KarimAllah Ahmed <[email protected]> Cc: [email protected] Cc: Joerg Roedel <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/bugs: Add AMD's variant of SSB_NOKonrad Rzeszutek Wilk3-2/+4
The AMD document outlining the SSBD handling 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf mentions that the CPUID 8000_0008.EBX[26] will mean that the speculative store bypass disable is no longer needed. A copy of this document is available at: https://bugzilla.kernel.org/show_bug.cgi?id=199889 Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Janakarajan Natarajan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Woodhouse <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/vector: Fix the args of vector_alloc tracepointDou Liyang1-1/+1
The vector_alloc tracepont reversed the reserved and ret aggs, that made the trace print wrong. Exchange them. Fixes: 8d1e3dca7de6 ("x86/vector: Add tracepoints for vector management") Signed-off-by: Dou Liyang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/idt: Simplify the idt_setup_apic_and_irq_gates()Dou Liyang1-5/+2
The idt_setup_apic_and_irq_gates() sets the gates from FIRST_EXTERNAL_VECTOR up to FIRST_SYSTEM_VECTOR first. then secondly, from FIRST_SYSTEM_VECTOR to NR_VECTORS, it takes both APIC=y and APIC=n into account. But for APIC=n, the FIRST_SYSTEM_VECTOR is equal to NR_VECTORS, all vectors has been set at the first step. Simplify the second step, make it just work for APIC=y. Signed-off-by: Dou Liyang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/platform/uv: Remove extra parenthesesVarsha Rao1-1/+1
Remove extra parentheses to fix the extraneous parentheses clang warning. Suggested-by: Lukas Bulwahn <[email protected]> Signed-off-by: Varsha Rao <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Nicholas Mc Guire <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86/mm: Decouple dynamic __PHYSICAL_MASK from AMD SMEKirill A. Shutemov5-1/+24
AMD SME claims one bit from physical address to indicate whether the page is encrypted or not. To achieve that we clear out the bit from __PHYSICAL_MASK. The capability to adjust __PHYSICAL_MASK is required beyond AMD SME. For instance for upcoming Intel Multi-Key Total Memory Encryption. Factor it out into a separate feature with own Kconfig handle. It also helps with overhead of AMD SME. It saves more than 3k in .text on defconfig + AMD_MEM_ENCRYPT: add/remove: 3/2 grow/shrink: 5/110 up/down: 189/-3753 (-3564) We would need to return to this once we have infrastructure to patch constants in code. That's good candidate for it. Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Cc: [email protected] Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86: Mark native_set_p4d() as __always_inlineArnd Bergmann1-2/+2
When CONFIG_OPTIMIZE_INLINING is enabled, the function native_set_p4d() may not be fully inlined into the caller, resulting in a false-positive warning about an access to the __pgtable_l5_enabled variable from a non-__init function, despite the original caller being an __init function: WARNING: vmlinux.o(.text.unlikely+0x1429): Section mismatch in reference from the function native_set_p4d() to the variable .init.data:__pgtable_l5_enabled WARNING: vmlinux.o(.text.unlikely+0x1429): Section mismatch in reference from the function native_p4d_clear() to the variable .init.data:__pgtable_l5_enabled The function native_set_p4d() references the variable __initdata __pgtable_l5_enabled. This is often because native_set_p4d lacks a __initdata annotation or the annotation of __pgtable_l5_enabled is wrong. Marking the native_set_p4d function and its caller native_p4d_clear() avoids this problem. I did not bisect the original cause, but I assume this is related to the recent rework that turned pgtable_l5_enabled() into an inline function, which in turn caused the compiler to make different inlining decisions. Fixes: ad3fe525b950 ("x86/mm: Unify pgtable_l5_enabled usage in early boot code") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Zi Yan <[email protected]> Cc: Naoya Horiguchi <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06irqchip/ls-scfg-msi: Map MSIs in the iommuLaurentiu Tudor1-0/+3
Add the required iommu_dma_map_msi_msg() when composing the MSI message, otherwise the interrupts will not work. Signed-off-by: Laurentiu Tudor <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-06-06irqchip/stm32: Fix non-SMP build warningArnd Bergmann1-3/+1
A CONFIG_SMP=n build emits a harmless compile-time warning: drivers/irqchip/irq-stm32-exti.c:495:12: error: 'stm32_exti_h_set_affinity' defined but not used [-Werror=unused-function] The #ifdef is inconsistent here, and it's better to use an IS_ENABLED() check that lets the compiler silently drop that function. Fixes: 927abfc4461e ("irqchip/stm32: Add stm32mp1 support with hierarchy domain") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Ludovic Barre <[email protected]> Cc: Rob Herring <[email protected]> Cc: Benjamin Gaignard <[email protected]> Cc: Radoslaw Pietrzyk <[email protected]> Cc: Jason Cooper <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: [email protected] Cc: Maxime Coquelin <[email protected]> Cc: Alexandre Torgue <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06rseq/selftests: Provide Makefile, scripts, gitignoreMathieu Desnoyers5-0/+159
A run_param_test.sh script runs many variants of the parametrizable tests. Wire up the rseq Makefile, add directory entry into MAINTAINERS file. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06rseq/selftests: Provide parametrized testsMathieu Desnoyers1-0/+1260
"param_test" is a parametrizable restartable sequences test. See the "--help" output for usage. "param_test_benchmark" is the same as "param_test", but it removes testing book-keeping code to allow accurate benchmarks. "param_test_compare_twice" is the same as "param_test", but it performs each comparison within rseq critical section twice, thus validating invariants. If any of the second comparisons fails, an error message is printed and the test aborts. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06rseq/selftests: Provide basic percpu ops testMathieu Desnoyers1-0/+312
"basic_percpu_ops_test" is a slightly more "realistic" variant, implementing a few simple per-cpu operations and testing their correctness. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06rseq/selftests: Provide basic testMathieu Desnoyers1-0/+56
"basic_test" only asserts that RSEQ works moderately correctly. E.g. that the CPUID pointer works. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06rseq/selftests: Provide rseq libraryMathieu Desnoyers6-0/+2847
This rseq helper library provides a user-space API to the rseq() system call. The rseq fast-path exposes the instruction pointer addresses where the rseq assembly blocks begin and end, as well as the associated abort instruction pointer, in the __rseq_table section. This section allows debuggers may know where to place breakpoints when single-stepping through assembly blocks which may be aborted at any point by the kernel. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06selftests/lib.mk: Introduce OVERRIDE_TARGETSMathieu Desnoyers1-0/+4
Introduce OVERRIDE_TARGETS to allow tests to express dependencies on header files and .so, which require to override the selftests lib.mk targets. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Shuah Khan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06powerpc: Wire up restartable sequences system callBoqun Feng3-1/+3
Wire up the rseq system call on powerpc. This provides an ABI improving the speed of a user-space getcpu operation on powerpc by skipping the getcpu system call on the fast path, as well as improving the speed of user-space operations on per-cpu data compared to using load-reservation/store-conditional atomics. Signed-off-by: Boqun Feng <[email protected]> Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06powerpc: Add syscall detection for restartable sequencesBoqun Feng2-0/+15
Syscalls are not allowed inside restartable sequences, so add a call to rseq_syscall() at the very beginning of system call exiting path for CONFIG_DEBUG_RSEQ=y kernel. This could help us to detect whether there is a syscall issued inside restartable sequences. Signed-off-by: Boqun Feng <[email protected]> Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06powerpc: Add support for restartable sequencesBoqun Feng2-0/+4
Call the rseq_handle_notify_resume() function on return to userspace if TIF_NOTIFY_RESUME thread flag is set. Perform fixup on the pre-signal when a signal is delivered on top of a restartable sequence critical section. Signed-off-by: Boqun Feng <[email protected]> Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86: Wire up restartable sequence system callMathieu Desnoyers2-0/+2
Wire up the rseq system call on x86 32/64. This provides an ABI improving the speed of a user-space getcpu operation on x86 by removing the need to perform a function call, "lsl" instruction, or system call on the fast path, as well as improving the speed of user-space operations on per-cpu data. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06x86: Add support for restartable sequencesMathieu Desnoyers3-0/+10
Call the rseq_handle_notify_resume() function on return to userspace if TIF_NOTIFY_RESUME thread flag is set. Perform fixup on the pre-signal frame when a signal is delivered on top of a restartable sequence critical section. Check that system calls are not invoked from within rseq critical sections by invoking rseq_signal() from syscall_return_slowpath(). With CONFIG_DEBUG_RSEQ, such behavior results in termination of the process with SIGSEGV. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06arm: Wire up restartable sequences system callMathieu Desnoyers1-0/+1
Wire up the rseq system call on 32-bit ARM. This provides an ABI improving the speed of a user-space getcpu operation on ARM by skipping the getcpu system call on the fast path, as well as improving the speed of user-space operations on per-cpu data compared to using load-linked/store-conditional. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06arm: Add syscall detection for restartable sequencesMathieu Desnoyers2-6/+26
Syscalls are not allowed inside restartable sequences, so add a call to rseq_syscall() at the very beginning of system call exiting path for CONFIG_DEBUG_RSEQ=y kernel. This could help us to detect whether there is a syscall issued inside restartable sequences. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06arm: Add restartable sequences supportMathieu Desnoyers2-0/+8
Call the rseq_handle_notify_resume() function on return to userspace if TIF_NOTIFY_RESUME thread flag is set. Perform fixup on the pre-signal frame when a signal is delivered on top of a restartable sequence critical section. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06rseq: Introduce restartable sequences system callMathieu Desnoyers13-1/+734
Expose a new system call allowing each thread to register one userspace memory area to be used as an ABI between kernel and user-space for two purposes: user-space restartable sequences and quick access to read the current CPU number value from user-space. * Restartable sequences (per-cpu atomics) Restartables sequences allow user-space to perform update operations on per-cpu data without requiring heavy-weight atomic operations. The restartable critical sections (percpu atomics) work has been started by Paul Turner and Andrew Hunter. It lets the kernel handle restart of critical sections. [1] [2] The re-implementation proposed here brings a few simplifications to the ABI which facilitates porting to other architectures and speeds up the user-space fast path. Here are benchmarks of various rseq use-cases. Test hardware: arm32: ARMv7 Processor rev 4 (v7l) "Cubietruck", 2-core x86-64: Intel E5-2630 [email protected], 16-core, hyperthreading The following benchmarks were all performed on a single thread. * Per-CPU statistic counter increment getcpu+atomic (ns/op) rseq (ns/op) speedup arm32: 344.0 31.4 11.0 x86-64: 15.3 2.0 7.7 * LTTng-UST: write event 32-bit header, 32-bit payload into tracer per-cpu buffer getcpu+atomic (ns/op) rseq (ns/op) speedup arm32: 2502.0 2250.0 1.1 x86-64: 117.4 98.0 1.2 * liburcu percpu: lock-unlock pair, dereference, read/compare word getcpu+atomic (ns/op) rseq (ns/op) speedup arm32: 751.0 128.5 5.8 x86-64: 53.4 28.6 1.9 * jemalloc memory allocator adapted to use rseq Using rseq with per-cpu memory pools in jemalloc at Facebook (based on rseq 2016 implementation): The production workload response-time has 1-2% gain avg. latency, and the P99 overall latency drops by 2-3%. * Reading the current CPU number Speeding up reading the current CPU number on which the caller thread is running is done by keeping the current CPU number up do date within the cpu_id field of the memory area registered by the thread. This is done by making scheduler preemption set the TIF_NOTIFY_RESUME flag on the current thread. Upon return to user-space, a notify-resume handler updates the current CPU value within the registered user-space memory area. User-space can then read the current CPU number directly from memory. Keeping the current cpu id in a memory area shared between kernel and user-space is an improvement over current mechanisms available to read the current CPU number, which has the following benefits over alternative approaches: - 35x speedup on ARM vs system call through glibc - 20x speedup on x86 compared to calling glibc, which calls vdso executing a "lsl" instruction, - 14x speedup on x86 compared to inlined "lsl" instruction, - Unlike vdso approaches, this cpu_id value can be read from an inline assembly, which makes it a useful building block for restartable sequences. - The approach of reading the cpu id through memory mapping shared between kernel and user-space is portable (e.g. ARM), which is not the case for the lsl-based x86 vdso. On x86, yet another possible approach would be to use the gs segment selector to point to user-space per-cpu data. This approach performs similarly to the cpu id cache, but it has two disadvantages: it is not portable, and it is incompatible with existing applications already using the gs segment selector for other purposes. Benchmarking various approaches for reading the current CPU number: ARMv7 Processor rev 4 (v7l) Machine model: Cubietruck - Baseline (empty loop): 8.4 ns - Read CPU from rseq cpu_id: 16.7 ns - Read CPU from rseq cpu_id (lazy register): 19.8 ns - glibc 2.19-0ubuntu6.6 getcpu: 301.8 ns - getcpu system call: 234.9 ns x86-64 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz: - Baseline (empty loop): 0.8 ns - Read CPU from rseq cpu_id: 0.8 ns - Read CPU from rseq cpu_id (lazy register): 0.8 ns - Read using gs segment selector: 0.8 ns - "lsl" inline assembly: 13.0 ns - glibc 2.19-0ubuntu6 getcpu: 16.6 ns - getcpu system call: 53.9 ns - Speed (benchmark taken on v8 of patchset) Running 10 runs of hackbench -l 100000 seems to indicate, contrary to expectations, that enabling CONFIG_RSEQ slightly accelerates the scheduler: Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1 kernel parameter), with a Linux v4.6 defconfig+localyesconfig, restartable sequences series applied. * CONFIG_RSEQ=n avg.: 41.37 s std.dev.: 0.36 s * CONFIG_RSEQ=y avg.: 40.46 s std.dev.: 0.33 s - Size On x86-64, between CONFIG_RSEQ=n/y, the text size increase of vmlinux is 567 bytes, and the data size increase of vmlinux is 5696 bytes. [1] https://lwn.net/Articles/650333/ [2] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com Link: http://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-06-06uapi/headers: Provide types_32_64.hMathieu Desnoyers1-0/+50
Provide helper macros for fields which represent pointers in kernel-userspace ABI. This facilitates handling of 32-bit user-space by 64-bit kernels by defining those fields as 32-bit 0-padding and 32-bit integer on 32-bit architectures, which allows the kernel to treat those as 64-bit integers. The order of padding and 32-bit integer depends on the endianness. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Watson <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andi Kleen <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Paul Turner <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ben Maurer <[email protected]> Cc: [email protected] Cc: Andy Lutomirski <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-06-06powerpc/64s/radix: Fix missing ptesync in flush_cache_vmapNicholas Piggin1-3/+2
There is a typo in f1cb8f9beb ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags") config ifdef, which results in the necessary ptesync not being issued after vmalloc. This causes random kernel faults in module load, bpf load, anywhere that vmalloc mappings are used. After correcting the code, this survives a guest kernel booting hundreds of times where previously there would be a crash every few boots (I haven't noticed the crash on host, perhaps due to different TLB and page table walking behaviour in hardware). A memory clobber is also added to the flush, just to be sure it won't be reordered with the pte set or the subsequent mapping access. Fixes: f1cb8f9beb ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-05Merge branch 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds6-64/+119
Pull workqueue updates from Tejun Heo: - make kworkers report the workqueue it is executing or has executed most recently in /proc/PID/comm (so they show up in ps/top) - CONFIG_SMP shuffle to move stuff which isn't necessary for UP builds inside CONFIG_SMP. * 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: move function definitions within CONFIG_SMP block workqueue: Make sure struct worker is accessible for wq_worker_comm() workqueue: Show the latest workqueue name in /proc/PID/{comm,stat,status} proc: Consolidate task->comm formatting into proc_task_name() workqueue: Set worker->desc to workqueue name by default workqueue: Make worker_attach/detach_pool() update worker->pool workqueue: Replace pool->attach_mutex with global wq_pool_attach_mutex
2018-06-05Merge branch 'for-4.18' of ↵Linus Torvalds8-417/+554
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - For cpustat, cgroup has a percpu hierarchical stat mechanism which propagates up the hierarchy lazily. This contains commits to factor out and generalize the mechanism so that it can be used for other cgroup stats too. The original intention was to update memcg stats to use it but memcg went for a different approach, so still the only user is cpustat. The factoring out and generalization still make sense and it's likely that this can be used for other purposes in the future. - cgroup uses kernfs_notify() (which uses fsnotify()) to inform user space of certain events. A rate limiting mechanism is added. - Other misc changes. * 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: css_set_lock should nest inside tasklist_lock rdmacg: Convert to use match_string() helper cgroup: Make cgroup_rstat_updated() ready for root cgroup usage cgroup: Add memory barriers to plug cgroup_rstat_updated() race window cgroup: Add cgroup_subsys->css_rstat_flush() cgroup: Replace cgroup_rstat_mutex with a spinlock cgroup: Factor out and expose cgroup_rstat_*() interface functions cgroup: Reorganize kernel/cgroup/rstat.c cgroup: Distinguish base resource stat implementation from rstat cgroup: Rename stat to rstat cgroup: Rename kernel/cgroup/stat.c to kernel/cgroup/rstat.c cgroup: Limit event generation frequency cgroup: Explicitly remove core interface files
2018-06-05Merge branch 'for-4.18' of ↵Linus Torvalds13-162/+177
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: - libata has always been limiting the maximum queue depth to 31, with one entry set aside mostly for historical reasons. This didn't use to make much difference but Jens found out that modern hard drives can actually perform measurably better with the extra one queue depth. Jens updated libata core so that it can make use of full 32 queue depth - Damien updated command retry logic in error handling so that it doesn't unnecessarily retry when upper layer (SCSI) is gonna handle them - A couple misc changes * 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: sata_fsl: use the right type for tag bitshift ahci: enable full queue depth of 32 libata: don't clamp queue depth to ATA_MAX_QUEUE - 1 libata: add extra internal command sata_nv: set host can_queue count appropriately libata: remove assumption that ATA_MAX_QUEUE - 1 is the max libata: use ata_tag_internal() consistently libata: bump ->qc_active to a 64-bit type libata: convert core and drivers to ->hw_tag usage libata: introduce notion of separate hardware tags libata: Fix command retry decision libata: Honor RQF_QUIET flag libata: Make ata_dev_set_mode() less verbose libata: Fix ata_err_string() libata: Fix comment typo in ata_eh_analyze_tf() sata_nv: don't use block layer bounce buffer ata: hpt37x: Convert to use match_string() helper
2018-06-05Merge branch 'for-4.17-fixes' of ↵Linus Torvalds2-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "These two are fixes which missed v4.17. One is to remove an incorrect power management blacklist entry and the other to fix a cdb buffer overrun which has been there for a very long time" * 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk libata: zpodd: small read overflow in eject_tray()
2018-06-05Merge tag 'tty-4.18-rc1' of ↵Linus Torvalds31-297/+694
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big tty/serial driver update for 4.18-rc1. There's nothing major here, just lots of serial driver updates. Full details are in the shortlog, nothing anything specific to call out here. All have been in linux-next for a while with no reported issues" * tag 'tty-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (55 commits) vt: Perform safe console erase only once serial: imx: disable UCR4_OREN on shutdown serial: imx: drop CTS/RTS handling from shutdown tty: fix typo in ASYNCB_FOURPORT comment serial: samsung: check DMA engine capabilities before using DMA mode tty: Fix data race in tty_insert_flip_string_fixed_flag tty: serial: msm_geni_serial: Fix TX infinite loop serial: 8250_dw: Fix runtime PM handling serial: 8250: omap: Fix idling of clocks for unused uarts tty: serial: drop ATH79 specific SoC symbols serial: 8250: Add missing rxtrig_bytes on Altera 16550 UART serial/aspeed-vuart: fix a couple mod_timer() calls serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version serial: 8250_of: Add IO space support tty/serial: atmel: use port->name as name in request_irq() serial: imx: dma_unmap_sg buffers on shutdown serial: imx: cleanup imx_uart_disable_dma() tty: serial: qcom_geni_serial: Add early console support tty: serial: qcom_geni_serial: Return IRQ_NONE for spurious interrupts tty: serial: qcom_geni_serial: Use iowrite32_rep to write to FIFO ...
2018-06-05Merge tag 'driver-core-4.18-rc1' of ↵Linus Torvalds23-162/+372
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the driver core patchset for 4.18-rc1. The large chunk of these are firmware core documentation and api updates. Nothing major there, just better descriptions for others to be able to understand the firmware code better. There's also a user for a new firmware api call. Other than that, there are some minor updates for debugfs, kernfs, and the driver core itself. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (23 commits) driver core: hold dev's parent lock when needed driver-core: return EINVAL error instead of BUG_ON() driver core: add __printf verification to device_create_groups_vargs mm: memory_hotplug: use put_device() if device_register fail base: core: fix typo 'can by' to 'can be' debugfs: inode: debugfs_create_dir uses mode permission from parent debugfs: Re-use kstrtobool_from_user() Documentation: clarify firmware_class provenance and why we can't rename the module Documentation: remove stale firmware API reference Documentation: fix few typos and clarifications for the firmware loader ath10k: re-enable the firmware fallback mechanism for testmode ath10k: use firmware_request_nowarn() to load firmware firmware: add firmware_request_nowarn() - load firmware without warnings firmware_loader: make firmware_fallback_sysfs() print more useful firmware_loader: move kconfig FW_LOADER entries to its own file firmware_loader: replace ---help--- with help firmware_loader: enhance Kconfig documentation over FW_LOADER firmware_loader: document firmware_sysfs_fallback() firmware: rename fw_sysfs_fallback to firmware_fallback_sysfs() firmware: use () to terminate kernel-doc function names ...
2018-06-05Merge tag 'char-misc-4.18-rc1' of ↵Linus Torvalds132-1234/+8981
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the "big" char and misc driver patches for 4.18-rc1. It's not a lot of stuff here, but there are some highlights: - coreboot driver updates - soundwire driver updates - android binder updates - fpga big sync, mostly documentation - lots of minor driver updates All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (81 commits) vmw_balloon: fixing double free when batching mode is off MAINTAINERS: Add driver-api/fpga path fpga: clarify that unregister functions also free documentation: fpga: move fpga-region.txt to driver-api documentation: fpga: add bridge document to driver-api documentation: fpga: move fpga-mgr.txt to driver-api Documentation: fpga: move fpga overview to driver-api fpga: region: kernel-doc fixes fpga: bridge: kernel-doc fixes fpga: mgr: kernel-doc fixes fpga: use SPDX fpga: region: change api, add fpga_region_create/free fpga: bridge: change api, don't use drvdata fpga: manager: change api, don't use drvdata fpga: region: don't use drvdata in common fpga code Drivers: hv: vmbus: Removed an unnecessary cast from void * ver_linux: Drop redundant calls to system() to test if file is readable ver_linux: Move stderr redirection from function parameter to function body misc: IBM Virtual Management Channel Driver (VMC) rpmsg: Correct support for MODULE_DEVICE_TABLE() ...
2018-06-05Merge tag 'usb-4.18-rc1' of ↵Linus Torvalds211-3486/+11404
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB and PHY updates from Greg KH: "Here is the big USB pull request for 4.18-rc1. Lots of stuff here, the highlights are: - phy driver updates and new additions - usual set of xhci driver updates - normal set of musb updates - gadget driver updates and new controllers - typec work, it's getting closer to getting fully out of the staging portion of the tree. - lots of minor cleanups and bugfixes. All of these have been in linux-next for a while with no reported issues" * tag 'usb-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (263 commits) Revert "xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue" xhci: Add quirk to zero 64bit registers on Renesas PCIe controllers xhci: Allow more than 32 quirks usb: xhci: force all memory allocations to node selftests: add test for USB over IP driver USB: typec: fsusb302: no need to check return value of debugfs_create_dir() USB: gadget: udc: s3c2410_udc: no need to check return value of debugfs_create functions USB: gadget: udc: renesas_usb3: no need to check return value of debugfs_create functions USB: gadget: udc: pxa27x_udc: no need to check return value of debugfs_create functions USB: gadget: udc: gr_udc: no need to check return value of debugfs_create functions USB: gadget: udc: bcm63xx_udc: no need to check return value of debugfs_create functions USB: udc: atmel_usba_udc: no need to check return value of debugfs_create functions USB: dwc3: no need to check return value of debugfs_create functions USB: dwc2: no need to check return value of debugfs_create functions USB: core: no need to check return value of debugfs_create functions USB: chipidea: no need to check return value of debugfs_create functions USB: ehci-hcd: no need to check return value of debugfs_create functions USB: fhci-hcd: no need to check return value of debugfs_create functions USB: fotg210-hcd: no need to check return value of debugfs_create functions USB: imx21-hcd: no need to check return value of debugfs_create functions ...
2018-06-05Merge tag 'mmc-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds66-431/+1382
Pull MMC updates from Ulf Hansson: "MMC core: - Decrease polling rate for erase/trim/discard - Allow non-sleeping GPIOs for card detect - Improve mmc block removal path - Enable support for mmc_sw_reset() for SDIO cards - Add mmc_sw_reset() to allow users to do a soft reset of the card - Allow power delay to be tunable via DT - Allow card detect debounce delay to be tunable via DT - Enable new quirk to limit clock rate for Marvell 8887 chip - Don't show eMMC RPMB and BOOT areas in /proc/partitions - Add capability to avoid 3.3V signaling for fragile HWs MMC host: - Improve/fixup support for handle highmem pages - Remove depends on HAS_DMA in case of platform dependency - mvsdio: Enable support for erase/trim/discard - rtsx_usb: Enable support for erase/trim/discard - renesas_sdhi: Fix WP logic regressions - renesas_sdhi: Add r8a77965 support - renesas_sdhi: Add R8A77980 to whitelist - meson: Add optional support for device reset - meson: Add support for the Meson-AXG platform - dw_mmc: Add new driver for BlueField DW variant - mediatek: Add support for 64G DRAM DMA - sunxi: Deploy runtime PM support - jz4740: Add support for JZ4780 - jz4740: Enable support for DT based platforms - sdhci: Various improvement to timeout handling - sdhci: Disable support for HS200/HS400/UHS when no 1.8V support - sdhci-omap: Add support for controller in k2g SoC - sdhci-omap: Add workarounds for a couple of Erratas - sdhci-omap: Enable support for generic sdhci DT properties - sdhci-cadence: Re-send tune request to deal with errata - sdhci-pci: Fix 3.3V voltage switch for some BYT-based Intel controllers - sdhci-pci: Avoid 3.3V signaling on some NI 904x - sdhci-esdhc-imx: Use watermark levels for PIO access - sdhci-msm: Improve card detection handling - sdhci-msm: Add support voltage pad switching" * tag 'mmc-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (104 commits) mmc: renesas_sdhi: really fix WP logic regressions mmc: mvsdio: Enable MMC_CAP_ERASE mmc: mvsdio: Respect card busy time out from mmc core mmc: sdhci-msm: Remove NO_CARD_NO_RESET quirk mmc: sunxi: Use ifdef rather than __maybe_unused mmc: mxmmc: Use ifdef rather than __maybe_unused mmc: mxmmc: include linux/highmem.h mmc: sunxi: mark PM functions as __maybe_unused mmc: Throttle calls to MMC_SEND_STATUS during mmc_do_erase() mmc: au1xmmc: handle highmem pages mmc: Allow non-sleeping GPIO cd mmc: sdhci-*: Don't emit error msg if sdhci_add_host() fails mmc: sd: Define name for default speed dtr mmc: core: Move calls to ->prepare_hs400_tuning() closer to mmc code mmc: sdhci-xenon: use match_string() helper mmc: wbsd: handle highmem pages mmc: ushc: handle highmem pages mmc: mxcmmc: handle highmem pages mmc: atmel-mci: use sg_copy_{from,to}_buffer mmc: android-goldfish: use sg_copy_{from,to}_buffer ...
2018-06-05Merge tag 'leds_for_4.18-rc1' of ↵Linus Torvalds11-18/+1237
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED updates from Jacek Anaszewski: "This was quite a fruitful cycle, taking into account usual traffic on linux-leds list, as we managed to merge three new LED class drivers. New LED class drivers with related DT bindings: - add LED driver for CR0014114 board - add Spreadtrum SC27xx breathing light controller driver - introduce the lm3601x LED driver LED class fix: - ensure workqueue is initialized before setting brightness Improvements and fixes to existing LED class drivers: - fix return value check in sc27xx_led_probe() - use sysfs_match_string() helper in wm831x_status_src_store()" * tag 'leds_for_4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds: class: ensure workqueue is initialized before setting brightness leds: lm3601x: Introduce the lm3601x LED driver dt: bindings: lm3601x: Introduce the lm3601x driver leds: sc27xx: Fix return value check in sc27xx_led_probe() leds: Add Spreadtrum SC27xx breathing light controller driver dt-bindings: leds: Add SC27xx breathing light controller documentation leds: wm831x-status: Use sysfs_match_string() helper leds: add LED driver for CR0014114 board dt-bindings: Add vendor prefix and docs for CR0014114
2018-06-05Merge branch 'for-linus' of ↵Linus Torvalds26-389/+1325
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - a new driver to ChipOne icn8505 based touchscreens - on certain systems with Elan touch controllers they will be switched away form PS/2 emulation and over to native SMbus mode - assorted driver fixups and improvements * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (24 commits) Input: elan_i2c - add ELAN0612 (Lenovo v330 14IKB) ACPI ID Input: goodix - add new ACPI id for GPD Win 2 touch screen Input: xpad - add GPD Win 2 Controller USB IDs Input: ti_am335x_tsc - prevent system suspend when TSC is in use Input: ti_am335x_tsc - ack pending IRQs at probe and before suspend Input: cros_ec_keyb - mark cros_ec_keyb driver as wake enabled device. Input: mk712 - update documentation web link Input: atmel_mxt_ts - fix reset-gpio for level based irqs Input: atmel_mxt_ts - require device properties present when probing Input: psmouse-smbus - allow to control psmouse_deactivate Input: elantech - detect new ICs and setup Host Notify for them Input: elantech - add support for SMBus devices Input: elantech - query the resolution in query_info Input: elantech - split device info into a separate structure Input: elan_i2c - add trackstick report Input: usbtouchscreen - add sysfs attribute for 3M MTouch firmware rev Input: ati_remote2 - fix typo 'can by' to 'can be' Input: replace hard coded string with __func__ in pr_err() Input: add support for ChipOne icn8505 based touchscreens Input: gamecon - avoid using __set_bit() for capabilities ...
2018-06-05Merge branch 'linus' of ↵Linus Torvalds141-15349/+20656
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Decryption test vectors are now automatically generated from encryption test vectors. Algorithms: - Fix unaligned access issues in crc32/crc32c. - Add zstd compression algorithm. - Add AEGIS. - Add MORUS. Drivers: - Add accelerated AEGIS/MORUS on x86. - Add accelerated SM4 on arm64. - Removed x86 assembly salsa implementation as it is slower than C. - Add authenc(hmac(sha*), cbc(aes)) support in inside-secure. - Add ctr(aes) support in crypto4xx. - Add hardware key support in ccree. - Add support for new Centaur CPU in via-rng" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits) crypto: chtls - free beyond end rspq_skb_cache crypto: chtls - kbuild warnings crypto: chtls - dereference null variable crypto: chtls - wait for memory sendmsg, sendpage crypto: chtls - key len correction crypto: salsa20 - Revert "crypto: salsa20 - export generic helpers" crypto: x86/salsa20 - remove x86 salsa20 implementations crypto: ccp - Add GET_ID SEV command crypto: ccp - Add DOWNLOAD_FIRMWARE SEV command crypto: qat - Add MODULE_FIRMWARE for all qat drivers crypto: ccree - silence debug prints crypto: ccree - better clock handling crypto: ccree - correct host regs offset crypto: chelsio - Remove separate buffer used for DMA map B0 block in CCM crypt: chelsio - Send IV as Immediate for cipher algo crypto: chelsio - Return -ENOSPC for transient busy indication. crypto: caam/qi - fix warning in init_cgr() crypto: caam - fix rfc4543 descriptors crypto: caam - fix MC firmware detection crypto: clarify licensing of OpenSSL asm code ...
2018-06-05dt-bindings: submitting-patches: add guidance on patch content and subjectRob Herring1-1/+8
Clarify that binding patches should also include include/dt-bindings/* as part of them. The binding doc defines the ABI and the includes are part of that. Add some details on the preferred subject prefix and contents. Signed-off-by: Rob Herring <[email protected]>
2018-06-05Merge tag 'fscrypt_for_linus' of ↵Linus Torvalds13-213/+248
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt Pull fscrypt updates from Ted Ts'o: "Add bunch of cleanups, and add support for the Speck128/256 algorithms. Yes, Speck is contrversial, but the intention is to use them only for the lowest end Android devices, where the alternative *really* is no encryption at all for data stored at rest" * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: fscrypt: log the crypto algorithm implementations fscrypt: add Speck128/256 support fscrypt: only derive the needed portion of the key fscrypt: separate key lookup from key derivation fscrypt: use a common logging function fscrypt: remove internal key size constants fscrypt: remove unnecessary check for non-logon key type fscrypt: make fscrypt_operations.max_namelen an integer fscrypt: drop empty name check from fname_decrypt() fscrypt: drop max_namelen check from fname_decrypt() fscrypt: don't special-case EOPNOTSUPP from fscrypt_get_encryption_info() fscrypt: don't clear flags on crypto transform fscrypt: remove stale comment from fscrypt_d_revalidate() fscrypt: remove error messages for skcipher_request_alloc() failure fscrypt: remove unnecessary NULL check when allocating skcipher fscrypt: clean up after fscrypt_prepare_lookup() conversions fs, fscrypt: only define ->s_cop when FS_ENCRYPTION is enabled fscrypt: use unbound workqueue for decryption