Age | Commit message (Collapse) | Author | Files | Lines |
|
To address the EBUSY fail of interrupt affinity settings in case that the
previous setting has not been cleaned up yet, use the new apic_ack_irq()
function instead of the special uv_ack_apic() implementation which is
merily a wrapper around ack_APIC_irq().
Preparatory change for the real fix
Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
Reported-by: Song Liu <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tariq Toukan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
To address the EBUSY fail of interrupt affinity settings in case that the
previous setting has not been cleaned up yet, use the new apic_ack_irq()
function instead of directly invoking ack_APIC_irq().
Preparatory change for the real fix
Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tariq Toukan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
To address the EBUSY fail of interrupt affinity settings in case that the
previous setting has not been cleaned up yet, use the new apic_ack_irq()
function instead of the special ir_ack_apic_edge() implementation which is
merily a wrapper around ack_APIC_irq().
Preparatory change for the real fix
Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tariq Toukan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
apic_ack_edge() is explicitely for handling interrupt affinity cleanup when
interrupt remapping is not available or disable.
Remapped interrupts and also some of the platform specific special
interrupts, e.g. UV, invoke ack_APIC_irq() directly.
To address the issue of failing an affinity update with -EBUSY the delayed
affinity mechanism can be reused, but ack_APIC_irq() does not handle
that. Adding this to ack_APIC_irq() is not possible, because that function
is also used for exceptions and directly handled interrupts like IPIs.
Create a new function, which just contains the conditional invocation of
irq_move_irq() and the final ack_APIC_irq().
Reuse the new function in apic_ack_edge().
Preparatory change for the real fix.
Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tariq Toukan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The upcoming fix for the -EBUSY return from affinity settings requires to
use the irq_move_irq() functionality even on irq remapped interrupts. To
avoid the out of line call, move the check for the pending bit into an
inline helper.
Preparatory change for the real fix. No functional change.
Fixes: dccfe3147b42 ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tariq Toukan <[email protected]>
Cc: Dou Liyang <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The generic pending interrupt mechanism moves interrupts from the interrupt
handler on the original target CPU to the new destination CPU. This is
required for x86 and ia64 due to the way the interrupt delivery and
acknowledge works if the interrupts are not remapped.
However that update can fail for various reasons. Some of them are valid
reasons to discard the pending update, but the case, when the previous move
has not been fully cleaned up is not a legit reason to fail.
Check the return value of irq_do_set_affinity() for -EBUSY, which indicates
a pending cleanup, and rearm the pending move in the irq dexcriptor so it's
tried again when the next interrupt arrives.
Fixes: 996c591227d9 ("x86/irq: Plug vector cleanup race")
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Tariq Toukan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Several people observed the WARN_ON() in irq_matrix_free() which triggers
when the caller tries to free an vector which is not in the allocation
range. Song provided the trace information which allowed to decode the root
cause.
The rework of the vector allocation mechanism failed to preserve a sanity
check, which prevents setting a new target vector/CPU when the previous
affinity change has not fully completed.
As a result a half finished affinity change can be overwritten, which can
cause the leak of a irq descriptor pointer on the previous target CPU and
double enqueue of the hlist head into the cleanup lists of two or more
CPUs. After one CPU cleaned up its vector the next CPU will invoke the
cleanup handler with vector 0, which triggers the out of range warning in
the matrix allocator.
Prevent this by checking the apic_data of the interrupt whether the
move_in_progress flag is false and the hlist node is not hashed. Return
-EBUSY if not.
This prevents the damage and restores the behaviour before the vector
allocation rework, but due to other changes in that area it also widens the
chance that user space can observe -EBUSY. In theory this should be fine,
but actually not all user space tools handle -EBUSY correctly. Addressing
that is not part of this fix, but will be addressed in follow up patches.
Fixes: 69cde0004a4b ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Dmitry Safonov <[email protected]>
Reported-by: Tariq Toukan <[email protected]>
Reported-by: Song Liu <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Song Liu <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: Mike Travis <[email protected]>
Cc: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
In commit 48398b6e7065 ("enic: set UDP rss flag") driver needed to set a
single bit to enable UDP rss. This is changed to two bit. One for UDP
IPv4 and other bit for UDP IPv6. The hardware which supports this is not
released yet. When released, driver should set 2 bit to enable UDP rss for
both IPv4 and IPv6.
Also add spinlock around vnic_dev_capable_rss_hash_type().
Signed-off-by: Govindarajulu Varadarajan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The kbuild test robot reported the following issue:
kernel/time/posix-stubs.o: warning: objtool: sys_ni_posix_timers.cold.1()+0x0: unreachable instruction
This file creates symbol aliases for the sys_ni_posix_timers() function.
So there are multiple ELF function symbols for the same function:
23: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 __x64_sys_timer_create
24: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 sys_ni_posix_timers
25: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 __ia32_sys_timer_create
26: 0000000000000150 26 FUNC GLOBAL DEFAULT 1 __x64_sys_timer_gettime
Here's the corresponding cold subfunction:
11: 0000000000000000 45 FUNC LOCAL DEFAULT 6 sys_ni_posix_timers.cold.1
When analyzing overlapping functions, objtool only looks at the first
one in the symbol list. The rest of the functions are basically ignored
because they point to instructions which have already been analyzed.
So in this case it analyzes the __x64_sys_timer_create() function, but
then it fails to recognize that its cold subfunction is
sys_ni_posix_timers.cold.1(), because the names are different.
Make the subfunction detection a little smarter by associating each
subfunction with the first function which jumps to it, since that's the
one which will be analyzed.
Unfortunately we still have to leave the original subfunction detection
code in place, thanks to GCC switch tables. (See the comment for more
details.)
Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions")
Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lkml.kernel.org/r/d3ba52662cbc8e3a64a3b64d44b4efc5674fd9ab.1527855808.git.jpoimboe@redhat.com
|
|
Both AMD and Intel can have SPEC_CTRL_MSR for SSBD.
However AMD also has two more other ways of doing it - which
are !SPEC_CTRL MSR ways.
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: KarimAllah Ahmed <[email protected]>
Cc: [email protected]
Cc: "H. Peter Anvin" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The AMD document outlining the SSBD handling
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
mentions that if CPUID 8000_0008.EBX[24] is set we should be using
the SPEC_CTRL MSR (0x48) over the VIRT SPEC_CTRL MSR (0xC001_011f)
for speculative store bypass disable.
This in effect means we should clear the X86_FEATURE_VIRT_SSBD
flag so that we would prefer the SPEC_CTRL MSR.
See the document titled:
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199889
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: [email protected]
Cc: KarimAllah Ahmed <[email protected]>
Cc: [email protected]
Cc: Joerg Roedel <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The AMD document outlining the SSBD handling
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
mentions that the CPUID 8000_0008.EBX[26] will mean that the
speculative store bypass disable is no longer needed.
A copy of this document is available at:
https://bugzilla.kernel.org/show_bug.cgi?id=199889
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Janakarajan Natarajan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
The vector_alloc tracepont reversed the reserved and ret aggs, that made
the trace print wrong. Exchange them.
Fixes: 8d1e3dca7de6 ("x86/vector: Add tracepoints for vector management")
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
The idt_setup_apic_and_irq_gates() sets the gates from
FIRST_EXTERNAL_VECTOR up to FIRST_SYSTEM_VECTOR first. then secondly, from
FIRST_SYSTEM_VECTOR to NR_VECTORS, it takes both APIC=y and APIC=n into
account.
But for APIC=n, the FIRST_SYSTEM_VECTOR is equal to NR_VECTORS, all
vectors has been set at the first step.
Simplify the second step, make it just work for APIC=y.
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Remove extra parentheses to fix the extraneous parentheses clang
warning.
Suggested-by: Lukas Bulwahn <[email protected]>
Signed-off-by: Varsha Rao <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Nicholas Mc Guire <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
AMD SME claims one bit from physical address to indicate whether the
page is encrypted or not. To achieve that we clear out the bit from
__PHYSICAL_MASK.
The capability to adjust __PHYSICAL_MASK is required beyond AMD SME.
For instance for upcoming Intel Multi-Key Total Memory Encryption.
Factor it out into a separate feature with own Kconfig handle.
It also helps with overhead of AMD SME. It saves more than 3k in .text
on defconfig + AMD_MEM_ENCRYPT:
add/remove: 3/2 grow/shrink: 5/110 up/down: 189/-3753 (-3564)
We would need to return to this once we have infrastructure to patch
constants in code. That's good candidate for it.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Tom Lendacky <[email protected]>
Cc: [email protected]
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
When CONFIG_OPTIMIZE_INLINING is enabled, the function native_set_p4d()
may not be fully inlined into the caller, resulting in a false-positive
warning about an access to the __pgtable_l5_enabled variable from a
non-__init function, despite the original caller being an __init function:
WARNING: vmlinux.o(.text.unlikely+0x1429): Section mismatch in reference from the function native_set_p4d() to the variable .init.data:__pgtable_l5_enabled
WARNING: vmlinux.o(.text.unlikely+0x1429): Section mismatch in reference from the function native_p4d_clear() to the variable .init.data:__pgtable_l5_enabled
The function native_set_p4d() references the variable __initdata
__pgtable_l5_enabled. This is often because native_set_p4d lacks a
__initdata annotation or the annotation of __pgtable_l5_enabled is wrong.
Marking the native_set_p4d function and its caller native_p4d_clear()
avoids this problem.
I did not bisect the original cause, but I assume this is related to the
recent rework that turned pgtable_l5_enabled() into an inline function,
which in turn caused the compiler to make different inlining decisions.
Fixes: ad3fe525b950 ("x86/mm: Unify pgtable_l5_enabled usage in early boot code")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Add the required iommu_dma_map_msi_msg() when composing the MSI message,
otherwise the interrupts will not work.
Signed-off-by: Laurentiu Tudor <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
A CONFIG_SMP=n build emits a harmless compile-time warning:
drivers/irqchip/irq-stm32-exti.c:495:12: error: 'stm32_exti_h_set_affinity' defined but not used [-Werror=unused-function]
The #ifdef is inconsistent here, and it's better to use an IS_ENABLED() check
that lets the compiler silently drop that function.
Fixes: 927abfc4461e ("irqchip/stm32: Add stm32mp1 support with hierarchy domain")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ludovic Barre <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Benjamin Gaignard <[email protected]>
Cc: Radoslaw Pietrzyk <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: [email protected]
Cc: Maxime Coquelin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
A run_param_test.sh script runs many variants of the parametrizable
tests.
Wire up the rseq Makefile, add directory entry into MAINTAINERS file.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
"param_test" is a parametrizable restartable sequences test. See
the "--help" output for usage.
"param_test_benchmark" is the same as "param_test", but it removes
testing book-keeping code to allow accurate benchmarks.
"param_test_compare_twice" is the same as "param_test", but it performs
each comparison within rseq critical section twice, thus validating
invariants. If any of the second comparisons fails, an error message
is printed and the test aborts.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
"basic_percpu_ops_test" is a slightly more "realistic" variant,
implementing a few simple per-cpu operations and testing their
correctness.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
"basic_test" only asserts that RSEQ works moderately correctly. E.g.
that the CPUID pointer works.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
This rseq helper library provides a user-space API to the rseq()
system call.
The rseq fast-path exposes the instruction pointer addresses where the
rseq assembly blocks begin and end, as well as the associated abort
instruction pointer, in the __rseq_table section. This section allows
debuggers may know where to place breakpoints when single-stepping
through assembly blocks which may be aborted at any point by the kernel.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Introduce OVERRIDE_TARGETS to allow tests to express dependencies on
header files and .so, which require to override the selftests lib.mk
targets.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Wire up the rseq system call on powerpc.
This provides an ABI improving the speed of a user-space getcpu
operation on powerpc by skipping the getcpu system call on the fast
path, as well as improving the speed of user-space operations on per-cpu
data compared to using load-reservation/store-conditional atomics.
Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Syscalls are not allowed inside restartable sequences, so add a call to
rseq_syscall() at the very beginning of system call exiting path for
CONFIG_DEBUG_RSEQ=y kernel. This could help us to detect whether there
is a syscall issued inside restartable sequences.
Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Call the rseq_handle_notify_resume() function on return to userspace if
TIF_NOTIFY_RESUME thread flag is set.
Perform fixup on the pre-signal when a signal is delivered on top of a
restartable sequence critical section.
Signed-off-by: Boqun Feng <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Wire up the rseq system call on x86 32/64.
This provides an ABI improving the speed of a user-space getcpu
operation on x86 by removing the need to perform a function call, "lsl"
instruction, or system call on the fast path, as well as improving the
speed of user-space operations on per-cpu data.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Call the rseq_handle_notify_resume() function on return to userspace if
TIF_NOTIFY_RESUME thread flag is set.
Perform fixup on the pre-signal frame when a signal is delivered on top
of a restartable sequence critical section.
Check that system calls are not invoked from within rseq critical
sections by invoking rseq_signal() from syscall_return_slowpath().
With CONFIG_DEBUG_RSEQ, such behavior results in termination of the
process with SIGSEGV.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Wire up the rseq system call on 32-bit ARM.
This provides an ABI improving the speed of a user-space getcpu
operation on ARM by skipping the getcpu system call on the fast path, as
well as improving the speed of user-space operations on per-cpu data
compared to using load-linked/store-conditional.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Syscalls are not allowed inside restartable sequences, so add a call to
rseq_syscall() at the very beginning of system call exiting path for
CONFIG_DEBUG_RSEQ=y kernel. This could help us to detect whether there
is a syscall issued inside restartable sequences.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Call the rseq_handle_notify_resume() function on return to
userspace if TIF_NOTIFY_RESUME thread flag is set.
Perform fixup on the pre-signal frame when a signal is delivered on top
of a restartable sequence critical section.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Expose a new system call allowing each thread to register one userspace
memory area to be used as an ABI between kernel and user-space for two
purposes: user-space restartable sequences and quick access to read the
current CPU number value from user-space.
* Restartable sequences (per-cpu atomics)
Restartables sequences allow user-space to perform update operations on
per-cpu data without requiring heavy-weight atomic operations.
The restartable critical sections (percpu atomics) work has been started
by Paul Turner and Andrew Hunter. It lets the kernel handle restart of
critical sections. [1] [2] The re-implementation proposed here brings a
few simplifications to the ABI which facilitates porting to other
architectures and speeds up the user-space fast path.
Here are benchmarks of various rseq use-cases.
Test hardware:
arm32: ARMv7 Processor rev 4 (v7l) "Cubietruck", 2-core
x86-64: Intel E5-2630 [email protected], 16-core, hyperthreading
The following benchmarks were all performed on a single thread.
* Per-CPU statistic counter increment
getcpu+atomic (ns/op) rseq (ns/op) speedup
arm32: 344.0 31.4 11.0
x86-64: 15.3 2.0 7.7
* LTTng-UST: write event 32-bit header, 32-bit payload into tracer
per-cpu buffer
getcpu+atomic (ns/op) rseq (ns/op) speedup
arm32: 2502.0 2250.0 1.1
x86-64: 117.4 98.0 1.2
* liburcu percpu: lock-unlock pair, dereference, read/compare word
getcpu+atomic (ns/op) rseq (ns/op) speedup
arm32: 751.0 128.5 5.8
x86-64: 53.4 28.6 1.9
* jemalloc memory allocator adapted to use rseq
Using rseq with per-cpu memory pools in jemalloc at Facebook (based on
rseq 2016 implementation):
The production workload response-time has 1-2% gain avg. latency, and
the P99 overall latency drops by 2-3%.
* Reading the current CPU number
Speeding up reading the current CPU number on which the caller thread is
running is done by keeping the current CPU number up do date within the
cpu_id field of the memory area registered by the thread. This is done
by making scheduler preemption set the TIF_NOTIFY_RESUME flag on the
current thread. Upon return to user-space, a notify-resume handler
updates the current CPU value within the registered user-space memory
area. User-space can then read the current CPU number directly from
memory.
Keeping the current cpu id in a memory area shared between kernel and
user-space is an improvement over current mechanisms available to read
the current CPU number, which has the following benefits over
alternative approaches:
- 35x speedup on ARM vs system call through glibc
- 20x speedup on x86 compared to calling glibc, which calls vdso
executing a "lsl" instruction,
- 14x speedup on x86 compared to inlined "lsl" instruction,
- Unlike vdso approaches, this cpu_id value can be read from an inline
assembly, which makes it a useful building block for restartable
sequences.
- The approach of reading the cpu id through memory mapping shared
between kernel and user-space is portable (e.g. ARM), which is not the
case for the lsl-based x86 vdso.
On x86, yet another possible approach would be to use the gs segment
selector to point to user-space per-cpu data. This approach performs
similarly to the cpu id cache, but it has two disadvantages: it is
not portable, and it is incompatible with existing applications already
using the gs segment selector for other purposes.
Benchmarking various approaches for reading the current CPU number:
ARMv7 Processor rev 4 (v7l)
Machine model: Cubietruck
- Baseline (empty loop): 8.4 ns
- Read CPU from rseq cpu_id: 16.7 ns
- Read CPU from rseq cpu_id (lazy register): 19.8 ns
- glibc 2.19-0ubuntu6.6 getcpu: 301.8 ns
- getcpu system call: 234.9 ns
x86-64 Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz:
- Baseline (empty loop): 0.8 ns
- Read CPU from rseq cpu_id: 0.8 ns
- Read CPU from rseq cpu_id (lazy register): 0.8 ns
- Read using gs segment selector: 0.8 ns
- "lsl" inline assembly: 13.0 ns
- glibc 2.19-0ubuntu6 getcpu: 16.6 ns
- getcpu system call: 53.9 ns
- Speed (benchmark taken on v8 of patchset)
Running 10 runs of hackbench -l 100000 seems to indicate, contrary to
expectations, that enabling CONFIG_RSEQ slightly accelerates the
scheduler:
Configuration: 2 sockets * 8-core Intel(R) Xeon(R) CPU E5-2630 v3 @
2.40GHz (directly on hardware, hyperthreading disabled in BIOS, energy
saving disabled in BIOS, turboboost disabled in BIOS, cpuidle.off=1
kernel parameter), with a Linux v4.6 defconfig+localyesconfig,
restartable sequences series applied.
* CONFIG_RSEQ=n
avg.: 41.37 s
std.dev.: 0.36 s
* CONFIG_RSEQ=y
avg.: 40.46 s
std.dev.: 0.33 s
- Size
On x86-64, between CONFIG_RSEQ=n/y, the text size increase of vmlinux is
567 bytes, and the data size increase of vmlinux is 5696 bytes.
[1] https://lwn.net/Articles/650333/
[2] http://www.linuxplumbersconf.org/2013/ocw/system/presentations/1695/original/LPC%20-%20PerCpu%20Atomics.pdf
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/20151027235635.16059.11630.stgit@pjt-glaptop.roam.corp.google.com
Link: http://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Provide helper macros for fields which represent pointers in
kernel-userspace ABI. This facilitates handling of 32-bit
user-space by 64-bit kernels by defining those fields as
32-bit 0-padding and 32-bit integer on 32-bit architectures,
which allows the kernel to treat those as 64-bit integers.
The order of padding and 32-bit integer depends on the
endianness.
Signed-off-by: Mathieu Desnoyers <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Hunter <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
There is a typo in f1cb8f9beb ("powerpc/64s/radix: avoid ptesync after
set_pte and ptep_set_access_flags") config ifdef, which results in the
necessary ptesync not being issued after vmalloc.
This causes random kernel faults in module load, bpf load, anywhere
that vmalloc mappings are used.
After correcting the code, this survives a guest kernel booting
hundreds of times where previously there would be a crash every few
boots (I haven't noticed the crash on host, perhaps due to different
TLB and page table walking behaviour in hardware).
A memory clobber is also added to the flush, just to be sure it won't
be reordered with the pte set or the subsequent mapping access.
Fixes: f1cb8f9beb ("powerpc/64s/radix: avoid ptesync after set_pte and ptep_set_access_flags")
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Pull workqueue updates from Tejun Heo:
- make kworkers report the workqueue it is executing or has executed
most recently in /proc/PID/comm (so they show up in ps/top)
- CONFIG_SMP shuffle to move stuff which isn't necessary for UP builds
inside CONFIG_SMP.
* 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: move function definitions within CONFIG_SMP block
workqueue: Make sure struct worker is accessible for wq_worker_comm()
workqueue: Show the latest workqueue name in /proc/PID/{comm,stat,status}
proc: Consolidate task->comm formatting into proc_task_name()
workqueue: Set worker->desc to workqueue name by default
workqueue: Make worker_attach/detach_pool() update worker->pool
workqueue: Replace pool->attach_mutex with global wq_pool_attach_mutex
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- For cpustat, cgroup has a percpu hierarchical stat mechanism which
propagates up the hierarchy lazily.
This contains commits to factor out and generalize the mechanism so
that it can be used for other cgroup stats too.
The original intention was to update memcg stats to use it but memcg
went for a different approach, so still the only user is cpustat. The
factoring out and generalization still make sense and it's likely
that this can be used for other purposes in the future.
- cgroup uses kernfs_notify() (which uses fsnotify()) to inform user
space of certain events. A rate limiting mechanism is added.
- Other misc changes.
* 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: css_set_lock should nest inside tasklist_lock
rdmacg: Convert to use match_string() helper
cgroup: Make cgroup_rstat_updated() ready for root cgroup usage
cgroup: Add memory barriers to plug cgroup_rstat_updated() race window
cgroup: Add cgroup_subsys->css_rstat_flush()
cgroup: Replace cgroup_rstat_mutex with a spinlock
cgroup: Factor out and expose cgroup_rstat_*() interface functions
cgroup: Reorganize kernel/cgroup/rstat.c
cgroup: Distinguish base resource stat implementation from rstat
cgroup: Rename stat to rstat
cgroup: Rename kernel/cgroup/stat.c to kernel/cgroup/rstat.c
cgroup: Limit event generation frequency
cgroup: Explicitly remove core interface files
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata updates from Tejun Heo:
- libata has always been limiting the maximum queue depth to 31, with
one entry set aside mostly for historical reasons. This didn't use to
make much difference but Jens found out that modern hard drives can
actually perform measurably better with the extra one queue depth.
Jens updated libata core so that it can make use of full 32 queue
depth
- Damien updated command retry logic in error handling so that it
doesn't unnecessarily retry when upper layer (SCSI) is gonna handle
them
- A couple misc changes
* 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
sata_fsl: use the right type for tag bitshift
ahci: enable full queue depth of 32
libata: don't clamp queue depth to ATA_MAX_QUEUE - 1
libata: add extra internal command
sata_nv: set host can_queue count appropriately
libata: remove assumption that ATA_MAX_QUEUE - 1 is the max
libata: use ata_tag_internal() consistently
libata: bump ->qc_active to a 64-bit type
libata: convert core and drivers to ->hw_tag usage
libata: introduce notion of separate hardware tags
libata: Fix command retry decision
libata: Honor RQF_QUIET flag
libata: Make ata_dev_set_mode() less verbose
libata: Fix ata_err_string()
libata: Fix comment typo in ata_eh_analyze_tf()
sata_nv: don't use block layer bounce buffer
ata: hpt37x: Convert to use match_string() helper
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"These two are fixes which missed v4.17.
One is to remove an incorrect power management blacklist entry and the
other to fix a cdb buffer overrun which has been there for a very long
time"
* 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
libata: Drop SanDisk SD7UB3Q*G1001 NOLPM quirk
libata: zpodd: small read overflow in eject_tray()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the big tty/serial driver update for 4.18-rc1.
There's nothing major here, just lots of serial driver updates. Full
details are in the shortlog, nothing anything specific to call out
here.
All have been in linux-next for a while with no reported issues"
* tag 'tty-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (55 commits)
vt: Perform safe console erase only once
serial: imx: disable UCR4_OREN on shutdown
serial: imx: drop CTS/RTS handling from shutdown
tty: fix typo in ASYNCB_FOURPORT comment
serial: samsung: check DMA engine capabilities before using DMA mode
tty: Fix data race in tty_insert_flip_string_fixed_flag
tty: serial: msm_geni_serial: Fix TX infinite loop
serial: 8250_dw: Fix runtime PM handling
serial: 8250: omap: Fix idling of clocks for unused uarts
tty: serial: drop ATH79 specific SoC symbols
serial: 8250: Add missing rxtrig_bytes on Altera 16550 UART
serial/aspeed-vuart: fix a couple mod_timer() calls
serial: sh-sci: Use spin_{try}lock_irqsave instead of open coding version
serial: 8250_of: Add IO space support
tty/serial: atmel: use port->name as name in request_irq()
serial: imx: dma_unmap_sg buffers on shutdown
serial: imx: cleanup imx_uart_disable_dma()
tty: serial: qcom_geni_serial: Add early console support
tty: serial: qcom_geni_serial: Return IRQ_NONE for spurious interrupts
tty: serial: qcom_geni_serial: Use iowrite32_rep to write to FIFO
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the driver core patchset for 4.18-rc1.
The large chunk of these are firmware core documentation and api
updates. Nothing major there, just better descriptions for others to
be able to understand the firmware code better. There's also a user
for a new firmware api call.
Other than that, there are some minor updates for debugfs, kernfs, and
the driver core itself.
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (23 commits)
driver core: hold dev's parent lock when needed
driver-core: return EINVAL error instead of BUG_ON()
driver core: add __printf verification to device_create_groups_vargs
mm: memory_hotplug: use put_device() if device_register fail
base: core: fix typo 'can by' to 'can be'
debugfs: inode: debugfs_create_dir uses mode permission from parent
debugfs: Re-use kstrtobool_from_user()
Documentation: clarify firmware_class provenance and why we can't rename the module
Documentation: remove stale firmware API reference
Documentation: fix few typos and clarifications for the firmware loader
ath10k: re-enable the firmware fallback mechanism for testmode
ath10k: use firmware_request_nowarn() to load firmware
firmware: add firmware_request_nowarn() - load firmware without warnings
firmware_loader: make firmware_fallback_sysfs() print more useful
firmware_loader: move kconfig FW_LOADER entries to its own file
firmware_loader: replace ---help--- with help
firmware_loader: enhance Kconfig documentation over FW_LOADER
firmware_loader: document firmware_sysfs_fallback()
firmware: rename fw_sysfs_fallback to firmware_fallback_sysfs()
firmware: use () to terminate kernel-doc function names
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the "big" char and misc driver patches for 4.18-rc1.
It's not a lot of stuff here, but there are some highlights:
- coreboot driver updates
- soundwire driver updates
- android binder updates
- fpga big sync, mostly documentation
- lots of minor driver updates
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (81 commits)
vmw_balloon: fixing double free when batching mode is off
MAINTAINERS: Add driver-api/fpga path
fpga: clarify that unregister functions also free
documentation: fpga: move fpga-region.txt to driver-api
documentation: fpga: add bridge document to driver-api
documentation: fpga: move fpga-mgr.txt to driver-api
Documentation: fpga: move fpga overview to driver-api
fpga: region: kernel-doc fixes
fpga: bridge: kernel-doc fixes
fpga: mgr: kernel-doc fixes
fpga: use SPDX
fpga: region: change api, add fpga_region_create/free
fpga: bridge: change api, don't use drvdata
fpga: manager: change api, don't use drvdata
fpga: region: don't use drvdata in common fpga code
Drivers: hv: vmbus: Removed an unnecessary cast from void *
ver_linux: Drop redundant calls to system() to test if file is readable
ver_linux: Move stderr redirection from function parameter to function body
misc: IBM Virtual Management Channel Driver (VMC)
rpmsg: Correct support for MODULE_DEVICE_TABLE()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB and PHY updates from Greg KH:
"Here is the big USB pull request for 4.18-rc1.
Lots of stuff here, the highlights are:
- phy driver updates and new additions
- usual set of xhci driver updates
- normal set of musb updates
- gadget driver updates and new controllers
- typec work, it's getting closer to getting fully out of the staging
portion of the tree.
- lots of minor cleanups and bugfixes.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (263 commits)
Revert "xhci: Reset Renesas uPD72020x USB controller for 32-bit DMA issue"
xhci: Add quirk to zero 64bit registers on Renesas PCIe controllers
xhci: Allow more than 32 quirks
usb: xhci: force all memory allocations to node
selftests: add test for USB over IP driver
USB: typec: fsusb302: no need to check return value of debugfs_create_dir()
USB: gadget: udc: s3c2410_udc: no need to check return value of debugfs_create functions
USB: gadget: udc: renesas_usb3: no need to check return value of debugfs_create functions
USB: gadget: udc: pxa27x_udc: no need to check return value of debugfs_create functions
USB: gadget: udc: gr_udc: no need to check return value of debugfs_create functions
USB: gadget: udc: bcm63xx_udc: no need to check return value of debugfs_create functions
USB: udc: atmel_usba_udc: no need to check return value of debugfs_create functions
USB: dwc3: no need to check return value of debugfs_create functions
USB: dwc2: no need to check return value of debugfs_create functions
USB: core: no need to check return value of debugfs_create functions
USB: chipidea: no need to check return value of debugfs_create functions
USB: ehci-hcd: no need to check return value of debugfs_create functions
USB: fhci-hcd: no need to check return value of debugfs_create functions
USB: fotg210-hcd: no need to check return value of debugfs_create functions
USB: imx21-hcd: no need to check return value of debugfs_create functions
...
|
|
Pull MMC updates from Ulf Hansson:
"MMC core:
- Decrease polling rate for erase/trim/discard
- Allow non-sleeping GPIOs for card detect
- Improve mmc block removal path
- Enable support for mmc_sw_reset() for SDIO cards
- Add mmc_sw_reset() to allow users to do a soft reset of the card
- Allow power delay to be tunable via DT
- Allow card detect debounce delay to be tunable via DT
- Enable new quirk to limit clock rate for Marvell 8887 chip
- Don't show eMMC RPMB and BOOT areas in /proc/partitions
- Add capability to avoid 3.3V signaling for fragile HWs
MMC host:
- Improve/fixup support for handle highmem pages
- Remove depends on HAS_DMA in case of platform dependency
- mvsdio: Enable support for erase/trim/discard
- rtsx_usb: Enable support for erase/trim/discard
- renesas_sdhi: Fix WP logic regressions
- renesas_sdhi: Add r8a77965 support
- renesas_sdhi: Add R8A77980 to whitelist
- meson: Add optional support for device reset
- meson: Add support for the Meson-AXG platform
- dw_mmc: Add new driver for BlueField DW variant
- mediatek: Add support for 64G DRAM DMA
- sunxi: Deploy runtime PM support
- jz4740: Add support for JZ4780
- jz4740: Enable support for DT based platforms
- sdhci: Various improvement to timeout handling
- sdhci: Disable support for HS200/HS400/UHS when no 1.8V support
- sdhci-omap: Add support for controller in k2g SoC
- sdhci-omap: Add workarounds for a couple of Erratas
- sdhci-omap: Enable support for generic sdhci DT properties
- sdhci-cadence: Re-send tune request to deal with errata
- sdhci-pci: Fix 3.3V voltage switch for some BYT-based Intel controllers
- sdhci-pci: Avoid 3.3V signaling on some NI 904x
- sdhci-esdhc-imx: Use watermark levels for PIO access
- sdhci-msm: Improve card detection handling
- sdhci-msm: Add support voltage pad switching"
* tag 'mmc-v4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (104 commits)
mmc: renesas_sdhi: really fix WP logic regressions
mmc: mvsdio: Enable MMC_CAP_ERASE
mmc: mvsdio: Respect card busy time out from mmc core
mmc: sdhci-msm: Remove NO_CARD_NO_RESET quirk
mmc: sunxi: Use ifdef rather than __maybe_unused
mmc: mxmmc: Use ifdef rather than __maybe_unused
mmc: mxmmc: include linux/highmem.h
mmc: sunxi: mark PM functions as __maybe_unused
mmc: Throttle calls to MMC_SEND_STATUS during mmc_do_erase()
mmc: au1xmmc: handle highmem pages
mmc: Allow non-sleeping GPIO cd
mmc: sdhci-*: Don't emit error msg if sdhci_add_host() fails
mmc: sd: Define name for default speed dtr
mmc: core: Move calls to ->prepare_hs400_tuning() closer to mmc code
mmc: sdhci-xenon: use match_string() helper
mmc: wbsd: handle highmem pages
mmc: ushc: handle highmem pages
mmc: mxcmmc: handle highmem pages
mmc: atmel-mci: use sg_copy_{from,to}_buffer
mmc: android-goldfish: use sg_copy_{from,to}_buffer
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds
Pull LED updates from Jacek Anaszewski:
"This was quite a fruitful cycle, taking into account usual traffic on
linux-leds list, as we managed to merge three new LED class drivers.
New LED class drivers with related DT bindings:
- add LED driver for CR0014114 board
- add Spreadtrum SC27xx breathing light controller driver
- introduce the lm3601x LED driver
LED class fix:
- ensure workqueue is initialized before setting brightness
Improvements and fixes to existing LED class drivers:
- fix return value check in sc27xx_led_probe()
- use sysfs_match_string() helper in wm831x_status_src_store()"
* tag 'leds_for_4.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: class: ensure workqueue is initialized before setting brightness
leds: lm3601x: Introduce the lm3601x LED driver
dt: bindings: lm3601x: Introduce the lm3601x driver
leds: sc27xx: Fix return value check in sc27xx_led_probe()
leds: Add Spreadtrum SC27xx breathing light controller driver
dt-bindings: leds: Add SC27xx breathing light controller documentation
leds: wm831x-status: Use sysfs_match_string() helper
leds: add LED driver for CR0014114 board
dt-bindings: Add vendor prefix and docs for CR0014114
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- a new driver to ChipOne icn8505 based touchscreens
- on certain systems with Elan touch controllers they will be switched
away form PS/2 emulation and over to native SMbus mode
- assorted driver fixups and improvements
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (24 commits)
Input: elan_i2c - add ELAN0612 (Lenovo v330 14IKB) ACPI ID
Input: goodix - add new ACPI id for GPD Win 2 touch screen
Input: xpad - add GPD Win 2 Controller USB IDs
Input: ti_am335x_tsc - prevent system suspend when TSC is in use
Input: ti_am335x_tsc - ack pending IRQs at probe and before suspend
Input: cros_ec_keyb - mark cros_ec_keyb driver as wake enabled device.
Input: mk712 - update documentation web link
Input: atmel_mxt_ts - fix reset-gpio for level based irqs
Input: atmel_mxt_ts - require device properties present when probing
Input: psmouse-smbus - allow to control psmouse_deactivate
Input: elantech - detect new ICs and setup Host Notify for them
Input: elantech - add support for SMBus devices
Input: elantech - query the resolution in query_info
Input: elantech - split device info into a separate structure
Input: elan_i2c - add trackstick report
Input: usbtouchscreen - add sysfs attribute for 3M MTouch firmware rev
Input: ati_remote2 - fix typo 'can by' to 'can be'
Input: replace hard coded string with __func__ in pr_err()
Input: add support for ChipOne icn8505 based touchscreens
Input: gamecon - avoid using __set_bit() for capabilities
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Decryption test vectors are now automatically generated from
encryption test vectors.
Algorithms:
- Fix unaligned access issues in crc32/crc32c.
- Add zstd compression algorithm.
- Add AEGIS.
- Add MORUS.
Drivers:
- Add accelerated AEGIS/MORUS on x86.
- Add accelerated SM4 on arm64.
- Removed x86 assembly salsa implementation as it is slower than C.
- Add authenc(hmac(sha*), cbc(aes)) support in inside-secure.
- Add ctr(aes) support in crypto4xx.
- Add hardware key support in ccree.
- Add support for new Centaur CPU in via-rng"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (112 commits)
crypto: chtls - free beyond end rspq_skb_cache
crypto: chtls - kbuild warnings
crypto: chtls - dereference null variable
crypto: chtls - wait for memory sendmsg, sendpage
crypto: chtls - key len correction
crypto: salsa20 - Revert "crypto: salsa20 - export generic helpers"
crypto: x86/salsa20 - remove x86 salsa20 implementations
crypto: ccp - Add GET_ID SEV command
crypto: ccp - Add DOWNLOAD_FIRMWARE SEV command
crypto: qat - Add MODULE_FIRMWARE for all qat drivers
crypto: ccree - silence debug prints
crypto: ccree - better clock handling
crypto: ccree - correct host regs offset
crypto: chelsio - Remove separate buffer used for DMA map B0 block in CCM
crypt: chelsio - Send IV as Immediate for cipher algo
crypto: chelsio - Return -ENOSPC for transient busy indication.
crypto: caam/qi - fix warning in init_cgr()
crypto: caam - fix rfc4543 descriptors
crypto: caam - fix MC firmware detection
crypto: clarify licensing of OpenSSL asm code
...
|
|
Clarify that binding patches should also include include/dt-bindings/*
as part of them. The binding doc defines the ABI and the includes are
part of that. Add some details on the preferred subject prefix and
contents.
Signed-off-by: Rob Herring <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt
Pull fscrypt updates from Ted Ts'o:
"Add bunch of cleanups, and add support for the Speck128/256
algorithms.
Yes, Speck is contrversial, but the intention is to use them only for
the lowest end Android devices, where the alternative *really* is no
encryption at all for data stored at rest"
* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt:
fscrypt: log the crypto algorithm implementations
fscrypt: add Speck128/256 support
fscrypt: only derive the needed portion of the key
fscrypt: separate key lookup from key derivation
fscrypt: use a common logging function
fscrypt: remove internal key size constants
fscrypt: remove unnecessary check for non-logon key type
fscrypt: make fscrypt_operations.max_namelen an integer
fscrypt: drop empty name check from fname_decrypt()
fscrypt: drop max_namelen check from fname_decrypt()
fscrypt: don't special-case EOPNOTSUPP from fscrypt_get_encryption_info()
fscrypt: don't clear flags on crypto transform
fscrypt: remove stale comment from fscrypt_d_revalidate()
fscrypt: remove error messages for skcipher_request_alloc() failure
fscrypt: remove unnecessary NULL check when allocating skcipher
fscrypt: clean up after fscrypt_prepare_lookup() conversions
fs, fscrypt: only define ->s_cop when FS_ENCRYPTION is enabled
fscrypt: use unbound workqueue for decryption
|