Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
"Misc cleanups all around the place"
* tag 'x86-cleanups-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ioperm: Initialize pointer bitmap with NULL rather than 0
x86: uv: uv_hub.h: Delete duplicated word
x86: cmpxchg_32.h: Delete duplicated word
x86: bootparam.h: Delete duplicated word
x86/mm: Remove the unused mk_kernel_pgd() #define
x86/tsc: Remove unused "US_SCALE" and "NS_SCALE" leftover macros
x86/ioapic: Remove unused "IOAPIC_AUTO" define
x86/mm: Drop unused MAX_PHYSADDR_BITS
x86/msr: Move the F15h MSRs where they belong
x86/idt: Make idt_descr static
initrd: Remove erroneous comment
x86/mm/32: Fix -Wmissing prototypes warnings for init.c
cpu/speculation: Add prototype for cpu_show_srbds()
x86/mm: Fix -Wmissing-prototypes warnings for arch/x86/mm/init.c
x86/asm: Unify __ASSEMBLY__ blocks
x86/cpufeatures: Mark two free bits in word 3
x86/msr: Lift AMD family 0x15 power-specific MSRs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/alternatives update from Ingo Molnar:
"A single commit that improves the alternatives patching syslog debug
output"
* tag 'x86-alternatives-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives: Add pr_fmt() to debug macros
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Improve uclamp performance by using a static key for the fast path
- Add the "sched_util_clamp_min_rt_default" sysctl, to optimize for
better power efficiency of RT tasks on battery powered devices.
(The default is to maximize performance & reduce RT latencies.)
- Improve utime and stime tracking accuracy, which had a fixed boundary
of error, which created larger and larger relative errors as the
values become larger. This is now replaced with more precise
arithmetics, using the new mul_u64_u64_div_u64() helper in math64.h.
- Improve the deadline scheduler, such as making it capacity aware
- Improve frequency-invariant scheduling
- Misc cleanups in energy/power aware scheduling
- Add sched_update_nr_running tracepoint to track changes to nr_running
- Documentation additions and updates
- Misc cleanups and smaller fixes
* tag 'sched-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
sched/doc: Factorize bits between sched-energy.rst & sched-capacity.rst
sched/doc: Document capacity aware scheduling
sched: Document arch_scale_*_capacity()
arm, arm64: Fix selection of CONFIG_SCHED_THERMAL_PRESSURE
Documentation/sysctl: Document uclamp sysctl knobs
sched/uclamp: Add a new sysctl to control RT default boost value
sched/uclamp: Fix a deadlock when enabling uclamp static key
sched: Remove duplicated tick_nohz_full_enabled() check
sched: Fix a typo in a comment
sched/uclamp: Remove unnecessary mutex_init()
arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE
sched: Cleanup SCHED_THERMAL_PRESSURE kconfig entry
arch_topology, sched/core: Cleanup thermal pressure definition
trace/events/sched.h: fix duplicated word
linux/sched/mm.h: drop duplicated words in comments
smp: Fix a potential usage of stale nr_cpus
sched/fair: update_pick_idlest() Select group with lowest group_util when idle_cpus are equal
sched: nohz: stop passing around unused "ticks" parameter.
sched: Better document ttwu()
sched: Add a tracepoint to track rq->nr_running
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event updates from Ingo Molnar:
"HW support updates:
- Add uncore support for Intel Comet Lake
- Add RAPL support for Hygon Fam18h
- Add Intel "IIO stack to PMON mapping" support on Skylake-SP CPUs,
which enumerates per device performance counters via sysfs and
enables the perf stat --iiostat functionality
- Add support for Intel "Architectural LBRs", which generalized the
model specific LBR hardware tracing feature into a
model-independent, architected performance monitoring feature.
Usage is mostly seamless to tooling, as the pre-existing LBR
features are kept, but there's a couple of advantages under the
hood, such as faster context-switching, faster LBR reads, cleaner
exposure of LBR features to guest kernels, etc.
( Since architectural LBRs are supported via XSAVE, there's related
changes to the x86 FPU code as well. )
ftrace/perf updates:
- Add support to add a text poke event to record changes to kernel
text (i.e. self-modifying code) in order to support tracers like
Intel PT decoding through jump labels, kprobes and ftrace
trampolines.
Misc cleanups, smaller fixes..."
* tag 'perf-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
perf/x86/rapl: Add Hygon Fam18h RAPL support
kprobes: Remove unnecessary module_mutex locking from kprobe_optimizer()
x86/perf: Fix a typo
perf: <linux/perf_event.h>: drop a duplicated word
perf/x86/intel/lbr: Support XSAVES for arch LBR read
perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch
x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature
x86/fpu/xstate: Support dynamic supervisor feature for LBR
x86/fpu: Use proper mask to replace full instruction mask
perf/x86: Remove task_ctx_size
perf/x86/intel/lbr: Create kmem_cache for the LBR context data
perf/core: Use kmem_cache to allocate the PMU specific data
perf/core: Factor out functions to allocate/free the task_ctx_data
perf/x86/intel/lbr: Support Architectural LBR
perf/x86/intel/lbr: Factor out intel_pmu_store_lbr
perf/x86/intel/lbr: Factor out rdlbr_all() and wrlbr_all()
perf/x86/intel/lbr: Mark the {rd,wr}lbr_{to,from} wrappers __always_inline
perf/x86/intel/lbr: Unify the stored format of LBR information
perf/x86/intel/lbr: Support LBR_CTL
perf/x86: Expose CPUID enumeration bits for arch LBR
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- LKMM updates: mostly documentation changes, but also some new litmus
tests for atomic ops.
- KCSAN updates: the most important change is that GCC 11 now has all
fixes in place to support KCSAN, so GCC support can be enabled again.
Also more annotations.
- futex updates: minor cleanups and simplifications
- seqlock updates: merge preparatory changes/cleanups for the
'associated locks' facilities.
- lockdep updates:
- simplify IRQ trace event handling
- add various new debug checks
- simplify header dependencies, split out <linux/lockdep_types.h>,
decouple lockdep from other low level headers some more
- fix NMI handling
- misc cleanups and smaller fixes
* tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
kcsan: Improve IRQ state trace reporting
lockdep: Refactor IRQ trace events fields into struct
seqlock: lockdep assert non-preemptibility on seqcount_t write
lockdep: Add preemption enabled/disabled assertion APIs
seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount()
seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs
seqlock: Reorder seqcount_t and seqlock_t API definitions
seqlock: seqcount_t latch: End read sections with read_seqcount_retry()
seqlock: Properly format kernel-doc code samples
Documentation: locking: Describe seqlock design and usage
locking/qspinlock: Do not include atomic.h from qspinlock_types.h
locking/atomic: Move ATOMIC_INIT into linux/types.h
lockdep: Move list.h inclusion into lockdep.h
locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs
futex: Remove unused or redundant includes
futex: Consistently use fshared as boolean
futex: Remove needless goto's
futex: Remove put_futex_key()
rwsem: fix commas in initialisation
docs: locking: Replace HTTP links with HTTPS ones
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Fix a recent IRQ affinities regression, add in a missing debugfs
printout that helps the debugging of IRQ affinity logic bugs, and fix
a memory leak"
* tag 'irq-urgent-2020-08-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/debugfs: Add missing irqchip flags
genirq/affinity: Make affinity setting if activated opt-in
irqdomain/treewide: Free firmware node after domain removal
|
|
Conflicts:
arch/arm/include/asm/percpu.h
As Stephen Rothwell noted, there's a conflict between this commit
in locking/core:
a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
and this fresh upstream commit:
aa54ea903abb ("ARM: percpu.h: fix build error")
a21ee6055c30 is a simpler solution to the dependency problem and doesn't
further increase header hell - so this conflict resolution effectively
reverts aa54ea903abb and uses the a21ee6055c30 solution.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Remove the special handling for multiple floppies in the initrd code.
No one should be using floppies for booting these days. (famous last
words..)
Includes a spelling fix from Colin Ian King <[email protected]>.
Signed-off-by: Christoph Hellwig <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
|
|
0day reported a possible circular locking dependency:
Chain exists of:
&irq_desc_lock_class --> console_owner --> &port_lock_key
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&port_lock_key);
lock(console_owner);
lock(&port_lock_key);
lock(&irq_desc_lock_class);
The reason for this is a printk() in the i8259 interrupt chip driver
which is invoked with the irq descriptor lock held, which reverses the
lock operations vs. printk() from arbitrary contexts.
Switch the printk() to printk_deferred() to avoid that.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
All instances of ->get() in arch/x86 switched; that might or might
not be worth splitting up. Notes:
* for xstateregs_get() the amount we want to store is determined at
the boot time; see init_xstate_size() and update_regset_xstate_info() for
details. task->thread.fpu.state.xsave ends with a flexible array member and
the amount of data in it depends upon the FPU features supported/enabled.
* fpregs_get() writes slightly less than full ->thread.fpu.state.fsave
(the last word is not copied); we pass the full size of state.fsave and let
membuf_write() trim to the amount declared by regset - __regset_get() will
make sure that the space in buffer is no more than that.
* copy_xstate_to_user() and its helpers are gone now.
* fpregs_soft_get() was getting user_regset_copyout() arguments
wrong. Since "x86: x86 user_regset math_emu" back in 2008... I really
doubt that it's worth splitting out for -stable, though - you need
a 486SX box for that to trigger...
[Kevin's braino fix for copy_xstate_to_kernel() essentially duplicated here]
Signed-off-by: Al Viro <[email protected]>
|
|
John reported that on a RK3288 system the perf per CPU interrupts are all
affine to CPU0 and provided the analysis:
"It looks like what happens is that because the interrupts are not per-CPU
in the hardware, armpmu_request_irq() calls irq_force_affinity() while
the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
IRQF_NOBALANCING.
Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
irq_setup_affinity() which returns early because IRQF_PERCPU and
IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."
This was broken by the recent commit which blocked interrupt affinity
setting in hardware before activation of the interrupt. While this works in
general, it does not work for this particular case. As contrary to the
initial analysis not all interrupt chip drivers implement an activate
callback, the safe cure is to make the deferred interrupt affinity setting
at activation time opt-in.
Implement the necessary core logic and make the two irqchip implementations
for which this is required opt-in. In hindsight this would have been the
right thing to do, but ...
Fixes: baedb87d1b53 ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
Reported-by: John Keeping <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Marc Zyngier <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Having sync_core() in processor.h is problematic since it is not possible
to check for hardware capabilities via the *cpu_has() family of macros.
The latter needs the definitions in processor.h.
It also looks more intuitive to relocate the function to sync_core.h.
This changeset does not make changes in functionality.
Signed-off-by: Ricardo Neri <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Refresh the branch for a dependent commit.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Resolve conflicts with ongoing lockdep work that fixed the NMI entry code.
Conflicts:
arch/x86/entry/common.c
arch/x86/include/asm/idtentry.h
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Lake CPUs
Add Sapphire Rapids and Alder Lake processors to CPU list to enumerate
and enable the split lock feature.
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
commits
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Remove the temporary defines and fixup all references.
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Cleanup the temporary defines and use irqentry_ instead of idtentry_.
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Replace the x86 variant with the generic version. Provide the relevant
architecture specific helper functions and defines.
Use a temporary define for idtentry_exit_user which will be cleaned up
seperately.
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Pick up generic entry code to migrate x86 over.
|
|
The noinstr qualifier is to be specified before the return type in the
same way inline is used.
These 2 cases were missed by previous patches.
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
I've got a silent conflict + two trees based on fixes to merge.
Fixes a silent merge with amdgpu
Signed-off-by: Dave Airlie <[email protected]>
|
|
Commit 711419e504eb ("irqdomain: Add the missing assignment of
domain->fwnode for named fwnode") unintentionally caused a dangling pointer
page fault issue on firmware nodes that were freed after IRQ domain
allocation. Commit e3beca48a45b fixed that dangling pointer issue by only
freeing the firmware node after an IRQ domain allocation failure. That fix
no longer frees the firmware node immediately, but leaves the firmware node
allocated after the domain is removed.
The firmware node must be kept around through irq_domain_remove, but should be
freed it afterwards.
Add the missing free operations after domain removal where where appropriate.
Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated")
Signed-off-by: Jon Derrick <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]> # drivers/pci
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.
Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).
Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).
After all preparations are done, provide log_lvl parameter for
show_regs_if_on_stack() and wire up to actual log level used as
an argument for show_trace_log_lvl().
[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: https://lore.kernel.org/linux-doc/[email protected]/
Signed-off-by: Dmitry Safonov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Petr Mladek <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.
Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).
Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).
Add log_lvl parameter to __show_regs().
Keep the used log level intact to separate visible change.
[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: https://lore.kernel.org/linux-doc/[email protected]/
Signed-off-by: Dmitry Safonov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Petr Mladek <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
show_trace_log_lvl() provides x86 platform-specific way to unwind
backtrace with a given log level. Unfortunately, registers dump(s) are
not printed with the same log level - instead, KERN_DEFAULT is always
used.
Arista's switches uses quite common setup with rsyslog, where only
urgent messages goes to console (console_log_level=KERN_ERR), everything
else goes into /var/log/ as the console baud-rate often is indecently
slow (9600 bps).
Backtrace dumps without registers printed have proven to be as useful as
morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED
(which I believe is still the most elegant way to fix raciness of sysrq[1])
the log level should be passed down the stack to register dumping
functions. Besides, there is a potential use-case for printing traces
with KERN_DEBUG level [2] (where registers dump shouldn't appear with
higher log level).
Add log_lvl parameter to show_iret_regs() as a preparation to add it
to __show_regs() and show_regs_if_on_stack().
[1]: https://lore.kernel.org/lkml/[email protected]/
[2]: https://lore.kernel.org/linux-doc/[email protected]/
Signed-off-by: Dmitry Safonov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Petr Mladek <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
H.J. reported that post 5.7 a segfault of a user space task does not longer
dump the Code bytes when /proc/sys/debug/exception-trace is enabled. It
prints 'Code: Bad RIP value.' instead.
This was broken by a recent change which made probe_kernel_read() reject
non-kernel addresses.
Update show_opcodes() so it retrieves user space opcodes via
copy_from_user_nmi().
Fixes: 98a23609b103 ("maccess: always use strict semantics for probe_kernel_read")
Reported-by: H.J. Lu <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
If a user task's stack is empty, or if it only has user regs, ORC
reports it as a reliable empty stack. But arch_stack_walk_reliable()
incorrectly treats it as unreliable.
That happens because the only success path for user tasks is inside the
loop, which only iterates on non-empty stacks. Generally, a user task
must end in a user regs frame, but an empty stack is an exception to
that rule.
Thanks to commit 71c95825289f ("x86/unwind/orc: Fix error handling in
__unwind_start()"), unwind_start() now sets state->error appropriately.
So now for both ORC and FP unwinders, unwind_done() and !unwind_error()
always means the end of the stack was successfully reached. So the
success path for kthreads is no longer needed -- it can also be used for
empty user tasks.
Reported-by: Wang ShaoBo <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Wang ShaoBo <[email protected]>
Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
|
|
The ORC unwinder fails to unwind newly forked tasks which haven't yet
run on the CPU. It correctly reads the 'ret_from_fork' instruction
pointer from the stack, but it incorrectly interprets that value as a
call stack address rather than a "signal" one, so the address gets
incorrectly decremented in the call to orc_find(), resulting in bad ORC
data.
Fix it by forcing 'ret_from_fork' frames to be signal frames.
Reported-by: Wang ShaoBo <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Wang ShaoBo <[email protected]>
Link: https://lkml.kernel.org/r/f91a8778dde8aae7f71884b5df2b16d552040441.1594994374.git.jpoimboe@redhat.com
|
|
|
|
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.
As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.
This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.
Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.
Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.
Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.
[ tglx: Amended changelog ]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
To allow the kernel not to play games with set_fs to call exec
implement kernel_execve. The function kernel_execve takes pointers
into kernel memory and copies the values pointed to onto the new
userspace stack.
The calls with arguments from kernel space of do_execve are replaced
with calls to kernel_execve.
The calls do_execve and do_execveat are made static as there are now
no callers outside of exec.
The comments that mention do_execve are updated to refer to
kernel_execve or execve depending on the circumstances. In addition
to correcting the comments, this makes it easy to grep for do_execve
and verify it is not used.
Inspired-by: https://lkml.kernel.org/r/[email protected]
Reviewed-by: Kees Cook <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
This fixes a regression encountered while running the
gdb.base/corefile.exp test in GDB's test suite.
In my testing, the typo prevented the sw_reserved field of struct
fxregs_state from being output to the kernel XSAVES area. Thus the
correct mask corresponding to XCR0 was not present in the core file for
GDB to interrogate, resulting in the following behavior:
[kev@f32-1 gdb]$ ./gdb -q testsuite/outputs/gdb.base/corefile/corefile testsuite/outputs/gdb.base/corefile/corefile.core
Reading symbols from testsuite/outputs/gdb.base/corefile/corefile...
[New LWP 232880]
warning: Unexpected size of section `.reg-xstate/232880' in core file.
With the typo fixed, the test works again as expected.
Signed-off-by: Kevin Buettner <[email protected]>
Fixes: 9e4636545933 ("copy_xstate_to_kernel(): don't leave parts of destination uninitialized")
Cc: Al Viro <[email protected]>
Cc: Dave Airlie <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull x86 fixes from Thomas Gleixner:
"A pile of fixes for x86:
- Fix the I/O bitmap invalidation on XEN PV, which was overlooked in
the recent ioperm/iopl rework. This caused the TSS and XEN's I/O
bitmap to get out of sync.
- Use the proper vectors for HYPERV.
- Make disabling of stack protector for the entry code work with GCC
builds which enable stack protector by default. Removing the option
is not sufficient, it needs an explicit -fno-stack-protector to
shut it off.
- Mark check_user_regs() noinstr as it is called from noinstr code.
The missing annotation causes it to be placed in the text section
which makes it instrumentable.
- Add the missing interrupt disable in exc_alignment_check()
- Fixup a XEN_PV build dependency in the 32bit entry code
- A few fixes to make the Clang integrated assembler happy
- Move EFI stub build to the right place for out of tree builds
- Make prepare_exit_to_usermode() static. It's not longer called from
ASM code"
* tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Don't add the EFI stub to targets
x86/entry: Actually disable stack protector
x86/ioperm: Fix io bitmap invalidation on Xen PV
x86: math-emu: Fix up 'cmp' insn for clang ias
x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
x86/entry: Add compatibility with IAS
x86/entry/common: Make prepare_exit_to_usermode() static
x86/entry: Mark check_user_regs() noinstr
x86/traps: Disable interrupts in exc_aligment_check()
x86/entry/32: Fix XEN_PV build dependency
|
|
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop
machinery, so the TSS and Xen's io bitmap would get out of sync
whenever disabling a valid io bitmap.
Add a new pvop for tss_invalidate_io_bitmap() to fix it.
This is XSA-329.
Fixes: 22fe5b0439dd ("x86/ioperm: Move TSS bitmap update to exit to user work")
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
|
|
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.
X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.
Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:
- Store it in the irq descriptors affinity mask
- Update the effective affinity to reflect that so user space has
a consistent view
- Don't call into the irq chip driver
This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.
Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.
For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.
Remove the X86 workaround as it is not longer required.
Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Ali Saidi <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
In removing UV1 support, efi_have_uv1_memmap is no longer used.
Signed-off-by: Steve Wahl <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
UV1 is not longer supported.
Signed-off-by: Steve Wahl <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.
In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:
git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
xargs perl -pi -e \
's/\buninitialized_var\(([^\)]+)\)/\1/g;
s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'
drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.
No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.
[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/
Reviewed-by: Leon Romanovsky <[email protected]> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <[email protected]> # IB
Acked-by: Kalle Valo <[email protected]> # wireless drivers
Reviewed-by: Chao Yu <[email protected]> # erofs
Signed-off-by: Kees Cook <[email protected]>
|
|
Quite some non OF/ACPI users of irqdomains allocate firmware nodes of type
IRQCHIP_FWNODE_NAMED or IRQCHIP_FWNODE_NAMED_ID and free them right after
creating the irqdomain. The only purpose of these FW nodes is to convey
name information. When this was introduced the core code did not store the
pointer to the node in the irqdomain. A recent change stored the firmware
node pointer in irqdomain for other reasons and missed to notice that the
usage sites which do the alloc_fwnode/create_domain/free_fwnode sequence
are broken by this. Storing a dangling pointer is dangerous itself, but in
case that the domain is destroyed later on this leads to a double free.
Remove the freeing of the firmware node after creating the irqdomain from
all affected call sites to cure this.
Fixes: 711419e504eb ("irqdomain: Add the missing assignment of domain->fwnode for named fwnode")
Reported-by: Andy Shevchenko <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
While the nmi_enter() users did
trace_hardirqs_{off_prepare,on_finish}() there was no matching
lockdep_hardirqs_*() calls to complete the picture.
Introduce idtentry_{enter,exit}_nmi() to enable proper IRQ state
tracking across the NMIs.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
|
|
exc_alignment_check() fails to disable interrupts before returning to the
entry code.
Fixes: ca4c6a9858c2 ("x86/traps: Make interrupt enable/disable symmetric in C code")
Reported-by: [email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
There are cases where a guest tries to switch spinlocks to bare metal
behavior (e.g. by setting "xen_nopvspin" on XEN platform and
"hv_nopvspin" on HYPER_V).
That feature is missed on KVM, add a new parameter "nopvspin" to disable
PV spinlocks for KVM guest.
The new 'nopvspin' parameter will also replace Xen and Hyper-V specific
parameters in future patches.
Define variable nopvsin as global because it will be used in future
patches as above.
Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krcmar <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Jim Mattson <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
pr_*() is preferred than printk(KERN_* ...), after change all the print
in arch/x86/kernel/kvm.c will have "kvm-guest: xxx" style.
No functional change.
Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krcmar <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Jim Mattson <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
initialized"
This reverts commit 34226b6b70980a8f81fff3c09a2c889f77edeeff.
Commit 8990cac6e5ea ("x86/jump_label: Initialize static branching
early") adds jump_label_init() call in setup_arch() to make static
keys initialized early, so we could use the original simpler code
again.
The similar change for XEN is in commit 090d54bcbc54 ("Revert
"x86/paravirt: Set up the virt_spin_lock_key after static keys get
initialized"")
Signed-off-by: Zhenzhong Duan <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krcmar <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Wanpeng Li <[email protected]>
Cc: Jim Mattson <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
|