|
code and resolve conflicts
Pick up and resolve the NMI entry code changes from the locking tree,
and also pick up the latest two fixes from tip:core/entry.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The comments explicitly explain that the work flags check and handling in
kvm_run_vcpu() is done with preemption and interrupts enabled as KVM
invokes the check again right before entering guest mode with interrupts
disabled which guarantees that the work flags are observed and handled
before VMENTER.
Nevertheless the flag pending check in kvm_run_vcpu() uses the helper
variant which requires interrupts to be disabled triggering an instant
lockdep splat. This was caught in testing before and then not fixed up in
the patch before applying. :(
Use the relaxed and intentionally racy __xfer_to_guest_mode_work_pending()
instead.
Fixes: 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function")
Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/87bljxa2sa.fsf@nanos.tec.linutronix.de
|
|
Resolve conflicts with ongoing lockdep work that fixed the NMI entry code.
Conflicts:
arch/x86/entry/common.c
arch/x86/include/asm/idtentry.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Pick up generic entry code fixes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The original version of that used secure_computing() which has no
arguments. Review requested to switch to __secure_computing() which has
one. The function name was correct, but no argument was added and of course
compiling without SECCOMP was deemed overrated.
Add the missing function argument.
Fixes: 6823ecabf030 ("seccomp: Provide stub for __secure_computing()")
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
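
A minimal userspace sketch of the stub pattern discussed above, not the kernel's actual header. All names ending in _sketch and the SKETCH_HAVE_SECCOMP switch are hypothetical. The point it illustrates is that the stub must take the same argument as the real function, otherwise callers only build when the feature is enabled.

```c
#include <stddef.h>

/* Hypothetical stand-in for the data passed to the seccomp check. */
struct seccomp_data_sketch {
    int nr; /* syscall number */
};

#ifdef SKETCH_HAVE_SECCOMP
/* Feature enabled: the real implementation lives elsewhere. */
int secure_computing_sketch(const struct seccomp_data_sketch *sd);
#else
/*
 * Feature disabled: the stub keeps the SAME one-argument signature
 * and simply reports "allowed" (0).
 */
static inline int secure_computing_sketch(const struct seccomp_data_sketch *sd)
{
    (void)sd;
    return 0;
}
#endif

/* A caller that compiles the same way with or without the feature. */
static int syscall_entry_work_sketch(int nr)
{
    struct seccomp_data_sketch sd = { .nr = nr };

    return secure_computing_sketch(&sd);
}
```

Building the caller once with and once without the switch is the cheap check that would have caught the missing argument.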
|
|
The noinstr attribute is to be specified before the return type in the
same way 'inline' is used.
Similar cases were recently fixed for x86 in commit 7f6fa101dfac ("x86:
Correct noinstr qualifiers"), but the generic entry code was based on the
original version and did not carry the fix over.
Fixes: a5497bab5f72 ("entry: Provide generic interrupt entry/exit code")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200725091951.744848-3-mingo@kernel.org
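
For illustration, a small sketch of the qualifier placement described above. noinstr_sketch is a hypothetical userspace stand-in built from a GCC/Clang attribute; the kernel's noinstr does more than this (it also moves the function into a non-instrumentable text section), which the sketch does not reproduce.

```c
/* Assumption: a stand-in qualifier, only good enough to show placement. */
#define noinstr_sketch __attribute__((noinline))

/*
 * Placement to avoid: qualifier after the return type.
 *
 *     static int noinstr_sketch entry_helper_sketch(int x);
 *
 * Preferred placement: qualifier before the return type, in the same
 * position where 'inline' would go.
 */
static noinstr_sketch int entry_helper_sketch(int x)
{
    return x + 1;
}
```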
|
|
Use the generic infrastructure to check for and handle pending work before
transitioning into guest mode.
This now handles TIF_NOTIFY_RESUME as well which was ignored so
far. Handling it is important as this covers task work and task work will
be used to offload the heavy lifting of POSIX CPU timers to thread context.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.979724969@linutronix.de
|
|
Remove the temporary defines and fixup all references.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.855839271@linutronix.de
|
|
Replace the x86 code with the generic variant. Use temporary defines for
idtentry_* which will be cleaned up in the next step.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.711492752@linutronix.de
|
|
Cleanup the temporary defines and use irqentry_ instead of idtentry_.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.602603691@linutronix.de
|
|
Replace the x86 variant with the generic version. Provide the relevant
architecture specific helper functions and defines.
Use a temporary define for idtentry_exit_user which will be cleaned up
separately.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.494648601@linutronix.de
|
|
Replace the syscall entry work handling with the generic version. Provide
the necessary helper inlines to handle the real architecture specific
parts, e.g. ptrace.
Use a temporary define for idtentry_enter_user which will be cleaned up
separately.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.376213694@linutronix.de
|
|
As a preparatory step for moving the syscall and interrupt entry/exit
handling into generic code, provide a pt_regs helper which retrieves the
interrupt state from pt_regs. This is required to check whether interrupts
are reenabled by return from interrupt/exception.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220520.258511584@linutronix.de
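
A rough sketch of what such a helper can look like on x86, where the interrupt-enable state is bit 9 (IF) of the saved flags register. The struct and the _sketch names are hypothetical simplifications of the kernel's pt_regs, not the kernel's definitions.

```c
#include <stdbool.h>

/* x86 EFLAGS.IF: set when interrupts were enabled in the saved context. */
#define EFLAGS_IF_SKETCH (1UL << 9)

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct pt_regs_sketch {
    unsigned long flags; /* saved flags register of the interrupted context */
};

/* True if the interrupted context had interrupts disabled. */
static inline bool regs_irqs_disabled_sketch(const struct pt_regs_sketch *regs)
{
    return !(regs->flags & EFLAGS_IF_SKETCH);
}
```

Generic exit code can use a check of this shape to decide whether returning from the interrupt or exception will re-enable interrupts.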
|
|
Guests and user space share certain MSRs. KVM sets these MSRs to guest
values once and does not set them back to user space values on every VM
exit to spare the costly MSR operations.
User return notifiers ensure that these MSRs are set back to the correct
values before returning to user space in exit_to_usermode_loop().
There is no reason to evaluate the TIF flag indicating that user return
notifiers need to be invoked in the loop. The important point is that they
are invoked before returning to user space.
Move the invocation out of the loop into the section which does the last
preparatory steps before returning to user space. That section is not
preemptible and runs with interrupts disabled until the actual return.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.159112003@linutronix.de
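
The restructuring described above can be sketched in userspace as follows. The flag names, the task structure and both functions are hypothetical simplifications. The sketch only shows the shape: looping work is re-evaluated until nothing is pending, while the user return notifier runs exactly once afterwards in the final preparation step.

```c
/* Hypothetical work flags; the real TIF bits differ. */
enum {
    TIF_SIG_SKETCH                = 1u << 0,
    TIF_RESCHED_SKETCH            = 1u << 1,
    TIF_USER_RETURN_NOTIFY_SKETCH = 1u << 2,
};

#define LOOP_WORK_SKETCH (TIF_SIG_SKETCH | TIF_RESCHED_SKETCH)

struct task_sketch {
    unsigned int flags;
    int loop_passes;    /* how often the loop body ran */
    int notifier_calls; /* how often user return notifiers fired */
};

/* Runs with interrupts enabled in the real code; may go around again. */
static void exit_to_user_mode_loop_sketch(struct task_sketch *t)
{
    while (t->flags & LOOP_WORK_SKETCH) {
        if (t->flags & TIF_RESCHED_SKETCH)
            t->flags &= ~TIF_RESCHED_SKETCH; /* stands in for schedule() */
        if (t->flags & TIF_SIG_SKETCH)
            t->flags &= ~TIF_SIG_SKETCH;     /* stands in for signal delivery */
        t->loop_passes++;
    }
}

/* Last preparatory steps: not preemptible, interrupts disabled. */
static void exit_to_user_mode_prepare_sketch(struct task_sketch *t)
{
    exit_to_user_mode_loop_sketch(t);

    /* One-time work: invoked once, outside the loop. */
    if (t->flags & TIF_USER_RETURN_NOTIFY_SKETCH) {
        t->flags &= ~TIF_USER_RETURN_NOTIFY_SKETCH;
        t->notifier_calls++;
    }
}
```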
|
|
64bit and 32bit entry code have the same open coded syscall entry handling
after the bitwidth specific bits.
Move it to a helper function and share the code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220520.051234096@linutronix.de
|
|
The user register sanity check is sprinkled all over the place. Move it
into enter_from_user_mode().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.943016204@linutronix.de
|
|
Pick up generic entry code to migrate x86 over.
|
|
Entering a guest is similar to exiting to user space. Pending work like
handling signals, rescheduling, task work etc. needs to be handled before
that.
Provide generic infrastructure to avoid duplication of the same handling
code all over the place.
The transfer to guest mode handling is different from the exit to usermode
handling, e.g. vs. rseq and live patching, so a separate function is used.
The initial list of work items handled is:
TIF_SIGPENDING, TIF_NEED_RESCHED, TIF_NOTIFY_RESUME
Architecture specific TIF flags can be added via defines in the
architecture specific include files.
The calling convention is also different from the syscall/interrupt entry
functions as KVM invokes this from the outer vcpu_run() loop with
interrupts and preemption enabled. To prevent missing a pending work item
it invokes a check for pending TIF work from interrupt disabled code right
before transitioning to guest mode. The lockdep, RCU and tracing state
handling is also done directly around the switch to and from guest mode.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220519.833296398@linutronix.de
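
The calling convention described above can be sketched as a small userspace simulation. Everything here is hypothetical (the structure, the flag values, the -1 return code); the late_work field is only a test hook that models work arriving between the relaxed check and the final check.

```c
#include <stdbool.h>

/* Hypothetical work bits mirroring the initial list of handled items. */
enum {
    WORK_SIGPENDING_SKETCH    = 1u << 0,
    WORK_NEED_RESCHED_SKETCH  = 1u << 1,
    WORK_NOTIFY_RESUME_SKETCH = 1u << 2,
};

#define XFER_WORK_SKETCH (WORK_SIGPENDING_SKETCH | \
                          WORK_NEED_RESCHED_SKETCH | \
                          WORK_NOTIFY_RESUME_SKETCH)

struct vcpu_sketch {
    unsigned int work;      /* currently pending work bits */
    unsigned int late_work; /* bits that race in after the relaxed check */
    int handled;            /* number of work handling passes */
    int guest_entries;      /* number of simulated guest entries */
};

static bool work_pending_sketch(const struct vcpu_sketch *v)
{
    return (v->work & XFER_WORK_SKETCH) != 0;
}

/* Outer-loop handling: interrupts and preemption enabled in the real code. */
static int handle_work_sketch(struct vcpu_sketch *v)
{
    if (v->work & WORK_SIGPENDING_SKETCH)
        return -1; /* stands in for bailing out to user space */

    v->work &= ~(WORK_NEED_RESCHED_SKETCH | WORK_NOTIFY_RESUME_SKETCH);
    v->handled++;
    return 0;
}

static int run_vcpu_sketch(struct vcpu_sketch *v)
{
    for (;;) {
        /* Relaxed, intentionally racy check in the outer loop. */
        if (work_pending_sketch(v)) {
            int ret = handle_work_sketch(v);

            if (ret)
                return ret;
        }

        /* "Interrupts disabled" from here on; model late arrivals. */
        v->work |= v->late_work;
        v->late_work = 0;

        /* Final check right before the guest entry. */
        if (work_pending_sketch(v))
            continue; /* back out and handle the work first */

        v->guest_entries++;
        return 0;
    }
}
```

The first check may miss work that arrives afterwards, which is acceptable only because the second check, made with interrupts disabled, is the one that guards the guest entry.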
|
|
Like the syscall entry/exit code, interrupt/exception entry after the real
low level ASM bits should not be different across architectures.
Provide a generic version based on the x86 code.
irqentry_enter() is called after the low level entry code and
irqentry_exit() must be invoked right before returning to the low level
code which just contains the actual return logic. The code before
irqentry_enter() and irqentry_exit() must not be instrumented. Code after
irqentry_enter() and before irqentry_exit() can be instrumented.
irqentry_enter() invokes irqentry_enter_from_user_mode() if the
interrupt/exception came from user mode. If entered from kernel mode it
handles the kernel mode variant of establishing state for lockdep, RCU and
tracing depending on the kernel context it interrupted (idle, non-idle).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200722220519.723703209@linutronix.de
|
|
Like syscall entry, all architectures have similar and pointlessly different
code to handle pending work before returning from a syscall to user space.
1) One-time syscall exit work:
- rseq syscall exit
- audit
- syscall tracing
- tracehook (single stepping)
2) Preparatory work
- Exit to user mode loop (common TIF handling).
- Architecture specific one time work arch_exit_to_user_mode_prepare()
- Address limit and lockdep checks
3) Final transition (lockdep, tracing, context tracking, RCU). Invokes
arch_exit_to_user_mode() to handle e.g. speculation mitigations
Provide a generic version based on the x86 code which has all the RCU and
instrumentation protections right.
Provide a variant for interrupt return to user mode as well which shares
the above #2 and #3 work items.
After syscall_exit_to_user_mode() and irqentry_exit_to_user_mode() the
architecture code just has to return to user space. The code after
returning from these functions must not be instrumented.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.613977173@linutronix.de
|
|
On syscall entry certain work needs to be done:
- Establish state (lockdep, context tracking, tracing)
- Conditional work (ptrace, seccomp, audit...)
This code is needlessly duplicated and different in all
architectures.
Provide a generic version based on the x86 implementation which has all the
RCU and instrumentation bits right.
As interrupt/exception entry from user space needs parts of the same
functionality, provide a function for this as well.
syscall_enter_from_user_mode() and irqentry_enter_from_user_mode() must be
called right after the low level ASM entry. The calling code must be
non-instrumentable. After the functions return, state is correct and the
subsequent functions can be instrumented.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.513463269@linutronix.de
|
|
To avoid #ifdeffery in the upcoming generic syscall entry work code provide
a stub for __secure_computing() as this is preferred over
secure_computing() because the TIF flag is already evaluated.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200722220519.404974280@linutronix.de
|
|
The noinstr qualifier is to be specified before the return type in the
same way inline is used.
These 2 cases were missed by previous patches.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200723161405.852613-1-ira.weiny@intel.com
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into master
Pull perf tooling fixes from Arnaldo Carvalho de Melo:
- Update hashmap.h from libbpf and kvm.h from x86's kernel UAPI.
- Set opt->set in libsubcmd's OPT_CALLBACK_SET(). This fixes
'perf record --switch-output-event event-name' usage
* tag 'perf-tools-fixes-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
tools arch kvm: Sync kvm headers with the kernel sources
perf tools: Sync hashmap.h with libbpf's
libsubcmd: Fix OPT_CALLBACK_SET()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull x86 fixes from Thomas Gleixner:
"A pile of fixes for x86:
- Fix the I/O bitmap invalidation on XEN PV, which was overlooked in
the recent ioperm/iopl rework. This caused the TSS and XEN's I/O
bitmap to get out of sync.
- Use the proper vectors for HYPERV.
- Make disabling of stack protector for the entry code work with GCC
builds which enable stack protector by default. Removing the option
is not sufficient, it needs an explicit -fno-stack-protector to
shut it off.
- Mark check_user_regs() noinstr as it is called from noinstr code.
The missing annotation causes it to be placed in the text section
which makes it instrumentable.
- Add the missing interrupt disable in exc_alignment_check()
- Fixup a XEN_PV build dependency in the 32bit entry code
- A few fixes to make the Clang integrated assembler happy
- Move EFI stub build to the right place for out of tree builds
- Make prepare_exit_to_usermode() static. It is no longer called from
ASM code"
* tag 'x86-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Don't add the EFI stub to targets
x86/entry: Actually disable stack protector
x86/ioperm: Fix io bitmap invalidation on Xen PV
x86: math-emu: Fix up 'cmp' insn for clang ias
x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV
x86/entry: Add compatibility with IAS
x86/entry/common: Make prepare_exit_to_usermode() static
x86/entry: Mark check_user_regs() noinstr
x86/traps: Disable interrupts in exc_aligment_check()
x86/entry/32: Fix XEN_PV build dependency
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull timer fixes from Thomas Gleixner:
"Two fixes for the timer wheel:
- A timer which is already expired at enqueue time can set the
base->next_expiry value backwards. As a consequence base->clk can
be set back as well. This can lead to timers expiring early. Add a
sanity check to prevent this.
- When a timer is queued with an expiry time beyond the wheel
capacity then it should be queued in the bucket of the last wheel
level which is expiring last.
The code adjusted the expiry time to the maximum wheel capacity,
which is only correct when the wheel clock is 0. Aside from that the
check whether the delta is larger than wheel capacity does not
check the delta, it checks the expiry value itself. As a result
timers can expire at random.
Fix this by checking the right variable and adjust expiry time so
it becomes base->clock plus capacity which places it into the
outermost bucket in the last wheel level"
* tag 'timers-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timer: Fix wheel index calculation on last level
timer: Prevent base->clk from moving backward
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull scheduler fixes from Thomas Gleixner:
"A set of scheduler fixes:
- Plug a load average accounting race which was introduced with a
recent optimization causing load average to show bogus numbers.
- Fix the rseq CPU id initialization for new tasks. sched_fork() does
not update the rseq CPU id so the id is the stale id of the parent
task, which can cause user space data corruption.
- Handle a 0 return value of task_h_load() correctly in the load
balancer, which does not decrease imbalance and therefore pulls
until the maximum number of loops is reached, which might be all
tasks just created by a fork bomb"
* tag 'sched-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/fair: handle case of task_h_load() returning 0
sched: Fix unreliable rseq cpu_id for new tasks
sched: Fix loadavg accounting race
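
To illustrate the task_h_load() item above, here is a strongly simplified sketch of a pull loop. The function, its parameters and the clamp switch are hypothetical, and the real load balancer has many more conditions. The sketch assumes the fix amounts to treating a load of 0 as at least 1, so that every pulled task reduces the remaining imbalance.

```c
/*
 * Pull tasks until the imbalance is gone or the list is exhausted.
 * loads[i] models the per-task load value; returns the number of
 * tasks pulled.
 */
static int detach_tasks_sketch(const unsigned long *loads, int ntasks,
                               long imbalance, int clamp_to_one)
{
    int pulled = 0;

    for (int i = 0; i < ntasks && imbalance > 0; i++) {
        unsigned long load = loads[i];

        /* Assumed fix: never account a task as costing nothing. */
        if (clamp_to_one && load == 0)
            load = 1;

        imbalance -= (long)load;
        pulled++;
    }

    return pulled;
}
```

With eight zero-load tasks and an imbalance of 3, the unclamped variant pulls all eight because the imbalance never shrinks, while the clamped variant stops after three.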
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull irq fixes from Thomas Gleixner:
"Two fixes for the interrupt subsystem:
- Make the handling of the firmware node consistent and do not free
the node after the domain has been created successfully. The core
code stores a pointer to it which can lead to a use after free or
double free.
This used to "work" because the pointer was not stored when the
initial code was written, but at some point later it was required
to store it. Of course nobody noticed that the existing users break
that way.
- Handle affinity setting on inactive interrupts correctly when
hierarchical irq domains are enabled.
When interrupts are inactive with the modern hierarchical irqdomain
design, the interrupt chips are not necessarily in a state where
affinity changes can be handled. The legacy irq chip design allowed
this because interrupts are immediately fully initialized at
allocation time. X86 has a hacky workaround for this, but other
implementations do not.
This caused a malfunction on GIC-V3. Instead of playing whack a mole
to find all affected drivers, change the core code to store the
requested affinity setting and then establish it when the interrupt
is allocated, which makes the X86 hack go away"
* tag 'irq-urgent-2020-07-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/affinity: Handle affinity setting on inactive interrupts correctly
irqdomain/treewide: Keep firmware node unconditionally allocated
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb into master
Pull USB fixes from Greg KH:
"Here are a few small USB fixes, and one thunderbolt fix, for 5.8-rc6.
Nothing huge in here, just the normal collection of gadget, dwc2/3,
serial, and other minor USB driver fixes and id additions. Full
details are in the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: iuu_phoenix: fix memory corruption
USB: c67x00: fix use after free in c67x00_giveback_urb
usb: gadget: function: fix missing spinlock in f_uac1_legacy
usb: gadget: udc: atmel: fix uninitialized read in debug printk
usb: gadget: udc: atmel: remove outdated comment in usba_ep_disable()
usb: dwc2: Fix shutdown callback in platform
usb: cdns3: trace: fix some endian issues
usb: cdns3: ep0: fix some endian issues
usb: gadget: udc: gr_udc: fix memleak on error handling path in gr_ep_init()
usb: gadget: fix langid kernel-doc warning in usbstring.c
usb: dwc3: pci: add support for the Intel Jasper Lake
usb: dwc3: pci: add support for the Intel Tiger Lake PCH -H variant
usb: chipidea: core: add wakeup support for extcon
USB: serial: option: add Quectel EG95 LTE modem
thunderbolt: Fix path indices used in USB3 tunnel discovery
USB: serial: ch341: add new Product ID for CH340
USB: serial: option: add GosunCn GM500 series
USB: serial: cypress_m8: enable Simply Automated UPB PIM
|
|
git://git.infradead.org/users/hch/dma-mapping into master
Pull dma-mapping fixes from Christoph Hellwig:
"Ensure we always have fully addressable memory in the dma coherent
pool (Nicolas Saenz Julienne)"
* tag 'dma-mapping-5.8-6' of git://git.infradead.org/users/hch/dma-mapping:
dma-pool: do not allocate pool memory from CMA
dma-pool: make sure atomic pool suits device
dma-pool: introduce dma_guess_pool()
dma-pool: get rid of dma_in_atomic_pool()
dma-direct: provide function to check physical memory area validity
|
|
vmlinux-objs-y is added to targets, which currently means that the EFI
stub gets added to the targets as well. It shouldn't be added since it
is built elsewhere.
This confuses Makefile.build which interprets the EFI stub as a target
$(obj)/$(objtree)/drivers/firmware/efi/libstub/lib.a
and will create drivers/firmware/efi/libstub/ underneath
arch/x86/boot/compressed, to hold this supposed target, if building
out-of-tree. [0]
Fix this by pulling the stub out of vmlinux-objs-y into efi-obj-y.
[0] See scripts/Makefile.build near the end:
# Create directories for object files if they do not exist
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200715032631.1562882-1-nivedita@alum.mit.edu
|
|
Some builds of GCC enable stack protector by default. Simply removing
the arguments is not sufficient to disable stack protector, as the stack
protector for those GCC builds must be explicitly disabled. Remove the
argument removals and add -fno-stack-protector. Additionally include
missed x32 argument updates, and adjust whitespace for readability.
Fixes: 20355e5f73a7 ("x86/entry: Exclude low level entry code from sanitizing")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/202006261333.585319CA6B@keescook
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi into master
Pull SCSI fix from James Bottomley:
"One small driver fix. Although the one liner makes it sound like a
cosmetic change, it's a regression fix for the megaraid_sas driver"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: megaraid_sas: Remove undefined ENABLE_IRQ_POLL macro
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging into master
Pull hwmon fixes from Guenter Roeck:
- Using SCT on some Toshiba drives causes firmware hangs. Disable its
use in the drivetemp driver.
- Handle potential buffer overflows in scmi and aspeed-pwm-tacho
driver.
- Energy reporting does not work well on all AMD CPUs. Restrict
amd_energy to known working models.
- Enable reading the CPU temperature on NCT6798D using undocumented
registers.
- Fix read errors seen if PEC is enabled in adm1275 driver.
- Fix setting the pwm1_enable in emc2103 driver.
* tag 'hwmon-for-v5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (drivetemp) Avoid SCT usage on Toshiba DT01ACA family drives
hwmon: (scmi) Fix potential buffer overflow in scmi_hwmon_probe()
hwmon: (nct6775) Accept PECI Calibration as temperature source for NCT6798D
hwmon: (adm1275) Make sure we are reading enough data for different chips
hwmon: (emc2103) fix unable to change fan pwm1_enable attribute
hwmon: (amd_energy) match for supported models
hwmon: (aspeed-pwm-tacho) Avoid possible buffer overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into master
Pull RISC-V fixes from Palmer Dabbelt:
"Two fixes:
- 16KiB kernel stacks on rv64, which fixes a lot of crashes.
- Rolling an mmiowb() into the scheduler, which when combined with
Will's fix to the mmiowb()-on-spinlock should fix the PREEMPT
issues we've been seeing"
* tag 'riscv-for-linus-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Upgrade smp_mb__after_spinlock() to iorw,iorw
riscv: use 16KB kernel stack on 64-bit
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux into master
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.8:
- A fix to the VAS code we merged this cycle, to report the proper
error code to userspace for address translation failures. And a
selftest update to match.
- Another fix for our pkey handling of PROT_EXEC mappings.
- A fix for a crash when booting a "secure VM" under an ultravisor
with certain numbers of CPUs.
Thanks to: Aneesh Kumar K.V, Haren Myneni, Laurent Dufour, Sandipan
Das, Satheesh Rajendran, Thiago Jung Bauermann"
* tag 'powerpc-5.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
selftests/powerpc: Use proper error code to check fault address
powerpc/vas: Report proper error code for address translation failure
powerpc/pseries/svm: Fix incorrect check for shared_lppaca_size
powerpc/book3s64/pkeys: Fix pkey_access_permitted() for execute disable pkey
|
|
It has been observed that Toshiba DT01ACA family drives have
WRITE FPDMA QUEUED command timeouts and sometimes just freeze until
power-cycled under heavy write loads when their temperature is getting
polled in SCT mode. The SMART mode seems to be fine, though.
Let's make sure we don't use SCT mode for these drives then.
While only the 3 TB model was actually caught exhibiting the problem let's
play safe here to avoid data corruption and extend the ban to the whole
family.
Fixes: 5b46903d8bf3 ("hwmon: Driver for disk and solid state drives with temperature sensors")
Cc: stable@vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://lore.kernel.org/r/0cb2e7022b66c6d21d3f189a12a97878d0e7511b.1595075458.git.mail@maciej.szmigiero.name
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
tss_invalidate_io_bitmap() wasn't wired up properly through the pvop
machinery, so the TSS and Xen's io bitmap would get out of sync
whenever disabling a valid io bitmap.
Add a new pvop for tss_invalidate_io_bitmap() to fix it.
This is XSA-329.
Fixes: 22fe5b0439dd ("x86/ioperm: Move TSS bitmap update to exit to user work")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/d53075590e1f91c19f8af705059d3ff99424c020.1595030016.git.luto@kernel.org
|
|
into master
Pull NFS client fixes from Anna Schumaker:
"A few more NFS client bugfixes for Linux 5.8:
NFS:
- Fix interrupted slots by using the SEQUENCE operation
SUNRPC:
- revert d03727b248d0 to fix unkillable IOs
xprtrdma:
- Fix double-free in rpcrdma_ep_create()
- Fix recursion into rpcrdma_xprt_disconnect()
- Fix return code from rpcrdma_xprt_connect()
- Fix handling of connect errors
- Fix incorrect header size calculations"
* tag 'nfs-for-5.8-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
SUNRPC reverting d03727b248d0 ("NFSv4 fix CLOSE not waiting for direct IO compeletion")
xprtrdma: fix incorrect header size calculations
NFS: Fix interrupted slots by sending a solo SEQUENCE operation
xprtrdma: Fix handling of connect errors
xprtrdma: Fix return code from rpcrdma_xprt_connect()
xprtrdma: Fix recursion into rpcrdma_xprt_disconnect()
xprtrdma: Fix double-free in rpcrdma_ep_create()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc into master
Pull ARM SoC fixes from Arnd Bergmann:
"This time there are a number of actual code fixes, plus a small set of
device tree issues getting addressed:
Renesas:
- one defconfig cleanup to allow a later Kconfig change
Intel socfpga:
- enable QSPI devices on some machines
- fix DTC validation warnings
TI OMAP:
- Two DEBUG_ATOMIC_SLEEP fixes for ti-sysc interconnect target
module driver
- A regression fix for ti-sysc no-idle handling that caused issues
compared to earlier platform data based booting
- A fix for memory leak for omap_hwmod_allocate_module
- Fix d_can driver probe for am437x
NXP i.MX:
- A couple of fixes on i.MX platform device registration code to
stop the use of invalid IRQ 0.
- Fix a regression seen on ls1021a platform, caused by commit
52102a3ba6a61 ("soc: imx: move cpu code to drivers/soc/imx").
- Fix a misconfiguration of audio SSI on imx6qdl-gw551x board.
Amlogic Meson:
- misc DT fixes
- SoC ID fixes to detect all chips correctly"
* tag 'arm-fixes-5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
arm64: dts: spcfpga: Align GIC, NAND and UART nodenames with dtschema
ARM: dts: socfpga: Align L2 cache-controller nodename with dtschema
arm64: dts: stratix10: increase QSPI reg address in nand dts file
arm64: dts: stratix10: add status to qspi dts node
arm64: dts: agilex: add status to qspi dts node
ARM: dts: Fix dcan driver probe failed on am437x platform
ARM: OMAP2+: Fix possible memory leak in omap_hwmod_allocate_module
arm64: defconfig: Enable CONFIG_PCIE_RCAR_HOST
soc: imx: check ls1021a
ARM: imx: Remove imx_add_imx_dma() unused irq_err argument
ARM: imx: Provide correct number of resources when registering gpio devices
ARM: dts: imx6qdl-gw551x: fix audio SSI
bus: ti-sysc: Do not disable on suspend for no-idle
bus: ti-sysc: Fix sleeping function called from invalid context for RTC quirk
bus: ti-sysc: Fix wakeirq sleeping function called from invalid context
ARM: dts: meson: Align L2 cache-controller nodename with dtschema
arm64: dts: meson-gxl-s805x: reduce initial Mali450 core frequency
arm64: dts: meson: add missing gxl rng clock
soc: amlogic: meson-gx-socinfo: Fix S905X3 and S905D3 ID's
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux into master
Pull arm64 fixes from Will Deacon:
"A batch of arm64 fixes.
Although the diffstat is a bit larger than we'd usually have at this
stage, a decent amount of it is the addition of comments describing
our syscall tracing behaviour, and also a sweep across all the modular
arm64 PMU drivers to make them robust against unloading and unbinding.
There are a couple of minor things kicking around at the moment (CPU
errata and module PLTs for very large modules), but I'm not expecting
any significant changes now for us in 5.8.
- Fix kernel text addresses for relocatable images booting using EFI
and with KASLR disabled so that they match the vmlinux ELF binary.
- Fix unloading and unbinding of PMU driver modules.
- Fix generic mmiowb() when writeX() is called from preemptible
context (reported by the riscv folks).
- Fix ptrace hardware single-step interactions with signal handlers,
system calls and reverse debugging.
- Fix reporting of 64-bit x0 register for 32-bit tasks via
'perf_regs'.
- Add comments describing syscall entry/exit tracing ABI"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
drivers/perf: Prevent forced unbinding of PMU drivers
asm-generic/mmiowb: Allow mmiowb_set_pending() when preemptible()
arm64: Use test_tsk_thread_flag() for checking TIF_SINGLESTEP
arm64: ptrace: Use NO_SYSCALL instead of -1 in syscall_trace_enter()
arm64: syscall: Expand the comment about ptrace and syscall(-1)
arm64: ptrace: Add a comment describing our syscall entry/exit trap ABI
arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return
arm64: ptrace: Override SPSR.SS when single-stepping is enabled
arm64: ptrace: Consistently use pseudo-singlestep exceptions
drivers/perf: Fix kernel panic when rmmod PMU modules during perf sampling
efi/libstub/arm64: Retain 2MB kernel Image alignment if !KASLR
|
|
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.
X86 has a hacky workaround for that, but no other irq chip has one, which
causes problems e.g. on GIC V3 ITS.
Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:
- Store it in the irq descriptors affinity mask
- Update the effective affinity to reflect that so user space has
a consistent view
- Don't call into the irq chip driver
This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.
Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.
For hierarchical irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.
Remove the X86 workaround as it is no longer required.
Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi@amazon.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ali Saidi <alisaidi@amazon.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
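
The three steps listed above can be sketched as a userspace model. The structure, its fields and all function names are hypothetical and do not correspond to the kernel's irq descriptor layout; affinity masks are modelled as plain bitmasks.

```c
#include <stdbool.h>

struct irq_sketch {
    bool activated;
    unsigned long affinity;    /* requested mask stored in the descriptor */
    unsigned long effective;   /* mask reported to user space */
    unsigned long hw_affinity; /* mask the "chip" was last programmed with */
    int chip_calls;            /* how often the chip driver was called */
};

/* Stands in for the irq chip driver's affinity callback. */
static void chip_set_affinity_sketch(struct irq_sketch *irq, unsigned long mask)
{
    irq->hw_affinity = mask;
    irq->chip_calls++;
}

static void irq_set_affinity_sketch(struct irq_sketch *irq, unsigned long mask)
{
    /* Always store the request and keep the user space view consistent. */
    irq->affinity = mask;
    irq->effective = mask;

    /* Inactive interrupt: do not call into the chip driver. */
    if (!irq->activated)
        return;

    chip_set_affinity_sketch(irq, mask);
}

/* On activation the stored affinity is established in the chip. */
static void irq_activate_sketch(struct irq_sketch *irq)
{
    irq->activated = true;
    chip_set_affinity_sketch(irq, irq->affinity);
}
```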
|
|
When an expiration delta falls into the last level of the wheel, that delta
has to be compared against the maximum possible delay and reduced to fit in if
necessary.
However instead of comparing the delta against the maximum, the code
compares the actual expiry against the maximum. Then instead of fixing the
delta to fit in, it sets the maximum delta as the expiry value.
This can result in various undesired outcomes, the worst possible one
being a timer expiring 15 days ahead to fire immediately.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200717140551.29076-2-frederic@kernel.org
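
A worked sketch of the difference between the two checks, for a timer that has already been sorted into the last wheel level. The capacity constant and function names are hypothetical and much smaller than the real wheel limits; only the comparison and the clamping arithmetic are the point.

```c
/* Hypothetical wheel capacity in clock ticks. */
#define WHEEL_CAPACITY_SKETCH 1000UL

/*
 * Old behaviour as described: compares the absolute expiry against the
 * capacity and replaces it with the capacity itself. Only correct when
 * the wheel clock is 0.
 */
static unsigned long clamp_expiry_buggy_sketch(unsigned long clk,
                                               unsigned long expires)
{
    (void)clk;
    if (expires >= WHEEL_CAPACITY_SKETCH)
        expires = WHEEL_CAPACITY_SKETCH;
    return expires;
}

/*
 * Fixed behaviour as described: compares the delta against the capacity
 * and clamps the expiry to clock plus capacity.
 */
static unsigned long clamp_expiry_fixed_sketch(unsigned long clk,
                                               unsigned long expires)
{
    unsigned long delta = expires - clk;

    if (delta >= WHEEL_CAPACITY_SKETCH)
        expires = clk + WHEEL_CAPACITY_SKETCH;
    return expires;
}
```

With the clock at 5000 and an expiry of 5600, the old check yields 1000, which lies in the past and makes the timer fire immediately, while the fixed check leaves 5600 untouched. An expiry of 9000 is clamped to 6000 by the fixed check.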
|
|
compeletion")
Reverting commit d03727b248d0 "NFSv4 fix CLOSE not waiting for
direct IO compeletion". This patch made it so that fput(), by calling
inode_dio_done() in nfs_file_release(), would wait uninterruptibly
for any outstanding directIO to the file (but that wait on IO should
be killable).
The problem the patch was also trying to address, REMOVE returning
ERR_ACCESS because the file is still open, is supposed to be resolved
by the server returning ERR_FILE_OPEN and not ERR_ACCESS.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
|
|
master
Pull io_uring fix from Jens Axboe:
"Fix for a case where, with automatic buffer selection, we can leak the
buffer descriptor for recvmsg"
* tag 'io_uring-5.8-2020-07-17' of git://git.kernel.dk/linux-block:
io_uring: fix recvmsg memory leak with buffer selection
|
|
Pull block fix from Jens Axboe:
"Single NVMe multipath capacity fix"
* tag 'block-5.8-2020-07-17' of git://git.kernel.dk/linux-block:
nvme: explicitly update mpath disk capacity on revalidation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse into master
Pull fuse fixes from Miklos Szeredi:
- two regressions in this cycle caused by the conversion of writepage
list to an rb_tree
- two regressions in v5.4 caused by the conversion to the new mount API
- saner behavior of fsconfig(2) for the reconfigure case
- an ancient issue with FS_IOC_{GET,SET}FLAGS ioctls
* tag 'fuse-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Fix parameter for FS_IOC_{GET,SET}FLAGS
fuse: don't ignore errors from fuse_writepages_fill()
fuse: clean up condition for writepage sending
fuse: reject options on reconfigure via fsconfig(2)
fuse: ignore 'data' argument of mount(..., MS_REMOUNT)
fuse: use ->reconfigure() instead of ->remount_fs()
fuse: fix warning in tree_insert() and clean up writepage insertion
fuse: move rb_erase() before tree_insert()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into master
Pull overlayfs fixes from Miklos Szeredi:
- fix a regression introduced in v4.20 in handling a regenerated
squashfs lower layer
- two regression fixes for this cycle, one of which is Oops inducing
- miscellaneous issues
* tag 'ovl-fixes-5.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix lookup of indexed hardlinks with metacopy
ovl: fix unneeded call to ovl_change_flags()
ovl: fix mount option checks for nfs_export with no upperdir
ovl: force read-only sb on failure to create index dir
ovl: fix regression with re-formatted lower squashfs
ovl: fix oops in ovl_indexdir_cleanup() with nfs_export=on
ovl: relax WARN_ON() when decoding lower directory file handle
ovl: remove not used argument in ovl_check_origin
ovl: change ovl_copy_up_flags static
ovl: inode reference leak in ovl_is_inuse true case.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi into master
Pull spi fixes from Mark Brown:
"A couple of small driver specific fixes for fairly minor issues"
* tag 'spi-fix-v5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-sun6i: sun6i_spi_transfer_one(): fix setting of clock rate
spi: mediatek: use correct SPI_CFG2_REG MACRO
|