aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/traps.c
AgeCommit message (Collapse)AuthorFilesLines
2020-12-22x86/split-lock: Avoid returning with interrupts enabledAndi Kleen1-1/+2
When a split lock is detected always make sure to disable interrupts before returning from the trap handler. The kernel exit code assumes that all exits run with interrupts disabled, otherwise the SWAPGS sequence can race against interrupts and cause recursing page faults and later panics. The problem will only happen on CPUs with split lock disable functionality, so Icelake Server, Tiger Lake, Snow Ridge, Jacobsville. Fixes: ca4c6a9858c2 ("x86/traps: Make interrupt enable/disable symmetric in C code") Fixes: bce9b042ec73 ("x86/traps: Disable interrupts in exc_aligment_check()") # v5.8+ Signed-off-by: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Tony Luck <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-14Merge tag 'core-entry-2020-12-14' of ↵Linus Torvalds1-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core entry/exit updates from Thomas Gleixner: "A set of updates for entry/exit handling: - More generalization of entry/exit functionality - The consolidation work to reclaim TIF flags on x86 and also for non-x86 specific TIF flags which are solely relevant for syscall related work and have been moved into their own storage space. The x86 specific part had to be merged in to avoid a major conflict. - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal delivery mode of task work and results in an impressive performance improvement for io_uring. The non-x86 consolidation of this is going to come seperate via Jens. - The selective syscall redirection facility which provides a clean and efficient way to support the non-Linux syscalls of WINE by catching them at syscall entry and redirecting them to the user space emulation. This can be utilized for other purposes as well and has been designed carefully to avoid overhead for the regular fastpath. This includes the core changes and the x86 support code. - Simplification of the context tracking entry/exit handling for the users of the generic entry code which guarantee the proper ordering and protection. - Preparatory changes to make the generic entry code accomodate S390 specific requirements which are mostly related to their syscall restart mechanism" * tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) entry: Add syscall_exit_to_user_mode_work() entry: Add exit_to_user_mode() wrapper entry_Add_enter_from_user_mode_wrapper entry: Rename exit_to_user_mode() entry: Rename enter_from_user_mode() docs: Document Syscall User Dispatch selftests: Add benchmark for syscall user dispatch selftests: Add kselftest for syscall user dispatch entry: Support Syscall User Dispatch on common syscall entry kernel: Implement selective syscall userspace redirection signal: Expose SYS_USER_DISPATCH si_code type x86: vdso: Expose sigreturn address on vdso to the kernel MAINTAINERS: Add entry for common entry code entry: Fix boot for !CONFIG_GENERIC_ENTRY x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK context_tracking: Only define schedule_user() on !HAVE_CONTEXT_TRACKING_OFFSTACK archs sched: Detect call to schedule from critical entry code context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK x86: Reclaim unused x86 TI flags ...
2020-11-18x86/traps: Attempt to fixup exceptions in vDSO before signalingSean Christopherson1-0/+10
vDSO functions can now leverage an exception fixup mechanism similar to kernel exception fixup. For vDSO exception fixup, the initial user is Intel's Software Guard Extensions (SGX), which will wrap the low-level transitions to/from the enclave, i.e. EENTER and ERESUME instructions, in a vDSO function and leverage fixup to intercept exceptions that would otherwise generate a signal. This allows the vDSO wrapper to return the fault information directly to its caller, obviating the need for SGX applications and libraries to juggle signal handlers. Attempt to fixup vDSO exceptions immediately prior to populating and sending signal information. Except for the delivery mechanism, an exception in a vDSO function should be treated like any other exception in userspace, e.g. any fault that is successfully handled by the kernel should not be directly visible to userspace. Although it's debatable whether or not all exceptions are of interest to enclaves, defer to the vDSO fixup to decide whether to do fixup or generate a signal. Future users of vDSO fixup, if there ever are any, will undoubtedly have different requirements than SGX enclaves, e.g. the fixup vs. signal logic can be made function specific if/when necessary. Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Jethro Beekman <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-11-04x86/entry: Move nmi entry/exit into common codeThomas Gleixner1-6/+7
Lockdep state handling on NMI enter and exit is nothing specific to X86. It's not any different on other architectures. Also the extra state type is not necessary, irqentry_state_t can carry the necessary information as well. Move it to common code and extend irqentry_state_t to carry lockdep state. [ Ira: Make exit_rcu and lockdep a union as they are mutually exclusive between the IRQ and NMI exceptions, and add kernel documentation for struct irqentry_state_t ] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-27x86/debug: Fix DR_STEP vs ptrace_get_debugreg(6)Peter Zijlstra1-3/+6
Commit d53d9bc0cf78 ("x86/debug: Change thread.debugreg6 to thread.virtual_dr6") changed the semantics of the variable from random collection of bits, to exactly only those bits that ptrace() needs. Unfortunately this lost DR_STEP for PTRACE_{BLOCK,SINGLE}STEP. Furthermore, it turns out that userspace expects DR_STEP to be unconditionally available, even for manual TF usage outside of PTRACE_{BLOCK,SINGLE}_STEP. Fixes: d53d9bc0cf78 ("x86/debug: Change thread.debugreg6 to thread.virtual_dr6") Reported-by: Kyle Huey <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kyle Huey <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-27x86/debug: Only clear/set ->virtual_dr6 for userspace #DBPeter Zijlstra1-6/+6
The ->virtual_dr6 is the value used by ptrace_{get,set}_debugreg(6). A kernel #DB clearing it could mean spurious malfunction of ptrace() expectations. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kyle Huey <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-27x86/debug: Fix BTF handlingPeter Zijlstra1-7/+21
The SDM states that #DB clears DEBUGCTLMSR_BTF, this means that when the bit is set for userspace (TIF_BLOCKSTEP) and a kernel #DB happens first, the BTF bit meant for userspace execution is lost. Have the kernel #DB handler restore the BTF bit when it was requested for userspace. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kyle Huey <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-14Merge tag 'x86_seves_for_v5.10' of ↵Linus Torvalds1-0/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV-ES support from Borislav Petkov: "SEV-ES enhances the current guest memory encryption support called SEV by also encrypting the guest register state, making the registers inaccessible to the hypervisor by en-/decrypting them on world switches. Thus, it adds additional protection to Linux guests against exfiltration, control flow and rollback attacks. With SEV-ES, the guest is in full control of what registers the hypervisor can access. This is provided by a guest-host exchange mechanism based on a new exception vector called VMM Communication Exception (#VC), a new instruction called VMGEXIT and a shared Guest-Host Communication Block which is a decrypted page shared between the guest and the hypervisor. Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest so in order for that exception mechanism to work, the early x86 init code needed to be made able to handle exceptions, which, in itself, brings a bunch of very nice cleanups and improvements to the early boot code like an early page fault handler, allowing for on-demand building of the identity mapping. With that, !KASLR configurations do not use the EFI page table anymore but switch to a kernel-controlled one. The main part of this series adds the support for that new exchange mechanism. The goal has been to keep this as much as possibly separate from the core x86 code by concentrating the machinery in two SEV-ES-specific files: arch/x86/kernel/sev-es-shared.c arch/x86/kernel/sev-es.c Other interaction with core x86 code has been kept at minimum and behind static keys to minimize the performance impact on !SEV-ES setups. Work by Joerg Roedel and Thomas Lendacky and others" * tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits) x86/sev-es: Use GHCB accessor for setting the MMIO scratch buffer x86/sev-es: Check required CPU features for SEV-ES x86/efi: Add GHCB mappings when SEV-ES is active x86/sev-es: Handle NMI State x86/sev-es: Support CPU offline/online x86/head/64: Don't call verify_cpu() on starting APs x86/smpboot: Load TSS and getcpu GDT entry before loading IDT x86/realmode: Setup AP jump table x86/realmode: Add SEV-ES specific trampoline entry point x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES x86/sev-es: Handle #DB Events x86/sev-es: Handle #AC Events x86/sev-es: Handle VMMCALL Events x86/sev-es: Handle MWAIT/MWAITX Events x86/sev-es: Handle MONITOR/MONITORX Events x86/sev-es: Handle INVD Events x86/sev-es: Handle RDPMC Events x86/sev-es: Handle RDTSC(P) Events ...
2020-10-13Merge tag 'x86_urgent_for_v5.10-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix the #DE oops message string format which confused tools parsing crash information (Thomas Gleixner) - Remove an unused variable in the UV5 code which was triggering a build warning with clang (Mike Travis) * tag 'x86_urgent_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Remove unused variable in UV5 NMI handler x86/traps: Fix #DE Oops message regression
2020-10-13x86/traps: Fix #DE Oops message regressionThomas Gleixner1-1/+1
The conversion of #DE to the idtentry mechanism introduced a change in the Ooops message which confuses tools which parse crash information in dmesg. Remove the underscore from 'divide_error' to restore previous behaviour. Fixes: 9d06c4027f21 ("x86/entry: Convert Divide Error to IDTENTRY") Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/CACT4Y+bTZFkuZd7+bPArowOv-7Die+WZpfOWnEO_Wgs3U59+oA@mail.gmail.com
2020-09-09x86/entry/64: Add entry code for #VC handlerJoerg Roedel1-0/+45
The #VC handler needs special entry code because: 1. It runs on an IST stack 2. It needs to be able to handle nested #VC exceptions To make this work, the entry code is implemented to pretend it doesn't use an IST stack. When entered from user-mode or early SYSCALL entry path it switches to the task stack. If entered from kernel-mode it tries to switch back to the previous stack in the IRET frame. The stack found in the IRET frame is validated first, and if it is not safe to use it for the #VC handler, the code will switch to a fall-back stack (the #VC2 IST stack). From there, it can cause nested exceptions again. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-09x86/sev-es: Setup per-CPU GHCBs for the runtime handlerTom Lendacky1-0/+3
The runtime handler needs one GHCB per-CPU. Set them up and map them unencrypted. [ bp: Touchups and simplification. ] Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-04x86/debug: Change thread.debugreg6 to thread.virtual_dr6Peter Zijlstra1-9/+16
Current usage of thread.debugreg6 is convoluted at best. It starts life as a copy of the hardware DR6 value, but then various bits are cleared and set. Replace this with a new variable thread.virtual_dr6 that is initialized to 0 when DR6 is read and only gains bits, at the same time the actual (on stack) dr6 value which is read from the hardware only gets bits cleared. Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Support negative polarity DR6 bitsPeter Zijlstra1-3/+2
DR6 has a whole bunch of bits that have negative polarity; they were architecturally reserved and defined to be 1 and are now getting used. Since they're 1 by default, 0 becomes the signal value. Handle this by xor'ing the read DR6 value by the reserved mask, this will flip them around such that 1 is the signal value (positive polarity). Current Linux doesn't yet support any of these bits, but there's two defined: - DR6[11] Bus Lock Debug Exception (ISEr39) - DR6[16] Restricted Transactional Memory (SDM) Update ptrace_{set,get}_debugreg() to provide/consume the value in architectural polarity. Although afaict ptrace_set_debugreg(6) is pointless, the value is not consumed anywhere. Change hw_breakpoint_restore() to alway write the DR6_RESERVED value to DR6, again, no consumer for that write. Suggested-by: Andrew Cooper <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Remove the historical junkPeter Zijlstra1-11/+12
Remove the historical junk and replace it with a WARN and a comment. The problem is that even though the kernel only uses TF single-step in kprobes and KGDB, both of which consume the event before this, QEMU/KVM has bugs in this area that can trigger this state so it has to be dealt with. Suggested-by: Brian Gerst <[email protected]> Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Move cond_local_irq_enable() block into exc_debug_user()Peter Zijlstra1-29/+29
The cond_local_irq_enable() block, dealing with vm86 and sending signals is only relevant for #DB-from-user, move it there. This then reduces handle_debug() to only the notifier call, so rename it to notify_debug(). Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Move historical SYSENTER junk into exc_debug_kernel()Peter Zijlstra1-24/+25
The historical SYSENTER junk is explicitly for from-kernel, so move it to the #DB-from-kernel handler. It is ordered after the notifier, which is important for KGDB which uses TF single-step and needs to consume the event before that point. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Simplify #DB signal codePeter Zijlstra1-6/+9
There's no point in calculating si_code if it's not going to be used. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Remove handle_debug(.user) argumentPeter Zijlstra1-11/+10
The handle_debug(.user) argument is used to terminate the #DB handler early for the INT1-from-kernel case, since the kernel doesn't use INT1. Remove the argument and handle this explicitly in #DB-from-kernel. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Move kprobe_debug_handler() into exc_debug_kernel()Peter Zijlstra1-6/+4
Kprobes are on kernel text, and thus only matter for #DB-from-kernel. Kprobes are ordered before the generic notifier, preserve that order. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Sync BTF earlierPeter Zijlstra1-7/+7
Move the BTF sync near the DR6 load, as this will be the only common code guaranteed to run on every #DB. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-09-04x86/debug: Allow a single level of #DB recursionAndy Lutomirski1-34/+31
Trying to clear DR7 around a #DB from usermode malfunctions if the tasks schedules when delivering SIGTRAP. Rather than trying to define a special no-recursion region, just allow a single level of recursion. The same mechanism is used for NMI, and it hasn't caused any problems yet. Fixes: 9f58fdde95c9 ("x86/db: Split out dr6/7 handling") Reported-by: Kyle Huey <[email protected]> Debugged-by: Josh Poimboeuf <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Daniel Thompson <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/8b9bd05f187231df008d48cf818a6a311cbd5c98.1597882384.git.luto@kernel.org Link: https://lore.kernel.org/r/[email protected]
2020-08-07mm: remove unneeded includes of <asm/pgalloc.h>Mike Rapoport1-1/+0
Patch series "mm: cleanup usage of <asm/pgalloc.h>" Most architectures have very similar versions of pXd_alloc_one() and pXd_free_one() for intermediate levels of page table. These patches add generic versions of these functions in <asm-generic/pgalloc.h> and enable use of the generic functions where appropriate. In addition, functions declared and defined in <asm/pgalloc.h> headers are used mostly by core mm and early mm initialization in arch and there is no actual reason to have the <asm/pgalloc.h> included all over the place. The first patch in this series removes unneeded includes of <asm/pgalloc.h> In the end it didn't work out as neatly as I hoped and moving pXd_alloc_track() definitions to <asm-generic/pgalloc.h> would require unnecessary changes to arches that have custom page table allocations, so I've decided to move lib/ioremap.c to mm/ and make pgalloc-track.h local to mm/. This patch (of 8): In most cases <asm/pgalloc.h> header is required only for allocations of page table memory. Most of the .c files that include that header do not use symbols declared in <asm/pgalloc.h> and do not require that header. As for the other header files that used to include <asm/pgalloc.h>, it is possible to move that include into the .c file that actually uses symbols from <asm/pgalloc.h> and drop the include from the header file. The process was somewhat automated using sed -i -E '/[<"]asm\/pgalloc\.h/d' \ $(grep -L -w -f /tmp/xx \ $(git grep -E -l '[<"]asm/pgalloc\.h')) where /tmp/xx contains all the symbols defined in arch/*/include/asm/pgalloc.h. [[email protected]: fix powerpc warning] Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> [m68k] Cc: Abdul Haleem <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Max Filippov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Satheesh Rajendran <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-26Merge branch 'locking/nmi' into x86/entryIngo Molnar1-11/+6
Resolve conflicts with ongoing lockdep work that fixed the NMI entry code. Conflicts: arch/x86/entry/common.c arch/x86/include/asm/idtentry.h Signed-off-by: Ingo Molnar <[email protected]>
2020-07-24x86/entry: Cleanup idtentry_enter/exitThomas Gleixner1-3/+3
Remove the temporary defines and fixup all references. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-24x86/entry: Cleanup idtentry_entry/exit_userThomas Gleixner1-9/+9
Cleanup the temporary defines and use irqentry_ instead of idtentry_. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-10x86/entry: Fix NMI vs IRQ state trackingPeter Zijlstra1-11/+6
While the nmi_enter() users did trace_hardirqs_{off_prepare,on_finish}() there was no matching lockdep_hardirqs_*() calls to complete the picture. Introduce idtentry_{enter,exit}_nmi() to enable proper IRQ state tracking across the NMIs. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-10Merge branch 'x86/urgent' into x86/entry to pick up upstream fixes.Thomas Gleixner1-0/+2
2020-07-09x86/traps: Disable interrupts in exc_aligment_check()Thomas Gleixner1-0/+2
exc_alignment_check() fails to disable interrupts before returning to the entry code. Fixes: ca4c6a9858c2 ("x86/traps: Make interrupt enable/disable symmetric in C code") Reported-by: [email protected] Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-06x86/entry: Rename idtentry_enter/exit_cond_rcu() to idtentry_enter/exit()Andy Lutomirski1-3/+3
They were originally called _cond_rcu because they were special versions with conditional RCU handling. Now they're the standard entry and exit path, so the _cond_rcu part is just confusing. Drop it. Also change the signature to make them more extensible and more foolproof. No functional change -- it's pure refactoring. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/247fc67685263e0b673e1d7f808182d28ff80359.1593795633.git.luto@kernel.org
2020-07-05Merge tag 'x86-urgent-2020-07-05' of ↵Linus Torvalds1-1/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A series of fixes for x86: - Reset MXCSR in kernel_fpu_begin() to prevent using a stale user space value. - Prevent writing MSR_TEST_CTRL on CPUs which are not explicitly whitelisted for split lock detection. Some CPUs which do not support it crash even when the MSR is written to 0 which is the default value. - Fix the XEN PV fallout of the entry code rework - Fix the 32bit fallout of the entry code rework - Add more selftests to ensure that these entry problems don't come back. - Disable 16 bit segments on XEN PV. It's not supported because XEN PV does not implement ESPFIX64" * tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Disable 16-bit segments on Xen PV x86/entry/32: Fix #MC and #DB wiring on x86_32 x86/entry/xen: Route #DB correctly on Xen PV x86/entry, selftests: Further improve user entry sanity checks x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER selftests/x86: Consolidate and fix get/set_eflags() helpers selftests/x86/syscall_nt: Clear weird flags after each test selftests/x86/syscall_nt: Add more flag combinations x86/entry/64/compat: Fix Xen PV SYSENTER frame setup x86/entry: Move SYSENTER's regs->sp and regs->flags fixups into C x86/entry: Assert that syscalls are on the right stack x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted x86/fpu: Reset MXCSR to default in kernel_fpu_begin()
2020-07-04x86/entry/32: Fix #MC and #DB wiring on x86_32Andy Lutomirski1-1/+1
DEFINE_IDTENTRY_MCE and DEFINE_IDTENTRY_DEBUG were wired up as non-RAW on x86_32, but the code expected them to be RAW. Get rid of all the macro indirection for them on 32-bit and just use DECLARE_IDTENTRY_RAW and DEFINE_IDTENTRY_RAW directly. Also add a warning to make sure that we only hit the _kernel paths in kernel mode. Reported-by: Naresh Kamboju <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/9e90a7ee8e72fd757db6d92e1e5ff16339c1ecf9.1593795633.git.luto@kernel.org
2020-07-04x86/entry/xen: Route #DB correctly on Xen PVAndy Lutomirski1-0/+12
On Xen PV, #DB doesn't use IST. It still needs to be correctly routed depending on whether it came from user or kernel mode. Get rid of DECLARE/DEFINE_IDTENTRY_XEN -- it was too hard to follow the logic. Instead, route #DB and NMI through DECLARE/DEFINE_IDTENTRY_RAW on Xen, and do the right thing for #DB. Also add more warnings to the exc_debug* handlers to make this type of failure more obvious. This fixes various forms of corruption that happen when usermode triggers #DB on Xen PV. Fixes: 4c0dcd8350a0 ("x86/entry: Implement user mode C entry points for #DB and #MCE") Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/4163e733cce0b41658e252c6c6b3464f33fdff17.1593795633.git.luto@kernel.org
2020-06-26Merge branch 'linus' into x86/entry, to resolve conflictsIngo Molnar1-1/+2
Conflicts: arch/x86/kernel/traps.c Signed-off-by: Ingo Molnar <[email protected]>
2020-06-25x86/entry: Fix #UD vs WARN morePeter Zijlstra1-34/+38
vmlinux.o: warning: objtool: exc_invalid_op()+0x47: call to probe_kernel_read() leaves .noinstr.text section Since we use UD2 as a short-cut for 'CALL __WARN', treat it as such. Have the bare exception handler do the report_bug() thing. Fixes: 15a416e8aaa7 ("x86/entry: Treat BUG/WARN as NMI-like entries") Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-25x86/entry: Fixup bad_iret vs noinstrPeter Zijlstra1-3/+3
vmlinux.o: warning: objtool: fixup_bad_iret()+0x8e: call to memcpy() leaves .noinstr.text section Worse, when KASAN there is no telling what memcpy() actually is. Force the use of __memcpy() which is our assmebly implementation. Reported-by: Marco Elver <[email protected]> Suggested-by: Marco Elver <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Marco Elver <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-18maccess: rename probe_kernel_address to get_kernel_nofaultChristoph Hellwig1-1/+1
Better describe what this helper does, and match the naming of copy_from_kernel_nofault. Also switch the argument order around, so that it acts and looks like get_user(). Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-06-17maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofaultChristoph Hellwig1-1/+2
Better describe what these functions do. Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-06-12x86/entry: Treat BUG/WARN as NMI-like entriesAndy Lutomirski1-26/+38
BUG/WARN are cleverly optimized using UD2 to handle the BUG/WARN out of line in an exception fixup. But if BUG or WARN is issued in a funny RCU context, then the idtentry_enter...() path might helpfully WARN that the RCU context is invalid, which results in infinite recursion. Split the BUG/WARN handling into an nmi_enter()/nmi_exit() path in exc_invalid_op() to increase the chance to survive the experience. [ tglx: Make the declaration match the implementation ] Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/f8fe40e0088749734b4435b554f73eee53dcf7a8.1591932307.git.luto@kernel.org
2020-06-11x86/entry: Re-order #DB handler to avoid *SAN instrumentationPeter Zijlstra1-28/+27
vmlinux.o: warning: objtool: exc_debug()+0xbb: call to clear_ti_thread_flag.constprop.0() leaves .noinstr.text section vmlinux.o: warning: objtool: noist_exc_debug()+0x55: call to clear_ti_thread_flag.constprop.0() leaves .noinstr.text section Rework things so that handle_debug() looses the noinstr and move the clear_thread_flag() into that. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/idt: Cleanup trap_init()Thomas Gleixner1-9/+0
No point in having all the IDT cruft in trap_init(). Move it into the IDT code and fixup the comments. Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/entry: Rename trace_hardirqs_off_prepare()Peter Zijlstra1-2/+2
The typical pattern for trace_hardirqs_off_prepare() is: ENTRY lockdep_hardirqs_off(); // because hardware ... do entry magic instrumentation_begin(); trace_hardirqs_off_prepare(); ... do actual work trace_hardirqs_on_prepare(); lockdep_hardirqs_on_prepare(); instrumentation_end(); ... do exit magic lockdep_hardirqs_on(); which shows that it's named wrong, rename it to trace_hardirqs_off_finish(), as it concludes the hardirq_off transition. Also, given that the above is the only correct order, make the traditional all-in-one trace_hardirqs_off() follow suit. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/entry: Remove debug IDT frobbingPeter Zijlstra1-9/+0
This is all unused now. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/entry: Introduce local_db_{save,restore}()Peter Zijlstra1-16/+2
In order to allow other exceptions than #DB to disable breakpoints, provide common helpers. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/entry: Move paranoid irq tracing out of ASM codeThomas Gleixner1-0/+11
The last step to remove the irq tracing cruft from ASM. Ignore #DF as the maschine is going to die anyway. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-11x86/idtentry: Switch to conditional RCU handlingThomas Gleixner1-5/+5
Switch all idtentry_enter/exit() users over to the new conditional RCU handling scheme and make the user mode entries in #DB, #INT3 and #MCE use the user mode idtentry functions. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-11x86/entry: Fix allnoconfig build warningIngo Molnar1-1/+1
The following commit: 095b7a3e7745 ("x86/entry: Convert double fault exception to IDTENTRY_DF") introduced a new build warning on 64-bit allnoconfig kernels, that have CONFIG_VMAP_STACK disabled: arch/x86/kernel/traps.c:332:16: warning: unused variable ‘address’ [-Wunused-variable] This variable is only used if CONFIG_VMAP_STACK is defined, so make it dependent on that, not CONFIG_X86_64. Signed-off-by: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Alexandre Chartre <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]>
2020-06-11x86/entry: Convert double fault exception to IDTENTRY_DFThomas Gleixner1-3/+14
Convert #DF to IDTENTRY_DF - Implement the C entry point with DEFINE_IDTENTRY_DF - Emit the ASM stub with DECLARE_IDTENTRY_DF on 64bit - Remove the ASM idtentry in 64bit - Adjust the 32bit shim code - Fixup the XEN/PV code - Remove the old prototypes No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Alexandre Chartre <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/traps: Address objtool noinstr complaints in #DBThomas Gleixner1-2/+8
The functions invoked from handle_debug() can be instrumented. Tell objtool about it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Alexandre Chartre <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-11x86/traps: Restructure #DB handlingThomas Gleixner1-34/+35
Now that there are separate entry points, move the kernel/user_mode specifc checks into the entry functions so the common handling code does not need the extra mode checks. Make the code more readable while at it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Alexandre Chartre <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]