aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2018-09-03x86/paravirt: Move the Xen-only pv_mmu_ops under the PARAVIRT_XXL umbrellaJuergen Gross12-112/+103
Most of the paravirt ops defined in pv_mmu_ops are for Xen PV guests only. Define them only if CONFIG_PARAVIRT_XXL is set. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Move the pv_irq_ops under the PARAVIRT_XXL umbrellaJuergen Gross9-18/+15
All of the paravirt ops defined in pv_irq_ops are for Xen PV guests or VSMP only. Define them only if CONFIG_PARAVIRT_XXL is set. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Move the Xen-only pv_cpu_ops under the PARAVIRT_XXL umbrellaJuergen Gross16-22/+78
Most of the paravirt ops defined in pv_cpu_ops are for Xen PV guests only. Define them only if CONFIG_PARAVIRT_XXL is set. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Move items in pv_info under PARAVIRT_XXL umbrellaJuergen Gross6-3/+9
All items but name in pv_info are needed by Xen PV only. Define them with CONFIG_PARAVIRT_XXL set only. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Introduce new config option PARAVIRT_XXLJuergen Gross4-0/+7
A large amount of paravirt ops is used by Xen PV guests only. Add a new config option PARAVIRT_XXL which is selected by XEN_PV. Later we can put the Xen PV only paravirt ops under the PARAVIRT_XXL umbrella. Since irq related paravirt ops are used only by VSMP and Xen PV, let VSMP select PARAVIRT_XXL, too, in order to enable moving the irq ops under PARAVIRT_XXL. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Remove unused paravirt bitsJuergen Gross3-13/+1
The macros ENABLE_INTERRUPTS_SYSEXIT, GET_CR0_INTO_EAX and PARAVIRT_ADJUST_EXCEPTION_FRAME are used nowhere. Remove their definitions. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Use a single ops structureJuergen Gross22-449/+412
Instead of using six globally visible paravirt ops structures combine them in a single structure, keeping the original structures as sub-structures. This avoids the need to assemble struct paravirt_patch_template at runtime on the stack each time apply_paravirt() is being called (i.e. when loading a module). [ tglx: Made the struct and the initializer tabular for readability sake ] Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Remove clobbers from struct paravirt_patch_siteJuergen Gross2-19/+15
There is no need any longer to store the clobbers in struct paravirt_patch_site. Remove clobbers from the struct and from the related macros. While at it fix some lines longer than 80 characters. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Remove clobbers parameter from paravirt patch functionsJuergen Gross6-23/+16
The clobbers parameter from paravirt_patch_default() et al isn't used any longer. Remove it. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/paravirt: Make paravirt_patch_call() and paravirt_patch_jmp() staticJuergen Gross2-12/+6
paravirt_patch_call() and paravirt_patch_jmp() are used in paravirt.c only. Convert them to static. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/xen: Add SPDX identifier in arch/x86/xen filesJuergen Gross11-64/+20
Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/xen: Link platform-pci-unplug.o only if CONFIG_XEN_PVHVMJuergen Gross2-3/+1
Instead of using one large #ifdef CONFIG_XEN_PVHVM in arch/x86/xen/platform-pci-unplug.c add the object file depending on CONFIG_XEN_PVHVM being set. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/xen: Move pv specific parts of arch/x86/xen/mmu.c to mmu_pv.cJuergen Gross3-187/+139
There are some PV specific functions in arch/x86/xen/mmu.c which can be moved to mmu_pv.c. This in turn enables to build multicalls.c dependent on CONFIG_XEN_PV. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/xen: Move pv irq related functions under CONFIG_XEN_PV umbrellaJuergen Gross3-16/+41
All functions in arch/x86/xen/irq.c and arch/x86/xen/xen-asm*.S are specific to PV guests. Include them in the kernel with CONFIG_XEN_PV only. Make the PV specific code in arch/x86/entry/entry_*.S dependent on CONFIG_XEN_PV instead of CONFIG_XEN. The HVM specific code should depend on CONFIG_XEN_PVHVM. While at it reformat the Makefile to make it more readable. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/fault: BUG() when uaccess helpers fault on kernel addressesJann Horn1-0/+58
There have been multiple kernel vulnerabilities that permitted userspace to pass completely unchecked pointers through to userspace accessors: - the waitid() bug - commit 96ca579a1ecc ("waitid(): Add missing access_ok() checks") - the sg/bsg read/write APIs - the infiniband read/write APIs These don't happen all that often, but when they do happen, it is hard to test for them properly; and it is probably also hard to discover them with fuzzing. Even when an unmapped kernel address is supplied to such buggy code, it just returns -EFAULT instead of doing a proper BUG() or at least WARN(). Try to make such misbehaving code a bit more visible by refusing to do a fixup in the pagefault handler code when a userspace accessor causes a #PF on a kernel address and the current context isn't whitelisted. Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/fault: Plumb error code and fault address through to fault handlersJann Horn6-21/+44
This is preparation for looking at trap number and fault address in the handlers for uaccess errors. No functional change. Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/extable: Introduce _ASM_EXTABLE_UA for uaccess fixupsJann Horn12-142/+160
Currently, most fixups for attempting to access userspace memory are handled using _ASM_EXTABLE, which is also used for various other types of fixups (e.g. safe MSR access, IRET failures, and a bunch of other things). In order to make it possible to add special safety checks to uaccess fixups (in particular, checking whether the fault address is actually in userspace), introduce a new exception table handler ex_handler_uaccess() and wire it up to all the user access fixups (excluding ones that already use _ASM_EXTABLE_EX). Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kees Cook <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Masami Hiramatsu <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/kprobes: Stop calling fixup_exception() from kprobe_fault_handler()Jann Horn1-9/+0
This removes the call into exception fixup that was added in commit c28f896634f2 ("[PATCH] kprobes: fix broken fault handling for x86_64"). On X86, kprobe_fault_handler() is called from two places: do_general_protection() (for #GP) and kprobes_fault() (for #PF). In both paths, the fixup_exception() call in the kprobe fault handler is redundant. In case of #GP, fixup_exception() is called immediately before kprobe_fault_handler() is invoked, so no need to try that again. This assumes that the kprobe's fault handler isn't going to do something crazy like changing RIP so that it suddenly points to an instruction that does userspace access. For #PF on a kernel address from kernel space, after the kprobe fault handler has run, no_context() is invoked, which calls fixup_exception(). Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kees Cook <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/kprobes: Inline kprobe_exceptions_notify() into do_general_protection()Jann Horn2-30/+11
The opaque plumbing of #GP from do_general_protection() through notify_die() into kprobe_exceptions_notify() makes it hard to understand what's going on. Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kees Cook <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86/kprobes: Refactor kprobes_fault() like kprobe_exceptions_notify()Jann Horn1-11/+13
This is an extension of commit b506a9d08bae ("x86: code clarification patch to Kprobes arch code"). As that commit explains, even though kprobe_running() can't be called with preemption enabled, preemption does not need to be disabled. If preemption is enabled, then this can't be originate from a kprobe. Also, use X86_TRAP_PF instead of 14. Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kees Cook <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexander Viro <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-03x86: Fix kernel-doc atomic.h warningsRandy Dunlap3-16/+16
Fix kernel-doc warnings in arch/x86/include/asm/atomic.h that are caused by having a #define macro between the kernel-doc notation and the function name. Fixed by moving the #define macro to after the function implementation. Make the same change for atomic64_{32,64}.h for consistency even though there were no kernel-doc warnings found in these header files, but there would be if they were used in generation of documentation. Fixes these kernel-doc warnings: ../arch/x86/include/asm/atomic.h:84: warning: Excess function parameter 'i' description in 'arch_atomic_sub_and_test' ../arch/x86/include/asm/atomic.h:84: warning: Excess function parameter 'v' description in 'arch_atomic_sub_and_test' ../arch/x86/include/asm/atomic.h:96: warning: Excess function parameter 'v' description in 'arch_atomic_inc' ../arch/x86/include/asm/atomic.h:109: warning: Excess function parameter 'v' description in 'arch_atomic_dec' ../arch/x86/include/asm/atomic.h:124: warning: Excess function parameter 'v' description in 'arch_atomic_dec_and_test' ../arch/x86/include/asm/atomic.h:138: warning: Excess function parameter 'v' description in 'arch_atomic_inc_and_test' ../arch/x86/include/asm/atomic.h:153: warning: Excess function parameter 'i' description in 'arch_atomic_add_negative' ../arch/x86/include/asm/atomic.h:153: warning: Excess function parameter 'v' description in 'arch_atomic_add_negative' Fixes: 18cc1814d4e7 ("atomics/treewide: Make test ops optional") Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Mark Rutland <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-02Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds20-46/+167
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Speculation: - Make the microcode check more robust - Make the L1TF memory limit depend on the internal cache physical address space and not on the CPUID advertised physical address space, which might be significantly smaller. This avoids disabling L1TF on machines which utilize the full physical address space. - Fix the GDT mapping for EFI calls on 32bit PTI - Fix the MCE nospec implementation to prevent #GP Fixes and robustness: - Use the proper operand order for LSL in the VDSO - Prevent NMI uaccess race against CR3 switching - Add a lockdep check to verify that text_mutex is held in text_poke() functions - Repair the fallout of giving native_restore_fl() a prototype - Prevent kernel memory dumps based on usermode RIP - Wipe KASAN shadow stack before rewinding the stack to prevent false positives - Move the AMS GOTO enforcement to the actual build stage to allow user API header extraction without a compiler - Fix a section mismatch introduced by the on demand VDSO mapping change Miscellaneous: - Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pti: Fix section mismatch warning/error x86/vdso: Fix lsl operand order x86/mce: Fix set_mce_nospec() to avoid #GP fault x86/efi: Load fixmap GDT in efi_call_phys_epilog() x86/nmi: Fix NMI uaccess race against CR3 switching x86: Allow generating user-space headers without a compiler x86/dumpstack: Don't dump kernel memory based on usermode RIP x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember() x86/alternatives: Lockdep-enforce text_mutex in text_poke*() x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit() x86/irqflags: Mark native_restore_fl extern inline x86/build: Remove jump label quirk for GCC older than 4.5.2 x86/Kconfig: Fix trivial typo x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+ x86/spectre: Add missing family 6 check to microcode check
2018-09-02x86/microcode: Update the new microcode revision unconditionallyFilippo Sironi2-14/+21
Handle the case where microcode gets loaded on the BSP's hyperthread sibling first and the boot_cpu_data's microcode revision doesn't get updated because of early exit due to the siblings sharing a microcode engine. For that, simply write the updated revision on all CPUs unconditionally. Signed-off-by: Filippo Sironi <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected]
2018-09-02x86/microcode: Make sure boot_cpu_data.microcode is up-to-datePrarit Bhargava2-0/+8
When preparing an MCE record for logging, boot_cpu_data.microcode is used to read out the microcode revision on the box. However, on systems where late microcode update has happened, the microcode revision output in a MCE log record is wrong because boot_cpu_data.microcode is not updated when the microcode gets updated. But, the microcode revision saved in boot_cpu_data's microcode member should be kept up-to-date, regardless, for consistency. Make it so. Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records") Signed-off-by: Prarit Bhargava <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected]
2018-09-02x86/microcode: Make revision and processor flags world-readableJacek Tomaka1-2/+2
The microcode revision is already readable for non-root users via /proc/cpuinfo. Thus, there's no reason to keep the same information readable by root only in /sys/devices/system/cpu/cpuX/microcode/. Make .../processor_flags world-readable too, while at it. Reported-by: Tim Burgess <[email protected]> Signed-off-by: Jacek Tomaka <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2018-09-02x86/pti: Fix section mismatch warning/errorRandy Dunlap1-1/+1
Fix the section mismatch warning in arch/x86/mm/pti.c: WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte() The function pti_clone_pgtable() references the function __init pti_user_pagetable_walk_pte(). This is often because pti_clone_pgtable lacks a __init annotation or the annotation of pti_user_pagetable_walk_pte is wrong. FATAL: modpost: Section mismatches detected. Fixes: 85900ea51577 ("x86/pti: Map the vsyscall page if needed") Reported-by: kbuild test robot <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-01x86/vdso: Fix lsl operand orderSamuel Neves1-1/+1
In the __getcpu function, lsl is using the wrong target and destination registers. Luckily, the compiler tends to choose %eax for both variables, so it has been working so far. Fixes: a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Samuel Neves <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-09-01x86/mce: Fix set_mce_nospec() to avoid #GP faultLuckTony1-1/+24
The trick with flipping bit 63 to avoid loading the address of the 1:1 mapping of the poisoned page while the 1:1 map is updated used to work when unmapping the page. But it falls down horribly when attempting to directly set the page as uncacheable. The problem is that when the cache mode is changed to uncachable, the pages needs to be flushed from the cache first. But the decoy address is non-canonical due to bit 63 flipped, and the CLFLUSH instruction throws a #GP fault. Add code to change_page_attr_set_clr() to fix the address before calling flush. Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()") Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Tony Luck <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Linus Torvalds <[email protected]> Cc: Peter Anvin <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: linux-edac <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Jiang <[email protected]> Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk
2018-08-31x86/efi: Load fixmap GDT in efi_call_phys_epilog()Joerg Roedel1-6/+2
When PTI is enabled on x86-32 the kernel uses the GDT mapped in the fixmap for the simple reason that this address is also mapped for user-space. The efi_call_phys_prolog()/efi_call_phys_epilog() wrappers change the GDT to call EFI runtime services and switch back to the kernel GDT when they return. But the switch-back uses the writable GDT, not the fixmap GDT. When that happened and and the CPU returns to user-space it switches to the user %cr3 and tries to restore user segment registers. This fails because the writable GDT is not mapped in the user page-table, and without a GDT the fault handlers also can't be launched. The result is a triple fault and reboot of the machine. Fix that by restoring the GDT back to the fixmap GDT which is also mapped in the user page-table. Fixes: 7757d607c6b3 x86/pti: ('Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32') Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Guenter Roeck <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Pavel Machek <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-31Merge tag 'for-linus-4.19b-rc2-tag' of ↵Linus Torvalds2-10/+6
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - minor cleanup avoiding a warning when building with new gcc - a patch to add a new sysfs node for Xen frontend/backend drivers to make it easier to obtain the state of a pv device - two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable PTEs * tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: remove redundant variable save_pud xen: export device state to sysfs x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear x86/xen: don't write ptes directly in 32-bit PV guests
2018-08-31x86/nmi: Fix NMI uaccess race against CR3 switchingAndy Lutomirski4-1/+53
A NMI can hit in the middle of context switching or in the middle of switch_mm_irqs_off(). In either case, CR3 might not match current->mm, which could cause copy_from_user_nmi() and friends to read the wrong memory. Fix it by adding a new nmi_uaccess_okay() helper and checking it in copy_from_user_nmi() and in __copy_from_user_nmi()'s callers. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Jann Horn <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org
2018-08-31x86: Allow generating user-space headers without a compilerBen Hutchings1-4/+7
When bootstrapping an architecture, it's usual to generate the kernel's user-space headers (make headers_install) before building a compiler. Move the compiler check (for asm goto support) to the archprepare target so that it is only done when building code for the target. Fixes: e501ce957a78 ("x86: Force asm-goto") Reported-by: Helmut Grohne <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-31x86/dumpstack: Don't dump kernel memory based on usermode RIPJann Horn3-5/+15
show_opcodes() is used both for dumping kernel instructions and for dumping user instructions. If userspace causes #PF by jumping to a kernel address, show_opcodes() can be reached with regs->ip controlled by the user, pointing to kernel code. Make sure that userspace can't trick us into dumping kernel memory into dmesg. Fixes: 7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function") Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30KVM: x86: Unexport x86_emulate_instruction()Sean Christopherson3-15/+18
Allowing x86_emulate_instruction() to be called directly has led to subtle bugs being introduced, e.g. not setting EMULTYPE_NO_REEXECUTE in the emulation type. While most of the blame lies on re-execute being opt-out, exporting x86_emulate_instruction() also exposes its cr2 parameter, which may have contributed to commit d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") using x86_emulate_instruction() instead of emulate_instruction() because "hey, I have a cr2!", which in turn introduced its EMULTYPE_NO_REEXECUTE bug. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Rename emulate_instruction() to kvm_emulate_instruction()Sean Christopherson4-17/+17
Lack of the kvm_ prefix gives the impression that it's a VMX or SVM specific function, and there's no conflict that prevents adding the kvm_ prefix. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Do not re-{try,execute} after failed emulation in L2Sean Christopherson2-2/+11
Commit a6f177efaa58 ("KVM: Reenter guest after emulation failure if due to access to non-mmio address") added reexecute_instruction() to handle the scenario where two (or more) vCPUS race to write a shadowed page, i.e. reexecute_instruction() is intended to return true if and only if the instruction being emulated was accessing a shadowed page. As L0 is only explicitly shadowing L1 tables, an emulation failure of a nested VM instruction cannot be due to a race to write a shadowed page and so should never be re-executed. This fixes an issue where an "MMIO" emulation failure[1] in L2 is all but guaranteed to result in an infinite loop when TDP is enabled. Because "cr2" is actually an L2 GPA when TDP is enabled, calling kvm_mmu_gva_to_gpa_write() to translate cr2 in the non-direct mapped case (L2 is never direct mapped) will almost always yield UNMAPPED_GVA and cause reexecute_instruction() to immediately return true. The !mmio_info_in_cache() check in kvm_mmu_page_fault() doesn't catch this case because mmio_info_in_cache() returns false for a nested MMU (the MMIO caching currently handles L1 only, e.g. to cache nested guests' GPAs we'd have to manually flush the cache when switching between VMs and when L1 updated its page tables controlling the nested guest). Way back when, commit 68be0803456b ("KVM: x86: never re-execute instruction with enabled tdp") changed reexecute_instruction() to always return false when using TDP under the assumption that KVM would only get into the emulator for MMIO. Commit 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp") effectively reverted that behavior in order to handle the scenario where emulation failed due to an access from L1 to the shadow page tables for L2, but it didn't account for the case where emulation failed in L2 with TDP enabled. All of the above logic also applies to retry_instruction(), added by commit 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions"). An indefinite loop in retry_instruction() should be impossible as it protects against retrying the same instruction over and over, but it's still correct to not retry an L2 instruction in the first place. Fix the immediate issue by adding a check for a nested guest when determining whether or not to allow retry in kvm_mmu_page_fault(). In addition to fixing the immediate bug, add WARN_ON_ONCE in the retry functions since they are not designed to handle nested cases, i.e. they need to be modified even if there is some scenario in the future where we want to allow retrying a nested guest. [1] This issue was encountered after commit 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2") changed the page fault path to return KVM_PFN_NOSLOT when translating an L2 access to a prive memslot. Returning KVM_PFN_NOSLOT is semantically correct when we want to hide a memslot from L2, i.e. there effectively is no defined memory region for L2, but it has the unfortunate side effect of making KVM think the GFN is a MMIO page, thus triggering emulation. The failure occurred with in-development code that deliberately exposed a private memslot to L2, which L2 accessed with an instruction that is not emulated by KVM. Fixes: 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp") Fixes: 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions") Signed-off-by: Sean Christopherson <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Krish Sadhukhan <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Default to not allowing emulation retry in kvm_mmu_page_faultSean Christopherson1-6/+12
Effectively force kvm_mmu_page_fault() to opt-in to allowing retry to make it more obvious when and why it allows emulation to be retried. Previously this approach was less convenient due to retry and re-execute behavior being controlled by separate flags that were also inverted in their implementations (opt-in versus opt-out). Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Merge EMULTYPE_RETRY and EMULTYPE_ALLOW_REEXECUTESean Christopherson3-7/+6
retry_instruction() and reexecute_instruction() are a package deal, i.e. there is no scenario where one is allowed and the other is not. Merge their controlling emulation type flags to enforce this in code. Name the combined flag EMULTYPE_ALLOW_RETRY to make it abundantly clear that we are allowing re{try,execute} to occur, as opposed to explicitly requesting retry of a previously failed instruction. Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Invert emulation re-execute behavior to make it opt-inSean Christopherson3-7/+5
Re-execution of an instruction after emulation decode failure is intended to be used only when emulating shadow page accesses. Invert the flag to make allowing re-execution opt-in since that behavior is by far in the minority. Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: SVM: Set EMULTYPE_NO_REEXECUTE for RSM emulationSean Christopherson2-2/+9
Re-execution after an emulation decode failure is only intended to handle a case where two or vCPUs race to write a shadowed page, i.e. we should never re-execute an instruction as part of RSM emulation. Add a new helper, kvm_emulate_instruction_from_buffer(), to support emulating from a pre-defined buffer. This eliminates the last direct call to x86_emulate_instruction() outside of kvm_mmu_page_fault(), which means x86_emulate_instruction() can be unexported in a future patch. Fixes: 7607b7174405 ("KVM: SVM: install RSM intercept") Cc: Brijesh Singh <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instrSean Christopherson1-2/+2
Re-execution after an emulation decode failure is only intended to handle a case where two or vCPUs race to write a shadowed page, i.e. we should never re-execute an instruction as part of MMIO emulation. As handle_ept_misconfig() is only used for MMIO emulation, it should pass EMULTYPE_NO_REEXECUTE when using the emulator to skip an instr in the fast-MMIO case where VM_EXIT_INSTRUCTION_LEN is invalid. And because the cr2 value passed to x86_emulate_instruction() is only destined for use when retrying or reexecuting, we can simply call emulate_instruction(). Fixes: d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") Cc: Vitaly Kuznetsov <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: SVM: remove unused variable dst_vaddr_endColin Ian King1-2/+1
Variable dst_vaddr_end is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: variable 'dst_vaddr_end' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: nVMX: avoid redundant double assignment of nested_run_pendingVitaly Kuznetsov1-3/+0
nested_run_pending is set 20 lines above and check_vmentry_prereqs()/ check_vmentry_postreqs() don't seem to be resetting it (the later, however, checks it). Signed-off-by: Vitaly Kuznetsov <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Jim Mattson <[email protected]> Reviewed-by: Eduardo Valentin <[email protected]> Reviewed-by: Krish Sadhukhan <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()Uros Bizjak1-3/+4
Replace open-coded set instructions with CC_SET()/CC_OUT(). Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/alternatives: Lockdep-enforce text_mutex in text_poke*()Jiri Kosina1-4/+5
text_poke() and text_poke_bp() must be called with text_mutex held. Put proper lockdep anotation in place instead of just mentioning the requirement in a comment. Reported-by: Peter Zijlstra <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()Jann Horn1-0/+4
Reset the KASAN shadow state of the task stack before rewinding RSP. Without this, a kernel oops will leave parts of the stack poisoned, and code running under do_exit() can trip over such poisoned regions and cause nonsensical false-positive KASAN reports about stack-out-of-bounds bugs. This does not wipe the exception stacks; if an oops happens on an exception stack, it might result in random KASAN false-positives from other tasks afterwards. This is probably relatively uninteresting, since if the kernel oopses on an exception stack, there are most likely bigger things to worry about. It'd be more interesting if vmapped stacks and KASAN were compatible, since then handle_stack_overflow() would oops from exception stack context. Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()") Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Kees Cook <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/irqflags: Mark native_restore_fl extern inlineNick Desaulniers1-1/+2
This should have been marked extern inline in order to pick up the out of line definition in arch/x86/kernel/irqflags.S. Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl") Reported-by: Ben Hutchings <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/build: Remove jump label quirk for GCC older than 4.5.2Masahiro Yamada1-12/+0
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6") bumped the minimum GCC version to 4.6 for all architectures. Remove the workaround code. It was the only user of cc-if-fullversion. Remove the macro as well. Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Michal Marek <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-29Merge branch 'linus' of ↵Linus Torvalds1-33/+33
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Check for the right CPU feature bit in sm4-ce on arm64. - Fix scatterwalk WARN_ON in aes-gcm-ce on arm64. - Fix unaligned fault in aesni on x86. - Fix potential NULL pointer dereference on exit in chtls. - Fix DMA mapping direction for RSA in caam. - Fix error path return value for xts setkey in caam. - Fix address endianness when DMA unmapping in caam. - Fix sleep-in-atomic in vmx. - Fix command corruption when queue is full in cavium/nitrox. * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: cavium/nitrox - fix for command corruption in queue full case with backlog submissions. crypto: vmx - Fix sleep-in-atomic bugs crypto: arm64/aes-gcm-ce - fix scatterwalk API violation crypto: aesni - Use unaligned loads from gcm_context_data crypto: chtls - fix null dereference chtls_free_uld() crypto: arm64/sm4-ce - check for the right CPU feature bit crypto: caam - fix DMA mapping direction for RSA forms 2 & 3 crypto: caam/qi - fix error path in xts setkey crypto: caam/jr - fix descriptor DMA unmapping
2018-08-29y2038: utimes: Rework #ifdef guards for compat syscallsArnd Bergmann1-0/+1
After changing over to 64-bit time_t syscalls, many architectures will want compat_sys_utimensat() but not respective handlers for utime(), utimes() and futimesat(). This adds a new __ARCH_WANT_SYS_UTIME32 to complement __ARCH_WANT_SYS_UTIME. For now, all 64-bit architectures that support CONFIG_COMPAT set it, but future 64-bit architectures will not (tile would not have needed it either, but got removed). As older 32-bit architectures get converted to using CONFIG_64BIT_TIME, they will have to use __ARCH_WANT_SYS_UTIME32 instead of __ARCH_WANT_SYS_UTIME. Architectures using the generic syscall ABI don't need either of them as they never had a utime syscall. Since the compat_utimbuf structure is now required outside of CONFIG_COMPAT, I'm moving it into compat_time.h. Signed-off-by: Arnd Bergmann <[email protected]> --- changed from last version: - renamed __ARCH_WANT_COMPAT_SYS_UTIME to __ARCH_WANT_SYS_UTIME32