aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/common.c
AgeCommit message (Collapse)AuthorFilesLines
2023-02-21Merge tag 'x86_cpu_for_v6.3_rc1' of ↵Linus Torvalds1-11/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Cache the AMD debug registers in per-CPU variables to avoid MSR writes where possible, when supporting a debug registers swap feature for SEV-ES guests - Add support for AMD's version of eIBRS called Automatic IBRS which is a set-and-forget control of indirect branch restriction speculation resources on privilege change - Add support for a new x86 instruction - LKGS - Load kernel GS which is part of the FRED infrastructure - Reset SPEC_CTRL upon init to accomodate use cases like kexec which rediscover - Other smaller fixes and cleanups * tag 'x86_cpu_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/amd: Cache debug register values in percpu variables KVM: x86: Propagate the AMD Automatic IBRS feature to the guest x86/cpu: Support AMD Automatic IBRS x86/cpu, kvm: Add the SMM_CTL MSR not present feature x86/cpu, kvm: Add the Null Selector Clears Base feature x86/cpu, kvm: Move X86_FEATURE_LFENCE_RDTSC to its native leaf x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature KVM: x86: Move open-coded CPUID leaf 0x80000021 EAX bit propagation code x86/cpu, kvm: Add support for CPUID_80000021_EAX x86/gsseg: Add the new <asm/gsseg.h> header to <asm/asm-prototypes.h> x86/gsseg: Use the LKGS instruction if available for load_gs_index() x86/gsseg: Move load_gs_index() to its own new header file x86/gsseg: Make asm_load_gs_index() take an u16 x86/opcode: Add the LKGS instruction to x86-opcode-map x86/cpufeature: Add the CPU feature bit for LKGS x86/bugs: Reset speculation control settings on init x86/cpu: Remove redundant extern x86_read_arch_cap_msr()
2023-02-21Merge tag 'x86-cleanups-2023-02-20' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull miscellaneous x86 cleanups from Thomas Gleixner: - Correct the common copy and pasted mishandling of kstrtobool() in the strict_sas_size() setup function - Make recalibrate_cpu_khz() an GPL only export - Check TSC feature before doing anything else which avoids pointless code execution if TSC is not available - Remove or fixup stale and misleading comments - Remove unused or pointelessly duplicated variables - Spelling and typo fixes * tag 'x86-cleanups-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hotplug: Remove incorrect comment about mwait_play_dead() x86/tsc: Do feature check as the very first thing x86/tsc: Make recalibrate_cpu_khz() export GPL only x86/cacheinfo: Remove unused trace variable x86/Kconfig: Fix spellos & punctuation x86/signal: Fix the value returned by strict_sas_size() x86/cpu: Remove misleading comment x86/setup: Move duplicate boot_cpu_data definition out of the ifdeffery x86/boot/e820: Fix typo in e820.c comment
2023-02-21Merge tag 'x86_vdso_for_v6.3_rc1' of ↵Linus Torvalds1-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vdso updates from Borislav Petkov: - Add getcpu support for the 32-bit version of the vDSO - Some smaller fixes * tag 'x86_vdso_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Fix -Wmissing-prototypes warnings x86/vdso: Fake 32bit VDSO build on 64bit compile for vgetcpu selftests: Emit a warning if getcpu() is missing on 32bit x86/vdso: Provide getcpu for x86-32. x86/cpu: Provide the full setup for getcpu() on x86-32 x86/vdso: Move VDSO image init to vdso2c generated code
2023-02-21Merge tag 'x86_microcode_for_v6.3_rc1' of ↵Linus Torvalds1-15/+30
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loader updates from Borislav Petkov: - Fix mixed steppings support on AMD which got broken somewhere along the way - Improve revision reporting - Properly check CPUID capabilities after late microcode upgrade to avoid false positives - A garden variety of other small fixes * tag 'x86_microcode_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode/core: Return an error only when necessary x86/microcode/AMD: Fix mixed steppings support x86/microcode/AMD: Add a @cpu parameter to the reloading functions x86/microcode/amd: Remove load_microcode_amd()'s bsp parameter x86/microcode: Allow only "1" as a late reload trigger value x86/microcode/intel: Print old and new revision during early boot x86/microcode/intel: Pass the microcode revision to print_ucode_info() directly x86/microcode: Adjust late loading result reporting message x86/microcode: Check CPU capabilities after late microcode update correctly x86/microcode: Add a parameter to microcode_check() to store CPU capabilities x86/microcode: Use the DEVICE_ATTR_RO() macro x86/microcode/AMD: Handle multiple glued containers properly x86/microcode/AMD: Rename a couple of functions
2023-02-10x86/speculation: Identify processors vulnerable to SMT RSB predictionsTom Lendacky1-2/+7
Certain AMD processors are vulnerable to a cross-thread return address predictions bug. When running in SMT mode and one of the sibling threads transitions out of C0 state, the other sibling thread could use return target predictions from the sibling thread that transitioned out of C0. The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB when context switching to the idle thread. However, KVM allows a VMM to prevent exiting guest mode when transitioning out of C0. A guest could act maliciously in this situation, so create a new x86 BUG that can be used to detect if the processor is vulnerable. Reviewed-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Tom Lendacky <[email protected]> Message-Id: <91cec885656ca1fcd4f0185ce403a53dd9edecb7.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <[email protected]>
2023-02-09efi: x86: Wire up IBT annotation in memory attributes tableArd Biesheuvel1-2/+3
UEFI v2.10 extends the EFI memory attributes table with a flag that indicates whether or not all RuntimeServicesCode regions were constructed with ENDBR landing pads, permitting the OS to map these regions with IBT restrictions enabled. So let's take this into account on x86 as well. Suggested-by: Peter Zijlstra <[email protected]> # ibt_save() changes Signed-off-by: Ard Biesheuvel <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2023-02-06x86/cpu: Provide the full setup for getcpu() on x86-32Sebastian Andrzej Siewior1-3/+1
setup_getcpu() configures two things: - it writes the current CPU & node information into MSR_TSC_AUX - it writes the same information as a GDT entry. By using the "full" setup_getcpu() on i386 it is possible to read the CPU information in userland via RDTSCP() or via LSL from the GDT. Provide an GDT_ENTRY_CPUNODE for x86-32 and make the setup function unconditionally available. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Roland Mainz <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-25x86/cpu: Support AMD Automatic IBRSKim Phillips1-8/+11
The AMD Zen4 core supports a new feature called Automatic IBRS. It is a "set-and-forget" feature that means that, like Intel's Enhanced IBRS, h/w manages its IBRS mitigation resources automatically across CPL transitions. The feature is advertised by CPUID_Fn80000021_EAX bit 8 and is enabled by setting MSR C000_0080 (EFER) bit 21. Enable Automatic IBRS by default if the CPU feature is present. It typically provides greater performance over the incumbent generic retpolines mitigation. Reuse the SPECTRE_V2_EIBRS spectre_v2_mitigation enum. AMD Automatic IBRS and Intel Enhanced IBRS have similar enablement. Add NO_EIBRS_PBRSB to cpu_vuln_whitelist, since AMD Automatic IBRS isn't affected by PBRSB-eIBRS. The kernel command line option spectre_v2=eibrs is used to select AMD Automatic IBRS, if available. Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Sean Christopherson <[email protected]> Acked-by: Dave Hansen <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-25x86/cpu, kvm: Add the Null Selector Clears Base featureKim Phillips1-3/+1
The Null Selector Clears Base feature was being open-coded for KVM. Add it to its newly added native CPUID leaf 0x80000021 EAX proper. Also drop the bit description comments now it's more self-describing. [ bp: Convert test in check_null_seg_clears_base() too. ] Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Sean Christopherson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-25x86/vdso: Move VDSO image init to vdso2c generated codeBrian Gerst1-1/+0
Generate an init function for each VDSO image, replacing init_vdso() and sysenter_setup(). Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-25x86/cpu, kvm: Add support for CPUID_80000021_EAXKim Phillips1-0/+3
Add support for CPUID leaf 80000021, EAX. The majority of the features will be used in the kernel and thus a separate leaf is appropriate. Include KVM's reverse_cpuid entry because features are used by VM guests, too. [ bp: Massage commit message. ] Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Sean Christopherson <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-21x86/microcode: Check CPU capabilities after late microcode update correctlyAshok Raj1-13/+23
The kernel caches each CPU's feature bits at boot in an x86_capability[] structure. However, the capabilities in the BSP's copy can be turned off as a result of certain command line parameters or configuration restrictions, for example the SGX bit. This can cause a mismatch when comparing the values before and after the microcode update. Another example is X86_FEATURE_SRBDS_CTRL which gets added only after microcode update: --- cpuid.before 2023-01-21 14:54:15.652000747 +0100 +++ cpuid.after 2023-01-21 14:54:26.632001024 +0100 @@ -10,7 +10,7 @@ CPU: 0x00000004 0x04: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 0x00000005 0x00: eax=0x00000040 ebx=0x00000040 ecx=0x00000003 edx=0x11142120 0x00000006 0x00: eax=0x000027f7 ebx=0x00000002 ecx=0x00000001 edx=0x00000000 - 0x00000007 0x00: eax=0x00000000 ebx=0x029c6fbf ecx=0x40000000 edx=0xbc002400 + 0x00000007 0x00: eax=0x00000000 ebx=0x029c6fbf ecx=0x40000000 edx=0xbc002e00 ^^^ and which proves for a gazillionth time that late loading is a bad bad idea. microcode_check() is called after an update to report any previously cached CPUID bits which might have changed due to the update. Therefore, store the cached CPU caps before the update and compare them with the CPU caps after the microcode update has succeeded. Thus, the comparison is done between the CPUID *hardware* bits before and after the upgrade instead of using the cached, possibly runtime modified values in BSP's boot_cpu_data copy. As a result, false warnings about CPUID bits changes are avoided. [ bp: - Massage. - Add SRBDS_CTRL example. - Add kernel-doc. - Incorporate forgotten review feedback from dhansen. ] Fixes: 1008c52c09dc ("x86/CPU: Add a microcode loader callback") Signed-off-by: Ashok Raj <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-20x86/microcode: Add a parameter to microcode_check() to store CPU capabilitiesAshok Raj1-8/+13
Add a parameter to store CPU capabilities before performing a microcode update so that CPU capabilities can be compared before and after update. [ bp: Massage. ] Signed-off-by: Ashok Raj <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-13x86/cpu: Remove misleading commentJuergen Gross1-1/+1
The comment of the "#endif" after setup_disable_pku() is wrong. As the related #ifdef is only a few lines above, just remove the comment. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-01-13x86/gsseg: Use the LKGS instruction if available for load_gs_index()H. Peter Anvin (Intel)1-0/+1
The LKGS instruction atomically loads a segment descriptor into the %gs descriptor registers, *except* that %gs.base is unchanged, and the base is instead loaded into MSR_IA32_KERNEL_GS_BASE, which is exactly what we want this function to do. Signed-off-by: H. Peter Anvin (Intel) <[email protected]> Signed-off-by: Xin Li <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]>
2022-12-14Merge tag 'x86_core_for_v6.2' of ↵Linus Torvalds1-53/+44
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 core updates from Borislav Petkov: - Add the call depth tracking mitigation for Retbleed which has been long in the making. It is a lighterweight software-only fix for Skylake-based cores where enabling IBRS is a big hammer and causes a significant performance impact. What it basically does is, it aligns all kernel functions to 16 bytes boundary and adds a 16-byte padding before the function, objtool collects all functions' locations and when the mitigation gets applied, it patches a call accounting thunk which is used to track the call depth of the stack at any time. When that call depth reaches a magical, microarchitecture-specific value for the Return Stack Buffer, the code stuffs that RSB and avoids its underflow which could otherwise lead to the Intel variant of Retbleed. This software-only solution brings a lot of the lost performance back, as benchmarks suggest: https://lore.kernel.org/all/[email protected]/ That page above also contains a lot more detailed explanation of the whole mechanism - Implement a new control flow integrity scheme called FineIBT which is based on the software kCFI implementation and uses hardware IBT support where present to annotate and track indirect branches using a hash to validate them - Other misc fixes and cleanups * tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits) x86/paravirt: Use common macro for creating simple asm paravirt functions x86/paravirt: Remove clobber bitmask from .parainstructions x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit x86/Kconfig: Enable kernel IBT by default x86,pm: Force out-of-line memcpy() objtool: Fix weak hole vs prefix symbol objtool: Optimize elf_dirty_reloc_sym() x86/cfi: Add boot time hash randomization x86/cfi: Boot time selection of CFI scheme x86/ibt: Implement FineIBT objtool: Add --cfi to generate the .cfi_sites section x86: Add prefix symbols for function padding objtool: Add option to generate prefix symbols objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf objtool: Slice up elf_create_section_symbol() kallsyms: Revert "Take callthunks into account" x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces x86/retpoline: Fix crash printing warning x86/paravirt: Fix a !PARAVIRT build warning ...
2022-12-13Merge tag 'x86_cpu_for_v6.2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Borislav Petkov: - Split MTRR and PAT init code to accomodate at least Xen PV and TDX guests which do not get MTRRs exposed but only PAT. (TDX guests do not support the cache disabling dance when setting up MTRRs so they fall under the same category) This is a cleanup work to remove all the ugly workarounds for such guests and init things separately (Juergen Gross) - Add two new Intel CPUs to the list of CPUs with "normal" Energy Performance Bias, leading to power savings - Do not do bus master arbitration in C3 (ARB_DISABLE) on modern Centaur CPUs * tag 'x86_cpu_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) x86/mtrr: Make message for disabled MTRRs more descriptive x86/pat: Handle TDX guest PAT initialization x86/cpuid: Carve out all CPUID functionality x86/cpu: Switch to cpu_feature_enabled() for X86_FEATURE_XENPV x86/cpu: Remove X86_FEATURE_XENPV usage in setup_cpu_entry_area() x86/cpu: Drop 32-bit Xen PV guest code in update_task_stack() x86/cpu: Remove unneeded 64-bit dependency in arch_enter_from_user_mode() x86/cpufeatures: Add X86_FEATURE_XENPV to disabled-features.h x86/acpi/cstate: Optimize ARB_DISABLE on Centaur CPUs x86/mtrr: Simplify mtrr_ops initialization x86/cacheinfo: Switch cache_ap_init() to hotplug callback x86: Decouple PAT and MTRR handling x86/mtrr: Add a stop_machine() handler calling only cache_cpu_init() x86/mtrr: Let cache_aps_delayed_init replace mtrr_aps_delayed_init x86/mtrr: Get rid of __mtrr_enabled bool x86/mtrr: Simplify mtrr_bp_init() x86/mtrr: Remove set_all callback from struct mtrr_ops x86/mtrr: Disentangle MTRR init from PAT init x86/mtrr: Move cache control code to cacheinfo.c x86/mtrr: Split MTRR-specific handling from cache dis/enabling ...
2022-11-18stackprotector: move get_random_canary() into stackprotector.hJason A. Donenfeld1-1/+1
This has nothing to do with random.c and everything to do with stack protectors. Yes, it uses randomness. But many things use randomness. random.h and random.c are concerned with the generation of randomness, not with each and every use. So move this function into the more specific stackprotector.h file where it belongs. Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
2022-11-10x86/cacheinfo: Switch cache_ap_init() to hotplug callbackJuergen Gross1-1/+0
Instead of explicitly calling cache_ap_init() in identify_secondary_cpu() use a CPU hotplug callback instead. By registering the callback only after having started the non-boot CPUs and initializing cache_aps_delayed_init with "true", calling set_cache_aps_delayed_init() at boot time can be dropped. It should be noted that this change results in cache_ap_init() being called a little bit later when hotplugging CPUs. By using a new hotplug slot right at the start of the low level bringup this is not problematic, as no operations requiring a specific caching mode are performed that early in CPU initialization. Suggested-by: Borislav Petkov <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2022-11-10x86/mtrr: Add a stop_machine() handler calling only cache_cpu_init()Juergen Gross1-1/+2
Instead of having a stop_machine() handler for either a specific MTRR register or all state at once, add a handler just for calling cache_cpu_init() if appropriate. Add functions for calling stop_machine() with this handler as well. Add a generic replacement for mtrr_bp_restore() and a wrapper for mtrr_bp_init(). Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2022-11-01x86/ibt: Implement FineIBTPeter Zijlstra1-0/+1
Implement an alternative CFI scheme that merges both the fine-grained nature of kCFI but also takes full advantage of the coarse grained hardware CFI as provided by IBT. To contrast: kCFI is a pure software CFI scheme and relies on being able to read text -- specifically the instruction *before* the target symbol, and does the hash validation *before* doing the call (otherwise control flow is compromised already). FineIBT is a software and hardware hybrid scheme; by ensuring every branch target starts with a hash validation it is possible to place the hash validation after the branch. This has several advantages: o the (hash) load is avoided; no memop; no RX requirement. o IBT WAIT-FOR-ENDBR state is a speculation stop; by placing the hash validation in the immediate instruction after the branch target there is a minimal speculation window and the whole is a viable defence against SpectreBHB. o Kees feels obliged to mention it is slightly more vulnerable when the attacker can write code. Obviously this patch relies on kCFI, but additionally it also relies on the padding from the call-depth-tracking patches. It uses this padding to place the hash-validation while the call-sites are re-written to modify the indirect target to be 16 bytes in front of the original target, thus hitting this new preamble. Notably, there is no hardware that needs call-depth-tracking (Skylake) and supports IBT (Tigerlake and onwards). Suggested-by: Joao Moreira (Intel) <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/percpu: Move irq_stack variables next to current_taskThomas Gleixner1-3/+0
Further extend struct pcpu_hot with the hard and soft irq stack pointers. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/percpu: Move current_top_of_stack next to current_taskThomas Gleixner1-11/+1
Extend the struct pcpu_hot cacheline with current_top_of_stack; another very frequently used value. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/percpu: Move preempt_count next to current_taskThomas Gleixner1-7/+1
Add preempt_count to pcpu_hot, since it is once of the most used per-cpu variables. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86: Put hot per CPU variables into a structThomas Gleixner1-9/+5
The layout of per-cpu variables is at the mercy of the compiler. This can lead to random performance fluctuations from build to build. Create a structure to hold some of the hottest per-cpu variables, starting with current_task. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/cpu: Re-enable stackprotectorThomas Gleixner1-0/+3
Commit 5416c2663517 ("x86: make sure load_percpu_segment has no stackprotector") disabled the stackprotector for cpu/common.c because of load_percpu_segment(). Back then the boot stack canary was initialized very early in start_kernel(). Switching the per CPU area by loading the GDT caused the stackprotector to fail with paravirt enabled kernels as the GSBASE was not updated yet. In hindsight a wrong change because it would have been sufficient to ensure that the canary is the same in both per CPU areas. Commit d55535232c3d ("random: move rand_initialize() earlier") moved the stack canary initialization to a later point in the init sequence. As a consequence the per CPU stack canary is 0 when switching the per CPU areas, so there is no requirement anymore to exclude this file. Add a comment to load_percpu_segment(). Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/cpu: Get rid of redundant switch_to_new_gdt() invocationsThomas Gleixner1-11/+6
The only place where switch_to_new_gdt() is required is early boot to switch from the early GDT to the direct GDT. Any other invocation is completely redundant because it does not change anything. Secondary CPUs come out of the ASM code with GDT and GSBASE correctly set up. The same is true for XEN_PV. Remove all the voodoo invocations which are left overs from the ancient past, rename the function to switch_gdt_and_percpu_base() and mark it init. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17x86/cpu: Remove segment load from switch_to_new_gdt()Thomas Gleixner1-16/+31
On 32bit FS and on 64bit GS segments are already set up correctly, but load_percpu_segment() still sets [FG]S after switching from the early GDT to the direct GDT. For 32bit the segment load has no side effects, but on 64bit it causes GSBASE to become 0, which means that any per CPU access before GSBASE is set to the new value is going to fault. That's the reason why the whole file containing this code has stackprotector removed. But that's a pointless exercise for both 32 and 64 bit as the relevant segment selector is already correct. Loading the new GDT does not change that. Remove the segment loads and add comments. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-08-18x86/bugs: Add "unknown" reporting for MMIO Stale DataPawan Gupta1-15/+27
Older Intel CPUs that are not in the affected processor list for MMIO Stale Data vulnerabilities currently report "Not affected" in sysfs, which may not be correct. Vulnerability status for these older CPUs is unknown. Add known-not-affected CPUs to the whitelist. Report "unknown" mitigation status for CPUs that are not in blacklist, whitelist and also don't enumerate MSR ARCH_CAPABILITIES bits that reflect hardware immunity to MMIO Stale Data vulnerabilities. Mitigation is not deployed when the status is unknown. [ bp: Massage, fixup. ] Fixes: 8d50cdf8b834 ("x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data") Suggested-by: Andrew Cooper <[email protected]> Suggested-by: Tony Luck <[email protected]> Signed-off-by: Pawan Gupta <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/a932c154772f2121794a5f2eded1a11013114711.1657846269.git.pawan.kumar.gupta@linux.intel.com
2022-08-03x86/speculation: Add RSB VM Exit protectionsDaniel Sneddon1-2/+10
tl;dr: The Enhanced IBRS mitigation for Spectre v2 does not work as documented for RET instructions after VM exits. Mitigate it with a new one-entry RSB stuffing mechanism and a new LFENCE. == Background == Indirect Branch Restricted Speculation (IBRS) was designed to help mitigate Branch Target Injection and Speculative Store Bypass, i.e. Spectre, attacks. IBRS prevents software run in less privileged modes from affecting branch prediction in more privileged modes. IBRS requires the MSR to be written on every privilege level change. To overcome some of the performance issues of IBRS, Enhanced IBRS was introduced. eIBRS is an "always on" IBRS, in other words, just turn it on once instead of writing the MSR on every privilege level change. When eIBRS is enabled, more privileged modes should be protected from less privileged modes, including protecting VMMs from guests. == Problem == Here's a simplification of how guests are run on Linux' KVM: void run_kvm_guest(void) { // Prepare to run guest VMRESUME(); // Clean up after guest runs } The execution flow for that would look something like this to the processor: 1. Host-side: call run_kvm_guest() 2. Host-side: VMRESUME 3. Guest runs, does "CALL guest_function" 4. VM exit, host runs again 5. Host might make some "cleanup" function calls 6. Host-side: RET from run_kvm_guest() Now, when back on the host, there are a couple of possible scenarios of post-guest activity the host needs to do before executing host code: * on pre-eIBRS hardware (legacy IBRS, or nothing at all), the RSB is not touched and Linux has to do a 32-entry stuffing. * on eIBRS hardware, VM exit with IBRS enabled, or restoring the host IBRS=1 shortly after VM exit, has a documented side effect of flushing the RSB except in this PBRSB situation where the software needs to stuff the last RSB entry "by hand". IOW, with eIBRS supported, host RET instructions should no longer be influenced by guest behavior after the host retires a single CALL instruction. However, if the RET instructions are "unbalanced" with CALLs after a VM exit as is the RET in #6, it might speculatively use the address for the instruction after the CALL in #3 as an RSB prediction. This is a problem since the (untrusted) guest controls this address. Balanced CALL/RET instruction pairs such as in step #5 are not affected. == Solution == The PBRSB issue affects a wide variety of Intel processors which support eIBRS. But not all of them need mitigation. Today, X86_FEATURE_RSB_VMEXIT triggers an RSB filling sequence that mitigates PBRSB. Systems setting RSB_VMEXIT need no further mitigation - i.e., eIBRS systems which enable legacy IBRS explicitly. However, such systems (X86_FEATURE_IBRS_ENHANCED) do not set RSB_VMEXIT and most of them need a new mitigation. Therefore, introduce a new feature flag X86_FEATURE_RSB_VMEXIT_LITE which triggers a lighter-weight PBRSB mitigation versus RSB_VMEXIT. The lighter-weight mitigation performs a CALL instruction which is immediately followed by a speculative execution barrier (INT3). This steers speculative execution to the barrier -- just like a retpoline -- which ensures that speculation can never reach an unbalanced RET. Then, ensure this CALL is retired before continuing execution with an LFENCE. In other words, the window of exposure is opened at VM exit where RET behavior is troublesome. While the window is open, force RSB predictions sampling for RET targets to a dead end at the INT3. Close the window with the LFENCE. There is a subset of eIBRS systems which are not vulnerable to PBRSB. Add these systems to the cpu_vuln_whitelist[] as NO_EIBRS_PBRSB. Future systems that aren't vulnerable will set ARCH_CAP_PBRSB_NO. [ bp: Massage, incorporate review comments from Andy Cooper. ] Signed-off-by: Daniel Sneddon <[email protected]> Co-developed-by: Pawan Gupta <[email protected]> Signed-off-by: Pawan Gupta <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-07-07x86/bugs: Add Cannon lake to RETBleed affected CPU listPawan Gupta1-0/+1
Cannon lake is also affected by RETBleed, add it to the list. Fixes: 6ad0ad2bf8a6 ("x86/bugs: Report Intel retbleed vulnerability") Signed-off-by: Pawan Gupta <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-06-27x86/cpu/amd: Enumerate BTC_NOAndrew Cooper1-2/+4
BTC_NO indicates that hardware is not susceptible to Branch Type Confusion. Zen3 CPUs don't suffer BTC. Hypervisors are expected to synthesise BTC_NO when it is appropriate given the migration pool, to prevent kernels using heuristics. [ bp: Massage. ] Signed-off-by: Andrew Cooper <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-06-27x86/common: Stamp out the stepping madnessPeter Zijlstra1-21/+16
The whole MMIO/RETBLEED enumeration went overboard on steppings. Get rid of all that and simply use ANY. If a future stepping of these models would not be affected, it had better set the relevant ARCH_CAP_$FOO_NO bit in IA32_ARCH_CAPABILITIES. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Dave Hansen <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-06-27x86/bugs: Report Intel retbleed vulnerabilityPeter Zijlstra1-12/+12
Skylake suffers from RSB underflow speculation issues; report this vulnerability and it's mitigation (spectre_v2=ibrs). [jpoimboe: cleanups, eibrs] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-06-27x86/bugs: Report AMD retbleed vulnerabilityAlexandre Chartre1-0/+19
Report that AMD x86 CPUs are vulnerable to the RETBleed (Arbitrary Speculative Code Execution with Return Instructions) attack. [peterz: add hygon] [kim: invert parity; fam15h] Co-developed-by: Kim Phillips <[email protected]> Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Alexandre Chartre <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-06-14Merge tag 'x86-bugs-2022-06-01' of ↵Linus Torvalds1-3/+49
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 MMIO stale data fixes from Thomas Gleixner: "Yet another hw vulnerability with a software mitigation: Processor MMIO Stale Data. They are a class of MMIO-related weaknesses which can expose stale data by propagating it into core fill buffers. Data which can then be leaked using the usual speculative execution methods. Mitigations include this set along with microcode updates and are similar to MDS and TAA vulnerabilities: VERW now clears those buffers too" * tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation/mmio: Print SMT warning KVM: x86/speculation: Disable Fill buffer clear within guests x86/speculation/mmio: Reuse SRBDS mitigation for SBDS x86/speculation/srbds: Update SRBDS mitigation selection x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data x86/speculation/mmio: Enable CPU Fill buffer clearing on idle x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data x86/speculation: Add a common function for MD_CLEAR mitigation update x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug Documentation: Add documentation for Processor MMIO Stale Data
2022-05-31x86/microcode: Default-disable late loadingBorislav Petkov1-0/+2
It is dangerous and it should not be used anyway - there's a nice early loading already. Requested-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-05-23Merge tag 'x86_cpu_for_v5.19_rc1' of ↵Linus Torvalds1-48/+53
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU feature updates from Borislav Petkov: - Remove a bunch of chicken bit options to turn off CPU features which are not really needed anymore - Misc fixes and cleanups * tag 'x86_cpu_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/speculation: Add missing prototype for unpriv_ebpf_notify() x86/pm: Fix false positive kmemleak report in msr_build_context() x86/speculation/srbds: Do not try to turn mitigation off when not supported x86/cpu: Remove "noclflush" x86/cpu: Remove "noexec" x86/cpu: Remove "nosmep" x86/cpu: Remove CONFIG_X86_SMAP and "nosmap" x86/cpu: Remove "nosep" x86/cpu: Allow feature bit names from /proc/cpuinfo in clearcpuid=
2022-05-23Merge tag 'x86_sev_for_v5.19_rc1' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull AMD SEV-SNP support from Borislav Petkov: "The third AMD confidential computing feature called Secure Nested Paging. Add to confidential guests the necessary memory integrity protection against malicious hypervisor-based attacks like data replay, memory remapping and others, thus achieving a stronger isolation from the hypervisor. At the core of the functionality is a new structure called a reverse map table (RMP) with which the guest has a say in which pages get assigned to it and gets notified when a page which it owns, gets accessed/modified under the covers so that the guest can take an appropriate action. In addition, add support for the whole machinery needed to launch a SNP guest, details of which is properly explained in each patch. And last but not least, the series refactors and improves parts of the previous SEV support so that the new code is accomodated properly and not just bolted on" * tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) x86/entry: Fixup objtool/ibt validation x86/sev: Mark the code returning to user space as syscall gap x86/sev: Annotate stack change in the #VC handler x86/sev: Remove duplicated assignment to variable info x86/sev: Fix address space sparse warning x86/sev: Get the AP jump table address from secrets page x86/sev: Add missing __init annotations to SEV init routines virt: sevguest: Rename the sevguest dir and files to sev-guest virt: sevguest: Change driver name to reflect generic SEV support x86/boot: Put globals that are accessed early into the .data section x86/boot: Add an efi.h header for the decompressor virt: sevguest: Fix bool function returning negative value virt: sevguest: Fix return value check in alloc_shared_pages() x86/sev-es: Replace open-coded hlt-loop with sev_es_terminate() virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement virt: sevguest: Add support to get extended report virt: sevguest: Add support to derive key virt: Add SEV-SNP guest driver x86/sev: Register SEV-SNP guest request platform device x86/sev: Provide support for SNP guest request NAEs ...
2022-05-21x86/speculation/mmio: Reuse SRBDS mitigation for SBDSPawan Gupta1-7/+14
The Shared Buffers Data Sampling (SBDS) variant of Processor MMIO Stale Data vulnerabilities may expose RDRAND, RDSEED and SGX EGETKEY data. Mitigation for this is added by a microcode update. As some of the implications of SBDS are similar to SRBDS, SRBDS mitigation infrastructure can be leveraged by SBDS. Set X86_BUG_SRBDS and use SRBDS mitigation. Mitigation is enabled by default; use srbds=off to opt-out. Mitigation status can be checked from below file: /sys/devices/system/cpu/vulnerabilities/srbds Signed-off-by: Pawan Gupta <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-05-21x86/speculation/mmio: Enumerate Processor MMIO Stale Data bugPawan Gupta1-2/+41
Processor MMIO Stale Data is a class of vulnerabilities that may expose data after an MMIO operation. For more details please refer to Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst Add the Processor MMIO Stale Data bug enumeration. A microcode update adds new bits to the MSR IA32_ARCH_CAPABILITIES, define them. Signed-off-by: Pawan Gupta <[email protected]> Signed-off-by: Borislav Petkov <[email protected]>
2022-04-11x86/tsx: Disable TSX development mode at bootPawan Gupta1-0/+2
A microcode update on some Intel processors causes all TSX transactions to always abort by default[*]. Microcode also added functionality to re-enable TSX for development purposes. With this microcode loaded, if tsx=on was passed on the cmdline, and TSX development mode was already enabled before the kernel boot, it may make the system vulnerable to TSX Asynchronous Abort (TAA). To be on safer side, unconditionally disable TSX development mode during boot. If a viable use case appears, this can be revisited later. [*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557 [ bp: Drop unstable web link, massage heavily. ] Suggested-by: Andrew Cooper <[email protected]> Suggested-by: Borislav Petkov <[email protected]> Signed-off-by: Pawan Gupta <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Neelima Krishnan <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com
2022-04-06x86/sev: Register GHCB memory when SEV-SNP is activeBrijesh Singh1-0/+4
The SEV-SNP guest is required by the GHCB spec to register the GHCB's Guest Physical Address (GPA). This is because the hypervisor may prefer that a guest uses a consistent and/or specific GPA for the GHCB associated with a vCPU. For more information, see the GHCB specification section "GHCB GPA Registration". [ bp: Cleanup comments. ] Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04x86/cpu: Remove "noclflush"Borislav Petkov1-8/+0
Not really needed anymore and there's clearcpuid=. Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04x86/cpu: Remove "nosmep"Borislav Petkov1-7/+0
There should be no need to disable SMEP anymore. Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Lai Jiangshan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04x86/cpu: Remove CONFIG_X86_SMAP and "nosmap"Borislav Petkov1-14/+1
Those were added as part of the SMAP enablement but SMAP is currently an integral part of kernel proper and there's no need to disable it anymore. Rip out that functionality. Leave --uaccess default on for objtool as this is what objtool should do by default anyway. If still needed - clearcpuid=smap. Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Lai Jiangshan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04x86/cpu: Remove "nosep"Borislav Petkov1-7/+0
That chicken bit was added by 4f88651125e2 ("[PATCH] i386: allow disabling X86_FEATURE_SEP at boot") but measuring int80 vsyscall performance on 32-bit doesn't matter anymore. If still needed, one can boot with clearcpuid=sep to disable that feature for testing. Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04x86/cpu: Allow feature bit names from /proc/cpuinfo in clearcpuid=Borislav Petkov1-12/+52
Having to give the X86_FEATURE array indices in order to disable a feature bit for testing is not really user-friendly. So accept the feature bit names too. Some feature bits don't have names so there the array indices are still accepted, of course. Clearing CPUID flags is not something which should be done in production so taint the kernel too. An exemplary cmdline would then be something like: clearcpuid=de,440,smca,succory,bmi1,3dnow ("succory" is wrong on purpose). And it says: [ ... ] Clearing CPUID bits: de 13:24 smca (unknown: succory) bmi1 3dnow [ Fix CONFIG_X86_FEATURE_NAMES=n build error as reported by the 0day robot: https://lore.kernel.org/r/[email protected] ] Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15Merge branch 'x86/cpu' into x86/core, to resolve conflictsIngo Molnar1-0/+79
Conflicts: arch/x86/include/asm/cpufeatures.h Signed-off-by: Ingo Molnar <[email protected]>
2022-03-15x86/ibt: Disable IBT around firmwarePeter Zijlstra1-0/+28
Assume firmware isn't IBT clean and disable it across calls. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]