path: root/arch/x86/include/asm
Age | Commit message | Author | Files | Lines
2022-03-15x86,objtool: Move the ASM_REACHABLE annotation to objtool.hPeter Zijlstra2-0/+2
Because we need a variant for .S files too. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-03-15x86: Annotate call_on_stack()Peter Zijlstra1-1/+2
vmlinux.o: warning: objtool: page_fault_oops()+0x13c: unreachable instruction

  0000 000000000005b460 <page_fault_oops>:
  ...
  0128 5b588: 49 89 23             mov    %rsp,(%r11)
  012b 5b58b: 4c 89 dc             mov    %r11,%rsp
  012e 5b58e: 4c 89 f2             mov    %r14,%rdx
  0131 5b591: 48 89 ee             mov    %rbp,%rsi
  0134 5b594: 4c 89 e7             mov    %r12,%rdi
  0137 5b597: e8 00 00 00 00       call   5b59c <page_fault_oops+0x13c>
              5b598: R_X86_64_PLT32 handle_stack_overflow-0x4
  013c 5b59c: 5c                   pop    %rsp

vmlinux.o: warning: objtool: sysvec_reboot()+0x6d: unreachable instruction

  0000 00000000000033f0 <sysvec_reboot>:
  ...
  005d 344d: 4c 89 dc              mov    %r11,%rsp
  0060 3450: e8 00 00 00 00        call   3455 <sysvec_reboot+0x65>
             3451: R_X86_64_PLT32 irq_enter_rcu-0x4
  0065 3455: 48 89 ef              mov    %rbp,%rdi
  0068 3458: e8 00 00 00 00        call   345d <sysvec_reboot+0x6d>
             3459: R_X86_64_PC32 .text+0x47d0c
  006d 345d: e8 00 00 00 00        call   3462 <sysvec_reboot+0x72>
             345e: R_X86_64_PLT32 irq_exit_rcu-0x4
  0072 3462: 5c                    pop    %rsp

Both cases are due to a call_on_stack() calling a __noreturn function. Since that's an inline asm, GCC can't do anything about the instructions after the CALL. Therefore put in an explicit ASM_REACHABLE annotation to make sure objtool and gcc are consistently confused about control flow.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
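For illustration, a minimal sketch of the pattern being annotated (this is not the real call_on_stack()/irq_stack.h code; the callee name and the stack handoff are simplified assumptions):

    #include <linux/objtool.h>

    /* Hypothetical stand-in for a __noreturn callee such as
     * handle_stack_overflow(); name and argument handling are assumptions.
     */
    extern void oops_on_new_stack(void) __attribute__((noreturn));

    static void switch_stack_and_oops(void *stack)
    {
    	/* Switch to @stack, call the __noreturn function, switch back.
    	 * GCC still emits the trailing instructions because it cannot see
    	 * through the asm; ASM_REACHABLE keeps objtool from flagging them
    	 * as unreachable.
    	 */
    	asm volatile("mov  %%rsp, (%[sp])    \n\t"
    		     "mov  %[sp], %%rsp      \n\t"
    		     "call oops_on_new_stack \n\t"
    		     ASM_REACHABLE
    		     "pop  %%rsp             \n\t"
    		     : : [sp] "r" (stack) : "memory");
    }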
2022-03-15x86: Mark stop_this_cpu() __noreturnPeter Zijlstra1-1/+1
vmlinux.o: warning: objtool: smp_stop_nmi_callback()+0x2b: unreachable instruction

  0000 0000000000047cf0 <smp_stop_nmi_callback>:
  ...
  0026 47d16: e8 00 00 00 00       call   47d1b <smp_stop_nmi_callback+0x2b>
              47d17: R_X86_64_PLT32 stop_this_cpu-0x4
  002b 47d1b: b8 01 00 00 00       mov    $0x1,%eax

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt: Dont generate ENDBR in .discard.textPeter Zijlstra1-1/+2
Having ENDBR in discarded sections can easily lead to relocations into discarded sections which the linkers aren't really fond of. Objtool also shouldn't generate them, but why tempt fate. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt: Disable IBT around firmwarePeter Zijlstra2-2/+13
Assume firmware isn't IBT clean and disable it across calls. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
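The save/restore pair described above looks roughly like this sketch (helper and constant names are assumed to match the patch; the real definitions live in asm/ibt.h):

    static inline u64 ibt_save(void)
    {
    	u64 msr = 0;

    	if (cpu_feature_enabled(X86_FEATURE_IBT)) {
    		rdmsrl(MSR_IA32_S_CET, msr);
    		/* Clear ENDBR enforcement while firmware code runs. */
    		wrmsrl(MSR_IA32_S_CET, msr & ~CET_ENDBR_EN);
    	}
    	return msr;
    }

    static inline void ibt_restore(u64 save)
    {
    	if (cpu_feature_enabled(X86_FEATURE_IBT))
    		wrmsrl(MSR_IA32_S_CET, save);
    }

A firmware call site would then bracket the call with u64 ibt = ibt_save(); ... ibt_restore(ibt);.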
2022-03-15x86/ibt,kexec: Disable CET on kexecPeter Zijlstra1-0/+3
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt: Add IBT feature, MSR and #CP handlingPeter Zijlstra5-1/+28
The bits required to make the hardware go.. Of note is that, provided the syscall entry points are covered with ENDBR, #CP doesn't need to be an IST because we'll never hit the syscall gap. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt,paravirt: Sprinkle ENDBRPeter Zijlstra2-0/+4
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/linkage: Add ENDBR to SYM_FUNC_START*()Peter Zijlstra1-0/+31
Ensure the ASM functions have ENDBR on for IBT builds; this follows the ARM64 example. Unlike ARM64, we'll likely end up overwriting them with poison. Suggested-by: Mark Rutland <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
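Roughly, the linkage change amounts to the following (a sketch; the exact macro layering in arch/x86/include/asm/linkage.h may differ):

    /* Emit an ENDBR landing pad at the start of every SYM_FUNC_START*
     * function so that indirect transfers into assembly functions pass
     * IBT checks.
     */
    #define SYM_FUNC_START(name)				\
    	SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN)	\
    	ENDBR

    #define SYM_FUNC_START_NOALIGN(name)			\
    	SYM_START(name, SYM_L_GLOBAL, SYM_A_NONE)	\
    	ENDBR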
2022-03-15x86/ibt,entry: Sprinkle ENDBR dustPeter Zijlstra2-10/+13
Kernel entry points should have ENDBR for IBT configs. The SYSCALL entry points are found through taking their respective address in order to program them in the MSRs, while the exception entry points are found through UNWIND_HINT_IRET_REGS. The rule is that any UNWIND_HINT_IRET_REGS at sym+0 should have an ENDBR; see the later objtool ibt validation patch. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt,xen: Sprinkle the ENDBRPeter Zijlstra1-1/+1
Even though Xen currently doesn't advertise IBT, prepare for when it will eventually do so and sprinkle the ENDBR dust accordingly. Even though most of the entry points are IRET like, the CPL0 Hypervisor can set WAIT-FOR-ENDBR and demand ENDBR at these sites. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()Peter Zijlstra2-6/+0
By doing an early rewrite of 'jmp native_iret` in restore_regs_and_return_to_kernel() we can get rid of the last INTERRUPT_RETURN user and paravirt_iret. Suggested-by: Andrew Cooper <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()Peter Zijlstra1-6/+14
Less duplication is more better. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBRPeter Zijlstra1-1/+9
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-15x86/ibt: Base IBT bitsPeter Zijlstra1-0/+87
Add Kconfig, Makefile and basic instruction support for x86 IBT. (Ab)use __DISABLE_EXPORTS to disable IBT since it's already employed to mark compressed and purgatory. Additionally mark realmode with it as well to avoid inserting ENDBR instructions there. While ENDBR is technically a NOP, inserting them was causing some grief due to code growth. There's also a problem with using __noendbr in code compiled without -fcf-protection=branch. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lore.kernel.org/r/[email protected]
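The basic building blocks look roughly like this sketch of asm/ibt.h (exact spellings are assumptions):

    #ifdef CONFIG_X86_KERNEL_IBT
    /* Mark functions that must not get an ENDBR landing pad and therefore
     * must not be reached through an indirect branch.
     */
    #define __noendbr	__attribute__((nocf_check))
    /* The 4-byte landing-pad instruction, for use in (inline) asm. */
    #define ASM_ENDBR	"endbr64\n\t"
    #else
    #define __noendbr
    #define ASM_ENDBR
    #endif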
2022-03-15Merge tag 'v5.17-rc8' into sched/core, to pick up fixesIngo Molnar3-9/+10
Signed-off-by: Ingo Molnar <[email protected]>
2022-03-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-8/+10
net/dsa/dsa2.c
  commit afb3cc1a397d ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails")
  commit e83d56537859 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master")
  https://lore.kernel.org/all/[email protected]/

drivers/net/ethernet/intel/ice/ice.h
  commit 97b0129146b1 ("ice: Fix error with handling of bonding MTU")
  commit 43113ff73453 ("ice: add TTY for GNSS module for E810T device")
  https://lore.kernel.org/all/[email protected]/

drivers/staging/gdm724x/gdm_lte.c
  commit fc7f750dc9d1 ("staging: gdm724x: fix use after free in gdm_lte_rx()")
  commit 4bcc4249b4cf ("staging: Use netif_rx().")
  https://lore.kernel.org/all/[email protected]/

Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-10x86, ACPI: rename init_freq_invariance_cppc() to arch_init_invariance_cppc()Ionela Voinescu1-1/+1
init_freq_invariance_cppc() was called in acpi_cppc_processor_probe(), after CPU performance information and controls were populated from the per-cpu _CPC objects. But these _CPC objects provide information that helps with both CPU (u-arch) and frequency invariance. Therefore, change the function name to a more generic one, while adding the arch_ prefix, as this function is expected to be defined differently by different architectures. Signed-off-by: Ionela Voinescu <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Tested-by: Valentin Schneider <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
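On x86 the renamed hook can simply map onto the existing setup function; a sketch of the wiring (assumed to live in asm/topology.h):

    #ifdef CONFIG_ACPI_CPPC_LIB
    void init_freq_invariance_cppc(void);
    /* The generic ACPI CPPC code calls arch_init_invariance_cppc() after
     * parsing the per-CPU _CPC objects; other architectures provide their
     * own definition (or none at all).
     */
    #define arch_init_invariance_cppc init_freq_invariance_cppc
    #endif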
2022-03-08x86/ACPI: CPPC: Move init_freq_invariance_cppc() into x86 CPPCHuang Rui1-3/+1
The init_freq_invariance_cppc code doesn't actually need the SMP functionality, so using CONFIG_SMP as its check condition only invites confusion about what CPPC requires. The x86 CPPC file is also a better place to keep the CPPC-related functions; with init_freq_invariance_cppc moved out of smpboot, CONFIG_SMP is no longer a mandatory condition, which makes the code clearer than before. Signed-off-by: Huang Rui <[email protected]> [ rjw: Subject adjustment ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-03-08x86: Expose init_freq_invariance() to topology headerHuang Rui1-0/+4
The init_freq_invariance() function will be used by the x86 CPPC code, so expose it in the topology header. Signed-off-by: Huang Rui <[email protected]> [ rjw: Subject adjustment ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-03-08x86/ACPI: CPPC: Move AMD maximum frequency ratio setting function into x86 CPPCHuang Rui1-0/+9
The AMD maximum frequency ratio setting function depends on CPPC, so the x86 CPPC implementation file is a better place for this function. Signed-off-by: Huang Rui <[email protected]> [ rjw: Subject adjustment ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-03-08KVM: SVM: Allow AVIC support on system w/ physical APIC ID > 255Suravee Suthikulpanit1-1/+1
Expand KVM's mask for the AVIC host physical ID to the full 12 bits defined by the architecture. The number of bits consumed by hardware is model specific, e.g. early CPUs ignored bits 11:8, but there is no way for KVM to enumerate the "true" size. So, KVM must allow using all bits, else it risks rejecting completely legal x2APIC IDs on newer CPUs. This means KVM relies on hardware to not assign x2APIC IDs that exceed the "true" width of the field, but presumably hardware is smart enough to tie the width to the max x2APIC ID. KVM also relies on hardware to support at least 8 bits, as the legacy xAPIC ID is writable by software. But, those assumptions are unavoidable due to the lack of any way to enumerate the "true" width. Cc: [email protected] Cc: Maxim Levitsky <[email protected]> Suggested-by: Sean Christopherson <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Fixes: 44a95dae1d22 ("KVM: x86: Detect and Initialize AVIC support") Signed-off-by: Suravee Suthikulpanit <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
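The widened mask amounts to something like the following (the GENMASK_ULL spelling is an assumption about how it appears in svm.h):

    /* Was 0xFF (8 bits); the architecture defines 12 bits for the host
     * physical APIC ID field of the AVIC physical ID table entry.
     */
    #define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK	GENMASK_ULL(11, 0)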
2022-03-08KVM: x86/mmu: Zap invalidated roots via asynchronous workerPaolo Bonzini1-0/+2
Use the system worker threads to zap the roots invalidated by the TDP MMU's "fast zap" mechanism, implemented by kvm_tdp_mmu_invalidate_all_roots(). At this point, apart from allowing some parallelism in the zapping of roots, the workqueue is a glorified linked list: work items are added and flushed entirely within a single kvm->slots_lock critical section. However, the workqueue fixes a latent issue where kvm_mmu_zap_all_invalidated_roots() assumes that it owns a reference to all invalid roots; therefore, no one can set the invalid bit outside kvm_mmu_zap_all_fast(). Putting the invalidated roots on a linked list... erm, on a workqueue ensures that tdp_mmu_zap_root_work() only puts back those extra references that kvm_mmu_zap_all_invalidated_roots() had gifted to it. Signed-off-by: Paolo Bonzini <[email protected]>
2022-03-07Merge tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds2-8/+10
Pull x86 spectre fixes from Borislav Petkov:

 - Mitigate Spectre v2-type Branch History Buffer attacks on machines which
   support eIBRS, i.e., the hardware-assisted speculation restriction after it
   has been shown that such machines are vulnerable even with the hardware
   mitigation.

 - Do not use the default LFENCE-based Spectre v2 mitigation on AMD as it is
   insufficient to mitigate such attacks. Instead, switch to retpolines on all
   AMD by default.

 - Update the docs and add some warnings for the obviously vulnerable cmdline
   configurations.

* tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
  x86/speculation: Warn about Spectre v2 LFENCE mitigation
  x86/speculation: Update link to AMD speculation whitepaper
  x86/speculation: Use generic retpoline by default on AMD
  x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
  Documentation/hw-vuln: Update spectre doc
  x86/speculation: Add eIBRS + Retpoline options
  x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
2022-03-04Merge branch 'kvm-bugfixes' into HEADPaolo Bonzini1-1/+0
Merge bugfixes from 5.17 before merging more tricky work.
2022-03-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-1/+0
net/batman-adv/hard-interface.c
  commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message")
  https://lore.kernel.org/all/[email protected]/

net/smc/af_smc.c
  commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa35 ("net/smc: add sysctl interface for SMC")
  https://lore.kernel.org/all/[email protected]/

Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-02platform/x86: Add AMD system management interfaceSuma Hegde1-0/+16
The recent Fam19h EPYC server line of processors from AMD supports system management functionality via the HSMP (Host System Management Port) interface. The Host System Management Port (HSMP) is an interface to provide OS-level software with access to system management functions via a set of mailbox registers. More details on the interface can be found in chapter "7 Host System Management Port (HSMP)" of the following PPR: https://www.amd.com/system/files/TechDocs/55898_B1_pub_0.50.zip This patch adds a new amd_hsmp module under drivers/platform/x86/ which creates a miscdevice with an IOCTL interface to user space. /dev/hsmp is for running the HSMP mailbox commands. Signed-off-by: Suma Hegde <[email protected]> Signed-off-by: Naveen Krishna Chatradhi <[email protected]> Reviewed-by: Carlos Bilbao <[email protected]> Acked-by: Song Liu <[email protected]> Reviewed-by: Nathan Fontenot <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
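A hypothetical user-space sketch of driving /dev/hsmp; the struct fields and constant names are assumed to follow the uapi header asm/amd_hsmp.h and the PPR, so check both before relying on them:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <asm/amd_hsmp.h>

    int hsmp_send_test_message(void)
    {
    	struct hsmp_message msg = { 0 };
    	int fd = open("/dev/hsmp", O_RDWR);

    	if (fd < 0)
    		return -1;

    	msg.msg_id = HSMP_TEST;		/* assumed mailbox command ID */
    	msg.num_args = 1;
    	msg.args[0] = 1;
    	msg.response_sz = 1;

    	/* Run one mailbox command; the driver writes the response back
    	 * into the message structure on success.
    	 */
    	if (ioctl(fd, HSMP_IOCTL_CMD, &msg) < 0) {
    		close(fd);
    		return -1;
    	}
    	close(fd);
    	return 0;
    }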
2022-03-01KVM: x86/mmu: Zap only obsolete roots if a root shadow page is zappedSean Christopherson1-0/+2
Zap only obsolete roots when responding to zapping a single root shadow page. Because KVM keeps root_count elevated when stuffing a previous root into its PGD cache, shadowing a 64-bit guest means that zapping any root causes all vCPUs to reload all roots, even if their current root is not affected by the zap. For many kernels, zapping a single root is a frequent operation, e.g. in Linux it happens whenever an mm is dropped, e.g. process exits, etc... Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-02-25KVM: x86/mmu: do not pass vcpu to root freeing functionsPaolo Bonzini1-2/+2
These functions only operate on a given MMU, of which there is more than one in a vCPU (we care about two, because the third does not have any roots and is only used to walk guest page tables). They do need a struct kvm in order to lock the mmu_lock, but they do not needed anything else in the struct kvm_vcpu. So, pass the vcpu->kvm directly to them. Signed-off-by: Paolo Bonzini <[email protected]>
2022-02-25KVM: x86: use struct kvm_mmu_root_info for mmu->rootPaolo Bonzini1-2/+1
The root_hpa and root_pgd fields form essentially a struct kvm_mmu_root_info. Use the struct to have more consistency between mmu->root and mmu->prev_roots. The patch is entirely search and replace except for cached_root_available, which does not need a temporary struct kvm_mmu_root_info anymore. Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
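The consolidated descriptor is small; roughly, per the commit message (declared in asm/kvm_host.h):

    struct kvm_mmu_root_info {
    	gpa_t pgd;
    	hpa_t hpa;
    };

    /* In struct kvm_mmu, root_hpa and root_pgd collapse into: */
    struct kvm_mmu_root_info root;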
2022-02-25KVM: x86: Provide per VM capability for disabling PMU virtualizationDavid Dunn1-0/+1
Add a new capability, KVM_CAP_PMU_CAPABILITY, that takes a bitmask of settings/features to allow userspace to configure PMU virtualization on a per-VM basis. For now, support a single flag, KVM_PMU_CAP_DISABLE, to allow disabling PMU virtualization for a VM even when KVM is configured with enable_pmu=true at the module level. To keep KVM simple, disallow changing a VM's PMU configuration after vCPUs have been created. Signed-off-by: David Dunn <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
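A sketch of how a VMM might use it (vm_fd is assumed to be an already-created VM file descriptor):

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    static int disable_vm_pmu(int vm_fd)
    {
    	struct kvm_enable_cap cap = {
    		.cap  = KVM_CAP_PMU_CAPABILITY,
    		.args = { KVM_PMU_CAP_DISABLE },
    	};

    	/* Must be issued before any vCPU is created. */
    	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
    }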
2022-02-25KVM: x86: Fix pointer mistmatch warning when patching RET0 static callsSean Christopherson1-2/+2
Cast kvm_x86_ops.func to 'void *' when updating KVM static calls that are conditionally patched to __static_call_return0(). clang complains about using mismatching pointers in the ternary operator, which breaks the build when compiling with CONFIG_KVM_WERROR=y. >> arch/x86/include/asm/kvm-x86-ops.h:82:1: warning: pointer type mismatch ('bool (*)(struct kvm_vcpu *)' and 'void *') [-Wpointer-type-mismatch] Fixes: 5be2226f417d ("KVM: x86: allow defining return-0 static calls") Reported-by: Like Xu <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: David Dunn <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-02-25uaccess: generalize access_ok()Arnd Bergmann1-12/+2
There are many different ways that access_ok() is defined across architectures, but in the end, they all just compare against the user_addr_max() value or they accept anything. Provide one definition that works for most architectures, checking against TASK_SIZE_MAX for user processes or skipping the check inside of uaccess_kernel() sections. For architectures without CONFIG_SET_FS(), this should be the fastest check, as it comes down to a single comparison of a pointer against a compile-time constant, while the architecture specific versions tend to do something more complex for historic reasons or get something wrong. Type checking for __user annotations is handled inconsistently across architectures, but this is easily simplified as well by using an inline function that takes a 'const void __user *' argument. A handful of callers need an extra __user annotation for this. Some architectures had a trick of using 33-bit or 65-bit arithmetic on the addresses to calculate the overflow; however, this simpler version uses fewer registers, which means it can produce better object code in the end despite needing a second (statically predicted) branch. Reviewed-by: Christoph Hellwig <[email protected]> Acked-by: Mark Rutland <[email protected]> [arm64, asm-generic] Acked-by: Geert Uytterhoeven <[email protected]> Acked-by: Stafford Horne <[email protected]> Acked-by: Dinh Nguyen <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
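The generic check boils down to something like this simplified sketch (the real asm-generic version also handles the !MMU and uaccess_kernel() cases):

    static inline int __access_ok(const void __user *ptr, unsigned long size)
    {
    	unsigned long limit = TASK_SIZE_MAX;
    	unsigned long addr  = (unsigned long)ptr;

    	/* A single comparison against a compile-time constant limit; the
    	 * subtraction avoids the 33/65-bit overflow tricks mentioned above.
    	 */
    	return size <= limit && addr <= limit - size;
    }

    #define access_ok(addr, size)	likely(__access_ok(addr, size))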
2022-02-25uaccess: add generic __{get,put}_kernel_nofaultArnd Bergmann1-2/+0
Nine architectures are still missing __{get,put}_kernel_nofault: alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa. Add a generic version that lets everything use the normal copy_{from,to}_kernel_nofault() code based on these, removing the last use of get_fs()/set_fs() from architecture-independent code. Reviewed-by: Christoph Hellwig <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2022-02-25x86: use more conventional access_ok() definitionArnd Bergmann1-22/+3
The way that access_ok() is defined on x86 is slightly different from most other architectures, and a bit more complex. The generic version tends to result in the best output on all architectures, as it results in single comparison against a constant limit for calls with a known size. There are a few callers of __range_not_ok(), all of which use TASK_SIZE as the limit rather than TASK_SIZE_MAX, but I could not see any reason for picking this. Changing these to call __access_ok() instead uses the default limit, but keeps the behavior otherwise. x86 is the only architecture with a WARN_ON_IN_IRQ() checking access_ok(), but it's probably best to leave that in place. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2022-02-25x86: remove __range_not_ok()Arnd Bergmann1-4/+6
The __range_not_ok() helper is an x86 (and sparc64) specific interface that does roughly the same thing as __access_ok(), but with different calling conventions. Change this to use the normal interface for consistency as we clean up all access_ok() implementations. This changes the limit from TASK_SIZE to TASK_SIZE_MAX, which Al points out is the right thing to do here anyway. The callers have to use __access_ok() instead of the normal access_ok() though, because on x86 that contains a WARN_ON_IN_IRQ() check that cannot be used inside of NMI context while tracing. The check in copy_code() is not needed any more, because this one is already done by copy_from_user_nmi(). Suggested-by: Al Viro <[email protected]> Suggested-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Arnd Bergmann <[email protected]>
2022-02-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+0
Pull kvm fixes from Paolo Bonzini:

 "x86 host:
   - Expose KVM_CAP_ENABLE_CAP since it is supported
   - Disable KVM_HC_CLOCK_PAIRING in TSC catchup mode
   - Ensure async page fault token is nonzero
   - Fix lockdep false negative
   - Fix FPU migration regression from the AMX changes

  x86 guest:
   - Don't use PV TLB/IPI/yield on uniprocessor guests

  PPC:
   - reserve capability id (topic branch for ppc/kvm)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: nSVM: disallow userspace setting of MSR_AMD64_TSC_RATIO to non default value when tsc scaling disabled
  KVM: x86/mmu: make apf token non-zero to fix bug
  KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3
  x86/kvm: Don't use pv tlb/ipi/sched_yield if on 1 vCPU
  x86/kvm: Fix compilation warning in non-x86_64 builds
  x86/kvm/fpu: Remove kvm_vcpu_arch.guest_supported_xcr0
  x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0
  kvm: x86: Disable KVM_HC_CLOCK_PAIRING if tsc is in always catchup mode
  KVM: Fix lockdep false negative during host resume
  KVM: x86: Add KVM_CAP_ENABLE_CAP to x86
2022-02-23x86/mm/cpa: Generalize __set_memory_enc_pgtable()Brijesh Singh2-1/+16
The kernel provides infrastructure to set or clear the encryption mask from the pages for AMD SEV, but TDX requires a few tweaks.

 - TDX and SEV have different requirements for cache and TLB flushing.
 - TDX has its own routine to notify the VMM about page encryption status changes.

Modify __set_memory_enc_pgtable() and make it flexible enough to cover both AMD SEV and Intel TDX. The AMD-specific behavior is isolated in the callbacks under x86_platform.guest. TDX will provide its own version of said callbacks. [ bp: Beat into submission. ] Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
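The callback split looks roughly like the following; member names follow this series but should be treated as an approximation of the real struct x86_guest in asm/x86_init.h:

    struct x86_guest {
    	/* Called before/after changing the encryption status of a range. */
    	void (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
    	bool (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
    	/* Let the platform decide which flushes the conversion needs. */
    	bool (*enc_tlb_flush_required)(bool enc);
    	bool (*enc_cache_flush_required)(void);
    };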
2022-02-23x86/coco: Add API to handle encryption maskKirill A. Shutemov2-6/+25
AMD SME/SEV uses a bit in the page table entries to indicate that the page is encrypted and not accessible to the VMM. TDX uses a similar approach, but the polarity of the mask is opposite to AMD: if the bit is set the page is accessible to the VMM. Provide a vendor-neutral API to deal with the mask: cc_mkenc() and cc_mkdec() modify a given address to make it encrypted/decrypted. They can be applied to phys_addr_t, pgprotval_t or a page table entry value. pgprot_encrypted() and pgprot_decrypted() are reimplemented using the new helpers. The implementation will be extended to cover TDX. pgprot_decrypted() is used by drivers (i915, virtio_gpu, vfio), and cc_mkdec() is called by pgprot_decrypted(), so export cc_mkdec(). Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected]
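A simplified sketch of the helpers in their initial, AMD-only form (cc_mask stands in for the vendor's encryption mask and is an assumption to keep the sketch short; vendor is the value set via cc_set_vendor()):

    u64 cc_mkenc(u64 val)
    {
    	switch (vendor) {
    	case CC_VENDOR_AMD:
    		return val | cc_mask;	/* set the C-bit: encrypted */
    	default:
    		return val;
    	}
    }

    u64 cc_mkdec(u64 val)
    {
    	switch (vendor) {
    	case CC_VENDOR_AMD:
    		return val & ~cc_mask;	/* clear the C-bit: decrypted */
    	default:
    		return val;
    	}
    }

For TDX the polarity flips: cc_mkenc() clears the shared bit and cc_mkdec() sets it.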
2022-02-23x86/coco: Explicitly declare type of confidential computing platformKirill A. Shutemov1-0/+14
The kernel derives the confidential computing platform type it is running as from sme_me_mask on AMD or by using hv_is_isolation_supported() on HyperV isolation VMs. This detection process will be more complicated as more platforms get added. Declare a confidential computing vendor variable explicitly and set it via cc_set_vendor() on the respective platform. [ bp: Massage commit message, fixup HyperV check. ] Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-02-23x86/pat: Remove the unused set_pages_array_wt() functionChristoph Hellwig1-1/+0
Commit 623dffb2a2e0 ("x86/mm/pat: Add set_memory_wt() for Write-Through type") added it but there were no users. [ bp: Add a commit message. ] Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-02-23sched/headers: Add initial new headers as identity mappingsIngo Molnar1-0/+1
This allows code sharing between fast-headers tree and the vanilla scheduler tree. Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Peter Zijlstra <[email protected]>
2022-02-21Merge tag 'v5.17-rc5' into sched/core, to resolve conflictsIngo Molnar7-27/+61
New conflicts in sched/core due to the following upstream fixes:

  44585f7bc0cb ("psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n")
  a06247c6804f ("psi: Fix uaf issue when psi trigger is destroyed while being polled")

Conflicts:
  include/linux/psi_types.h
  kernel/sched/psi.c

Signed-off-by: Ingo Molnar <[email protected]>
2022-02-21x86/speculation: Add eIBRS + Retpoline optionsPeter Zijlstra1-1/+3
Thanks to the chaps at VUsec it is now clear that eIBRS is not sufficient, therefore allow enabling of retpolines along with eIBRS. Add spectre_v2=eibrs, spectre_v2=eibrs,lfence and spectre_v2=eibrs,retpoline options to explicitly pick your preferred means of mitigation. Since there are new mitigations, there are also user-visible changes in /sys/devices/system/cpu/vulnerabilities/spectre_v2 to reflect them. [ bp: Massage commit message, trim error messages, do more precise eIBRS mode checking. ] Co-developed-by: Josh Poimboeuf <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Patrick Colp <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]>
2022-02-21x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCEPeter Zijlstra (Intel)2-7/+7
The RETPOLINE_AMD name is unfortunate since it isn't necessarily AMD only, in fact Hygon also uses it. Furthermore it will likely be sufficient for some Intel processors. Therefore rename the thing to RETPOLINE_LFENCE to better describe what it is. Add the spectre_v2=retpoline,lfence option as an alias to spectre_v2=retpoline,amd to preserve existing setups. However, the output of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed. [ bp: Fix typos, massage. ] Co-developed-by: Josh Poimboeuf <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]>
2022-02-19sched/preempt: Refactor sched_dynamic_update()Mark Rutland1-4/+6
Currently sched_dynamic_update needs to open-code the enabled/disabled function names for each preemption model it supports, when in practice this is a boolean enabled/disabled state for each function. Make this clearer and avoid repetition by defining the enabled/disabled states at the function definition, and using helper macros to perform the static_call_update(). Where x86 currently overrides the enabled function, it is made to provide both the enabled and disabled states for consistency, with defaults provided by the core code otherwise. In subsequent patches this will allow us to support PREEMPT_DYNAMIC without static calls. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/r/[email protected]
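The helper-macro idea from the commit message looks roughly like this (macro and symbol spellings are assumptions, not the exact kernel code):

    #define preempt_dynamic_enable(f)	\
    	static_call_update(f, f##_dynamic_enabled)
    #define preempt_dynamic_disable(f)	\
    	static_call_update(f, f##_dynamic_disabled)

    /* e.g. in sched_dynamic_update(), for preempt=full: */
    preempt_dynamic_disable(cond_resched);
    preempt_dynamic_enable(preempt_schedule);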
2022-02-18KVM: x86/mmu: Remove MMU auditingSean Christopherson1-4/+0
Remove mmu_audit.c and all its collateral, the auditing code has suffered severe bitrot, ironically partly due to shadow paging being more stable and thus not benefiting as much from auditing, but mostly due to TDP supplanting shadow paging for non-nested guests and shadowing of nested TDP not heavily stressing the logic that is being audited. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-02-18KVM: x86: allow defining return-0 static callsPaolo Bonzini2-6/+13
A few vendor callbacks are only used by VMX, but they return an integer or bool value. Introduce KVM_X86_OP_OPTIONAL_RET0 for them: if a func is NULL in struct kvm_x86_ops, it will be changed to __static_call_return0 when updating static calls. Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
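A simplified sketch of the static-call wiring (not the exact macro in the tree; the void* casts keep the ternary's pointer types consistent):

    #define KVM_X86_OP_OPTIONAL_RET0(func)					\
    	static_call_update(kvm_x86_##func,				\
    			   kvm_x86_ops.func ? (void *)kvm_x86_ops.func	\
    					    : (void *)__static_call_return0)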
2022-02-18KVM: x86: make several APIC virtualization callbacks optionalPaolo Bonzini1-5/+5
All their invocations are conditional on vcpu->arch.apicv_active, meaning that they need not be implemented by vendor code: even though at the moment both vendors implement APIC virtualization, all of them can be optional. In fact SVM does not need many of them, and their implementation can be deleted now. Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-02-18KVM: x86: warn on incorrectly NULL members of kvm_x86_opsPaolo Bonzini1-2/+5
Use the newly corrected KVM_X86_OP annotations to warn about possible NULL pointer dereferences as soon as the vendor module is loaded. Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>