2024-04-29KVM: selftests: Add macro for TSS selector, rename code/data macrosSean Christopherson1-9/+9
Add a proper #define for the TSS selector instead of open coding 0x18 and hoping future developers don't use that selector for something else. Opportunistically rename the code and data selector macros to shorten the names, align the naming with the kernel's scheme, and capture that they are *kernel* segments. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
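For illustration, the selectors described above end up looking roughly like this (a sketch, not a verbatim excerpt of the patch; names and values live in the selftests' processor.h):

  #define KERNEL_CS	0x8
  #define KERNEL_DS	0x10
  #define KERNEL_TSS	0x18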
2024-04-29KVM: selftests: Allocate x86's TSS at VM creationSean Christopherson1-3/+2
Allocate x86's per-VM TSS at creation of a non-barebones VM. Like the GDT, the TSS is needed to actually run vCPUs, i.e. every non-barebones VM is all but guaranteed to allocate the TSS sooner or later. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Fold x86's descriptor tables helpers into vcpu_init_sregs()Sean Christopherson1-25/+5
Now that the per-VM, on-demand allocation logic in kvm_setup_gdt() and vcpu_init_descriptor_tables() is gone, fold them into vcpu_init_sregs(). Note, both kvm_setup_gdt() and vcpu_init_descriptor_tables() configured the GDT, which is why it looks like kvm_setup_gdt() disappears. Opportunistically delete the pointless zeroing of the IDT limit (it was being unconditionally overwritten by vcpu_init_descriptor_tables()). Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Drop superfluous switch() on vm->mode in vcpu_init_sregs()Sean Christopherson1-16/+11
Replace the switch statement on vm->mode in x86's vcpu_init_sregs() with a simple assert that the VM has a 48-bit virtual address space. A switch statement is both overkill and misleading, as the existing code incorrectly implies that VMs with LA57 would need different configuration for the LDT, TSS, and flat segments. In all likelihood, the only difference that would be needed for selftests is CR4.LA57 itself. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
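A minimal sketch of the replacement, assuming the selftests' standard assertion helper and the 48-bit/4-level mode name:

  /* Selftests only ever set up 48-bit, 4-level paging for x86 guests. */
  TEST_ASSERT_EQ(vm->mode, VM_MODE_PXXV48_4K);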
2024-04-29KVM: selftests: Allocate x86's GDT during VM creationSean Christopherson1-3/+1
Allocate the GDT during creation of non-barebones VMs instead of waiting until the first vCPU is created, as the whole point of non-barebones VMs is to be able to run vCPUs, i.e. the GDT is going to get allocated no matter what. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Map x86's exception_handlers at VM creation, not vCPU setupSean Christopherson1-1/+2
Map x86's exception handlers at VM creation, not vCPU setup, as the mapping is per-VM, i.e. doesn't need to be (re)done for every vCPU. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Init IDT and exception handlers for all VMs/vCPUs on x86Sean Christopherson23-69/+6
Initialize the IDT and exception handlers for all non-barebones VMs and vCPUs on x86. Forcing tests to manually configure the IDT just to save 8KiB of memory is a terrible tradeoff, and also leads to weird tests (multiple tests have deliberately relied on shutdown to indicate success), and hard-to-debug failures, e.g. instead of a precise unexpected exception failure, tests see only shutdown. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Rename x86's vcpu_setup() to vcpu_init_sregs()Sean Christopherson1-2/+2
Rename vcpu_setup() to be more descriptive and precise, there is a whole lot of "setup" that is done for a vCPU that isn't in said helper. No functional change intended. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Move x86's descriptor table helpers "up" in processor.cSean Christopherson1-96/+95
Move x86's various descriptor table helpers in processor.c up above kvm_arch_vm_post_create() and vcpu_setup() so that the helpers can be made static and invoked from the aforementioned functions. No functional change intended. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Explicitly clobber the IDT in the "delete memslot" testcaseSean Christopherson1-0/+12
Explicitly clobber the guest IDT in the "delete memslot" test, which expects the deleted memslot to result in either a KVM emulation error, or a triple fault shutdown. A future change to the core selftests library will configure the guest IDT and exception handlers by default, i.e. will install a guest #PF handler and put the guest into an infinite #NPF loop (the guest hits a !PRESENT SPTE when trying to vector a #PF, and KVM reinjects the #PF without fixing the #NPF, because there is no memslot). Note, it's not clear whether or not KVM's behavior is reasonable in this case, e.g. arguably KVM should try (and fail) to emulate in response to the #NPF. But barring a goofy/broken userspace, this scenario will likely never happen in practice. Punt the KVM investigation to the future. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
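A rough sketch of what clobbering the IDT from guest code can look like, assuming the selftests' struct desc_ptr (the inline asm is illustrative, not a verbatim excerpt of the test):

  /* Load a zero-limit IDT so that any exception escalates to a shutdown. */
  struct desc_ptr idt = { .size = 0, .address = 0 };

  __asm__ __volatile__("lidt %0" :: "m"(idt));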
2024-04-29KVM: selftests: Rework platform_info_test to actually verify #GPSean Christopherson1-33/+33
Rework platform_info_test to actually handle and verify the expected #GP on RDMSR when the associated KVM capability is disabled. Currently, the test _deliberately_ doesn't handle the #GP, and instead lets it escalate to a triple fault shutdown. In addition to verifying that KVM generates the correct fault, handling the #GP will be necessary (without even more shenanigans) when a future change to the core KVM selftests library configures the IDT and exception handlers by default (the test subtly relies on the IDT limit being '0'). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
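A sketch of the guest-side flow this describes, assuming the selftests' rdmsr_safe() helper and vector macros ('expect_gp' is a stand-in for however the test conveys the expectation):

  uint64_t msr_val;
  uint8_t vector = rdmsr_safe(MSR_PLATFORM_INFO, &msr_val);

  if (expect_gp)
          GUEST_ASSERT_EQ(vector, GP_VECTOR);
  else
          GUEST_ASSERT_EQ(vector, 0);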
2024-04-29KVM: selftests: Move platform_info_test's main assert into guest codeSean Christopherson1-8/+12
As a first step toward gracefully handling the expected #GP on RDMSR in platform_info_test, move the test's assert on the non-faulting RDMSR result into the guest itself. This will allow using a unified flow for the host userspace side of things. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Fix off-by-one initialization of GDT limitAckerley Tng1-1/+1
Fix an off-by-one bug in the initialization of the GDT limit, which as defined in the SDM is inclusive, not exclusive. Note, vcpu_init_descriptor_tables() gets the limit correct, it's only vcpu_setup() that is broken, i.e. only tests that _don't_ invoke vcpu_init_descriptor_tables() can have problems. And the fact that KVM effectively initializes the GDT twice will be cleaned up in the near future. Signed-off-by: Ackerley Tng <[email protected]> [sean: rewrite changelog] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
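In other words, because a descriptor table limit covers bytes 0..limit inclusive, the initialization needs to be "size - 1", roughly (field names approximate):

  sregs.gdt.base  = vm->arch.gdt;
  sregs.gdt.limit = getpagesize() - 1;	/* the limit is inclusive per the SDM */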
2024-04-29KVM: selftests: Move GDT, IDT, and TSS fields to x86's kvm_vm_archSean Christopherson5-16/+18
Now that kvm_vm_arch exists, move the GDT, IDT, and TSS fields to x86's implementation, as the structures are firmly x86-only. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Add kvm_util_types.h to hold common types, e.g. vm_vaddr_tSean Christopherson2-15/+21
Move the base types unique to KVM selftests out of kvm_util.h and into a new header, kvm_util_types.h. This will allow kvm_util_arch.h, i.e. core arch headers, to reference common types, e.g. vm_vaddr_t and vm_paddr_t. No functional change intended. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29Revert "kvm: selftests: move base kvm_util.h declarations to kvm_util_base.h"Sean Christopherson28-12/+1156
Effectively revert the movement of code from kvm_util.h => kvm_util_base.h, as the TL;DR of the justification for the move was to avoid #ifdefs and/or circular dependencies between what ended up being ucall_common.h and what was (and now again is) kvm_util.h. But avoiding #ifdefs and circular includes is trivial: don't do that. The cost of removing kvm_util_base.h is a few extra includes of ucall_common.h, but that cost is practically nothing. On the other hand, having a "base" version of a header that is really just the header itself is confusing, and makes it weird/hard to choose names for headers that actually are "base" headers, e.g. to hold core KVM selftests typedefs. For all intents and purposes, this reverts commit 7d9a662ed9f0403e7b94940dceb81552b8edb931. Reviewed-by: Ackerley Tng <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Randomly force emulation on x86 writes from guest codeSean Christopherson1-0/+21
Override vcpu_arch_put_guest() to randomly force emulation on supported accesses. Force emulation of LOCK CMPXCHG as well as a regular MOV to stress KVM's emulation of atomic accesses, which has a unique path in KVM's emulator. Arbitrarily give all the decisions 50/50 odds; absent much, much more sophisticated infrastructure for generating random numbers, it's highly unlikely that doing more than a coin flip will affect selftests' ability to find KVM bugs. This is effectively a regression test for commit 910c57dfa4d1 ("KVM: x86: Mark target gfn of emulated atomic instruction as dirty"). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Add vcpu_arch_put_guest() to do writes from guest codeSean Christopherson3-4/+9
Introduce a macro, vcpu_arch_put_guest(), for "putting" values to memory from guest code in "interesting" situations, e.g. when writing memory that is being dirty logged. Structure the macro so that arch code can provide a custom implementation, e.g. x86 will use the macro to force emulation of the access. Use the helper in dirty_log_test, which is of particular interest (see above), and in xen_shinfo_test, which isn't all that interesting, but provides a second usage of the macro with a different size operand (uint8_t versus uint64_t), i.e. to help verify that the macro works for more than just 64-bit values. Use "put" as the verb to align with the kernel's {get,put}_user() terminology. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
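A heavily simplified sketch of the macro's shape and usage; the x86 override described in the previous entry layers forced emulation and randomization on top of this default:

  /* Generic fallback for architectures that don't provide an override. */
  #ifndef vcpu_arch_put_guest
  #define vcpu_arch_put_guest(mem, val) do { (mem) = (val); } while (0)
  #endif

  /* Usage from guest code, e.g. when dirtying a page in dirty_log_test. */
  vcpu_arch_put_guest(guest_array[i], iteration);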
2024-04-29KVM: selftests: Add global snapshot of kvm_is_forced_emulation_enabled()Sean Christopherson4-11/+7
Add a global snapshot of kvm_is_forced_emulation_enabled() and sync it to all VMs by default so that core library code can force emulation, e.g. to allow for easier testing of the intersections between emulation and other features in KVM. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-29KVM: selftests: Provide an API for getting a random bool from an RNGSean Christopherson2-1/+12
Move memstress' random bool logic into common code to avoid reinventing the wheel for basic yes/no decisions. Provide an outer wrapper to handle the basic/common case of just wanting a 50/50 chance of something happening. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
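A sketch of the resulting API, assuming a percent-based inner helper built on the existing guest_random_u32() (names approximate):

  static inline bool __guest_random_bool(struct guest_random_state *state,
                                         uint8_t percent)
  {
          return (guest_random_u32(state) % 100) < percent;
  }

  static inline bool guest_random_bool(struct guest_random_state *state)
  {
          return __guest_random_bool(state, 50);
  }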
2024-04-29KVM: selftests: Provide a global pseudo-RNG instance for all testsSean Christopherson6-29/+23
Add a global guest_random_state instance, i.e. a pseudo-RNG, so that an RNG is available for *all* tests. This will allow randomizing behavior in core library code, e.g. x86 will utilize the pRNG to conditionally force emulation of writes from within common guest code. To allow for deterministic runs, and to be compatible with existing tests, allow tests to override the seed used to initialize the pRNG. Note, the seed *must* be overwritten before a VM is created in order for the seed to take effect, though it's perfectly fine for a test to initialize multiple VMs with different seeds. And as evidenced by memstress_guest_code(), it's also a-ok to instantiate more RNGs using the global seed (or a modified version of it). The goal of the global RNG is purely to ensure that _a_ source of random numbers is available, it doesn't have to be the _only_ RNG. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
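Hypothetical usage, i.e. how a test might pin the seed before creating its VM (the variable name here is an assumption for illustration):

  /* Must be set before vm_create*() for the seed to take effect. */
  guest_random_seed = 0xdeadbeef;

  vm = vm_create_with_one_vcpu(&vcpu, guest_code);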
2024-04-29KVM: selftests: Define _GNU_SOURCE for all selftests codeSean Christopherson57-120/+17
Define _GNU_SOURCE in the base CFLAGS instead of relying on selftests to manually #define _GNU_SOURCE, which is repetitive and error prone. E.g. kselftest_harness.h requires _GNU_SOURCE for asprintf(), but if a selftest includes kvm_test_harness.h after stdio.h, the include guards result in the effective version of stdio.h consumed by kvm_test_harness.h not defining asprintf():

  In file included from x86_64/fix_hypercall_test.c:12:
  In file included from include/kvm_test_harness.h:11:
  ../kselftest_harness.h:1169:2: error: call to undeclared function 'asprintf';
  ISO C99 and later do not support implicit function declarations
  [-Wimplicit-function-declaration]
   1169 |         asprintf(&test_name, "%s%s%s.%s", f->name,
        |         ^

When including the rseq selftest's "library" code, #undef _GNU_SOURCE so that rseq.c controls whether or not it wants to build with _GNU_SOURCE. Reported-by: Muhammad Usama Anjum <[email protected]> Acked-by: Claudio Imbrenda <[email protected]> Acked-by: Oliver Upton <[email protected]> Acked-by: Anup Patel <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
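For the rseq case, the #undef pattern mentioned above amounts to something like this (a sketch; the include path is approximate):

  /* Let rseq.c decide for itself whether it wants _GNU_SOURCE. */
  #undef _GNU_SOURCE
  #include "../rseq/rseq.c"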
2024-04-19Merge x86 bugfixes from Linux 6.9-rc3Paolo Bonzini773-4479/+9771
Pull fix for SEV-SNP late disable bugs. Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12KVM: SEV: use u64_to_user_ptr throughoutPaolo Bonzini1-22/+22
Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12KVM: VMX: Modify NMI and INTR handlers to take intr_info as function argumentSean Christopherson1-9/+7
TDX uses a different ABI to get information about the VM exit. Pass intr_info to the NMI and INTR handlers instead of pulling it from vcpu_vmx, in preparation for sharing the bulk of the handlers with TDX. When the guest TD exits to the VMM, RAX holds the status and exit reason, RCX holds the exit qualification, etc., rather than the VMCS fields, because the VMM doesn't have access to the VMCS. The eventual code will be:

  VMX:
    - get exit reason, intr_info, exit_qualification, etc. from the VMCS
    - call NMI/INTR handlers (common code)

  TDX:
    - get exit reason, intr_info, exit_qualification, etc. from guest registers
    - call NMI/INTR handlers (common code)

Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Isaku Yamahata <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Message-Id: <0396a9ae70d293c9d0b060349dae385a8a4fbcec.1705965635.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12KVM: VMX: Move out vmx_x86_ops to 'main.c' to dispatch VMX and TDXPaolo Bonzini4-270/+391
KVM accesses the Virtual Machine Control Structure (VMCS) with VMX instructions to operate on a VM. TDX doesn't allow the VMM to operate on the VMCS directly. Instead, TDX has its own data structures, and TDX SEAMCALL APIs for the VMM to indirectly operate on those data structures. This means we must have a TDX version of kvm_x86_ops. The existing global struct kvm_x86_ops already defines an interface which can be adapted to TDX, but kvm_x86_ops is a system-wide, not per-VM, structure. To allow VMX to coexist with TDs, the kvm_x86_ops callbacks will have wrappers "if (tdx) tdx_op() else vmx_op()" to pick VMX or TDX at run time. To split the runtime switch, the VMX implementation, and the TDX implementation, add main.c, and move out the vmx_x86_ops hooks in preparation for adding TDX. Use 'vt' for the naming scheme as a nod to VT-x and as a concatenation of VmxTdx. The eventually converted code will look like this:

  vmx.c:
    vmx_op() { ... }
    VMX initialization

  tdx.c:
    tdx_op() { ... }
    TDX initialization

  x86_ops.h:
    vmx_op();
    tdx_op();

  main.c:
    static vt_op() { if (tdx) tdx_op() else vmx_op() }

    static struct kvm_x86_ops vt_x86_ops = {
          .op = vt_op,

    initialization functions call both VMX and TDX initialization

Opportunistically, fix the name inconsistency from vmx_create_vcpu() and vmx_free_vcpu() to vmx_vcpu_create() and vmx_vcpu_free(). Co-developed-by: Xiaoyao Li <[email protected]> Signed-off-by: Xiaoyao Li <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Isaku Yamahata <[email protected]> Reviewed-by: Binbin Wu <[email protected]> Reviewed-by: Xiaoyao Li <[email protected]> Reviewed-by: Yuan Yao <[email protected]> Message-Id: <e603c317587f933a9d1bee8728c84e4935849c16.1705965634.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12KVM: x86: Split core of hypercall emulation to helper functionSean Christopherson2-18/+42
By necessity, TDX will use a different register ABI for hypercalls. Break out the core functionality so that it may be reused for TDX. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Isaku Yamahata <[email protected]> Message-Id: <5134caa55ac3dec33fb2addb5545b52b3b52db02.1705965635.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <[email protected]>
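Conceptually, the split gives the common logic a register-agnostic entry point, roughly along these lines (the signature is approximate, not the literal patch):

  /* Core hypercall handling, reusable by TDX with its own register ABI. */
  unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr,
                                        unsigned long a0, unsigned long a1,
                                        unsigned long a2, unsigned long a3,
                                        int op_64_bit, int cpl);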
2024-04-12Merge branch 'kvm-sev-init2' into HEADPaolo Bonzini25-185/+703
The idea that no parameter would ever be necessary when enabling SEV or SEV-ES for a VM was decidedly optimistic. The first source of variability that was encountered is the desired set of VMSA features, as that affects the measurement of the VM's initial state and cannot be changed arbitrarily by the hypervisor. This series adds all the APIs that are needed to customize the features, with room for future enhancements:

  - a new /dev/kvm device attribute to retrieve the set of supported features (right now, only debug swap)

  - a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct, replacing the existing KVM_SEV_INIT and KVM_SEV_ES_INIT

It then puts the new op to work by including the VMSA features as a field of the struct. The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of supported VMSA features for backwards compatibility; but I am considering also making them use zero as the feature mask, and will gladly adjust the patches if so requested. In order to avoid creating *two* new KVM_MEMORY_ENCRYPT_OPs, I decided that I could as well make SEV and SEV-ES use VM types. This allows SEV-SNP to reuse the KVM_SEV_INIT2 ioctl. And while at it, KVM_SEV_INIT2 also includes two bugfixes. First of all, SEV-ES VMs, when created with the new VM type instead of KVM_SEV_ES_INIT, reject KVM_GET_REGS/KVM_SET_REGS and friends on the vCPU file descriptor once the VMSA has been encrypted... which is how the API should have always behaved. Second, they also synchronize the FPU and AVX state. Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12Merge branch 'mm-delete-change-gpte' into HEADPaolo Bonzini26-419/+16
The .change_pte() MMU notifier callback was intended as an optimization and for this reason it was initially called without a surrounding mmu_notifier_invalidate_range_{start,end}() pair. It was only ever implemented by KVM (which was also the original user of MMU notifiers) and the rules on when to call set_pte_at_notify() rather than set_pte_at() have always been pretty obscure. It may seem a miracle that it has never caused any hard-to-trigger bugs, but there's a good reason for that: KVM's implementation has been nonfunctional for a good part of its existence. Already in 2012, commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end", 2012-10-09) changed the .change_pte() callback to occur within an invalidate_range_start/end() pair; and because KVM unmaps the sPTEs during .invalidate_range_start(), .change_pte() has no hope of finding a sPTE to change. Therefore, all the code for .change_pte() can be removed from both KVM and mm/, and set_pte_at_notify() can be replaced with just set_pte_at(). Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12mm: replace set_pte_at_notify() with just set_pte_at()Paolo Bonzini5-19/+8
With the demise of the .change_pte() MMU notifier callback, there is no notification happening in set_pte_at_notify(). It is a synonym of set_pte_at() and can be replaced with it. Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Message-ID: <[email protected]> Acked-by: Andrew Morton <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11mmu_notifier: remove the .change_pte() callbackPaolo Bonzini2-61/+2
The scope of set_pte_at_notify() has been reduced more and more over the years. Initially, it was meant for when the change to the PTE was not bracketed by mmu_notifier_invalidate_range_{start,end}(). However, that has not been so for over ten years. During all this period the only implementation of .change_pte() was KVM and it had no actual functionality, because it was called after mmu_notifier_invalidate_range_start() zapped the secondary PTE. Now that this (nonfunctional) user of the .change_pte() callback is gone, the whole callback can be removed. For now, leave in place set_pte_at_notify() even though it is just a synonym for set_pte_at(). Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Message-ID: <[email protected]> Acked-by: Andrew Morton <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: remove unused argument of kvm_handle_hva_range()Paolo Bonzini1-6/+1
The only user was kvm_mmu_notifier_change_pte(), which is now gone. Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: delete .change_pte MMU notifier callbackPaolo Bonzini20-335/+7
The .change_pte() MMU notifier callback was intended as an optimization. The original point of it was that KSM could tell KVM to flip its secondary PTE to a new location without having to first zap it. At the time there was also an .invalidate_page() callback; both of them were *not* bracketed by calls to mmu_notifier_invalidate_range_{start,end}(), and .invalidate_page() also doubled as a fallback implementation of .change_pte(). Later on, however, both callbacks were changed to occur within an invalidate_range_start/end() block. In the case of .change_pte(), commit 6bdb913f0a70 ("mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end", 2012-10-09) did so to remove the fallback from .invalidate_page() to .change_pte() and allow sleepable .invalidate_page() hooks. This however made KVM's usage of the .change_pte() callback completely moot, because KVM unmaps the sPTEs during .invalidate_range_start() and therefore .change_pte() has no hope of finding a sPTE to change. Drop the generic KVM code that dispatches to kvm_set_spte_gfn(), as well as all the architecture specific implementations. Signed-off-by: Paolo Bonzini <[email protected]> Acked-by: Anup Patel <[email protected]> Acked-by: Michael Ellerman <[email protected]> (powerpc) Reviewed-by: Bibo Mao <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11selftests: kvm: add test for transferring FPU state into VMSAPaolo Bonzini1-0/+89
Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11selftests: kvm: split "launch" phase of SEV VM creationPaolo Bonzini3-8/+18
Allow the caller to set the initial state of the VM. Doing this before sev_vm_launch() matters for SEV-ES, since that is the place where the VMSA is updated and after which the guest state becomes sealed. Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11selftests: kvm: switch to using KVM_X86_*_VMPaolo Bonzini6-32/+40
This removes the concept of "subtypes", instead letting the tests use proper VM types that were recently added. While the sev_init_vm() and sev_es_init_vm() are still able to operate with the legacy KVM_SEV_INIT and KVM_SEV_ES_INIT ioctls, this is limited to VMs that are created manually with vm_create_barebones(). Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11selftests: kvm: add tests for KVM_SEV_INIT2Paolo Bonzini4-8/+159
Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: SEV: allow SEV-ES DebugSwap againPaolo Bonzini1-1/+1
The DebugSwap feature of SEV-ES provides a way for confidential guests to use data breakpoints. Its status is recorded in the VMSA, and therefore attestation signatures depend on whether it is enabled or not. In order to avoid invalidating the signatures depending on the host machine, it was disabled by default (see commit 5abf6dceb066, "SEV: disable SEV-ES DebugSwap by default", 2024-03-09). However, we now have a new API to create SEV VMs that allows enabling DebugSwap based on what the user tells KVM to do, and we also changed the legacy KVM_SEV_ES_INIT API to never enable DebugSwap. It is therefore possible to re-enable the feature without breaking compatibility with kernels that pre-date the introduction of DebugSwap, so go ahead. Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: SEV: introduce KVM_SEV_INIT2 operationPaolo Bonzini3-10/+92
The idea that no parameter would ever be necessary when enabling SEV or SEV-ES for a VM was decidedly optimistic. In fact, in some sense it's already a parameter whether SEV or SEV-ES is desired. Another possible source of variability is the desired set of VMSA features, as that affects the measurement of the VM's initial state and cannot be changed arbitrarily by the hypervisor. Create a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct, and put the new op to work by including the VMSA features as a field of the struct. The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of supported VMSA features for backwards compatibility. The struct also includes the usual bells and whistles for future extensibility: a flags field that must be zero for now, and some padding at the end. Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
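A sketch of roughly what the new struct looks like; the field layout here is illustrative, and the authoritative definition lives in the uapi headers:

  struct kvm_sev_init {
          __u64 vmsa_features;
          __u32 flags;		/* must be zero for now */
          __u32 pad[9];		/* room for future extensibility */
  };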
2024-04-11KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA timePaolo Bonzini5-10/+54
SEV-ES allows passing custom contents for x87, SSE and AVX state into the VMSA. Allow userspace to do that with the usual KVM_SET_XSAVE API and only mark FPU contents as confidential after it has been copied and encrypted into the VMSA. Since the XSAVE state for AVX is the first, it does not need the compacted-state handling of get_xsave_addr(). However, there are other parts of XSAVE state in the VMSA that currently are not handled, and the validation logic of get_xsave_addr() is pointless to duplicate in KVM, so move get_xsave_addr() to public FPU API; it is really just a facility to operate on XSAVE state and does not expose any internal details of arch/x86/kernel/fpu. Acked-by: Dave Hansen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: SEV: define VM types for SEV and SEV-ESPaolo Bonzini5-3/+29
Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: SEV: introduce to_kvm_sev_infoPaolo Bonzini2-2/+7
Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
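For reference, the helper is a trivial accessor, roughly:

  static inline struct kvm_sev_info *to_kvm_sev_info(struct kvm *kvm)
  {
          return &to_kvm_svm(kvm)->sev_info;
  }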
2024-04-11KVM: x86: Add supported_vm_types to kvm_capsPaolo Bonzini2-6/+8
This simplifies the implementation of KVM_CHECK_EXTENSION(KVM_CAP_VM_TYPES), and also allows the vendor module to specify which VM types are supported. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: x86: add fields to struct kvm_arch for CoCo featuresPaolo Bonzini2-21/+79
Some VM types have characteristics in common; in fact, the only use of VM types right now is kvm_arch_has_private_mem and it assumes that _all_ nonzero VM types have private memory. We will soon introduce a VM type for SEV and SEV-ES VMs, and at that point we will have two special characteristics of confidential VMs that depend on the VM type: not just whether memory is private, but also whether guest state is protected. For the latter we have kvm->arch.guest_state_protected, which is only set on a fully initialized VM. For VM types with protected guest state, we can actually fix a problem in the SEV-ES implementation, where ioctls to set registers do not cause an error even if the VM has been initialized and the guest state encrypted. Make sure that, when using the new VM types, this becomes an error. Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Isaku Yamahata <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
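A sketch of the idea, assuming the fields and the check end up named roughly as below (illustrative, not the literal patch):

  /* In struct kvm_arch: per-VM CoCo characteristics derived from the VM type. */
  bool has_private_mem;
  bool has_protected_state;

  /* In the register-accessing vCPU ioctls: fail once guest state is sealed. */
  if (vcpu->kvm->arch.has_protected_state && vcpu->arch.guest_state_protected)
          return -EINVAL;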
2024-04-11KVM: SEV: store VMSA features in kvm_sev_infoPaolo Bonzini3-10/+24
Right now, the set of features that are stored in the VMSA upon initialization is fixed and depends on the module parameters for kvm-amd.ko. However, the hypervisor cannot really change it at will because the feature word has to match between the hypervisor and whatever computes a measurement of the VMSA for attestation purposes. Add a field to kvm_sev_info that holds the set of features to be stored in the VMSA; and query it instead of referring to the module parameters. Because KVM_SEV_INIT and KVM_SEV_ES_INIT accept no parameters, this does not yet introduce any functional change, but it paves the way for an API that allows customization of the features per-VM. Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: SEV: publish supported VMSA featuresPaolo Bonzini5-4/+44
Compute the set of features to be stored in the VMSA when KVM is initialized; move it from there into kvm_sev_info when SEV is initialized, and then into the initial VMSA. The new variable can then be used to return the set of supported features to userspace, via the KVM_GET_DEVICE_ATTR ioctl. Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Isaku Yamahata <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
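From userspace, querying the published feature set would look roughly like this (the group/attribute constant names are approximate, and kvm_fd is a stand-in for an open /dev/kvm file descriptor):

  uint64_t vmsa_features;
  struct kvm_device_attr attr = {
          .group = KVM_X86_GRP_SEV,
          .attr  = KVM_X86_SEV_VMSA_FEATURES,
          .addr  = (uint64_t)&vmsa_features,
  };

  if (!ioctl(kvm_fd, KVM_HAS_DEVICE_ATTR, &attr))
          ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr);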
2024-04-11KVM: introduce new vendor op for KVM_GET_DEVICE_ATTRPaolo Bonzini3-14/+26
Allow vendor modules to provide their own attributes on /dev/kvm. To avoid proliferation of vendor ops, implement KVM_HAS_DEVICE_ATTR and KVM_GET_DEVICE_ATTR in terms of the same function. You're not supposed to use KVM_GET_DEVICE_ATTR to do complicated computations, especially on /dev/kvm. Reviewed-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Isaku Yamahata <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: x86: use u64_to_user_ptr()Paolo Bonzini1-21/+3
There is no danger to the kernel if 32-bit userspace provides a 64-bit value that has the high bits set, but for whatever reason happens to resolve to an address that has something mapped there. KVM uses the checked version of get_user() and put_user(), so any faults are caught properly. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
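In other words, the open-coded "(void __user *)(uintptr_t)" casts become the standard helper; a sketch of the pattern (the surrounding names are illustrative):

  void __user *uaddr = u64_to_user_ptr(argp->data);

  if (copy_from_user(&params, uaddr, sizeof(params)))
          return -EFAULT;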
2024-04-11KVM: SVM: Compile sev.c if and only if CONFIG_KVM_AMD_SEV=yPaolo Bonzini4-43/+38
Stop compiling sev.c when CONFIG_KVM_AMD_SEV=n, as the number of #ifdefs in sev.c is getting ridiculous, and having #ifdefs inside of SEV helpers is quite confusing. To minimize #ifdefs in code flows, #ifdef away only the kvm_x86_ops hooks and the #VMGEXIT handler. Stubs are also restricted to functions that check sev_enabled and to the destruction functions sev_free_cpu() and sev_vm_destroy(), where the style of their callers is to leave checks to the callers. Most call sites instead rely on dead code elimination to take care of functions that are guarded with sev_guest() or sev_es_guest(). Signed-off-by: Sean Christopherson <[email protected]> Co-developed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11KVM: SVM: Invert handling of SEV and SEV_ES feature flagsSean Christopherson2-5/+5
Leave SEV and SEV_ES '0' in kvm_cpu_caps by default, and instead set them in sev_set_cpu_caps() if SEV and SEV-ES support are fully enabled. Aside from the fact that sev_set_cpu_caps() is wildly misleading when it *clears* capabilities, this will allow compiling out sev.c without falsely advertising SEV/SEV-ES support in KVM_GET_SUPPORTED_CPUID. Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
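Conceptually, sev_set_cpu_caps() becomes an opt-in rather than an opt-out, roughly:

  void sev_set_cpu_caps(void)
  {
          if (sev_enabled)
                  kvm_cpu_cap_set(X86_FEATURE_SEV);
          if (sev_es_enabled)
                  kvm_cpu_cap_set(X86_FEATURE_SEV_ES);
  }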