path: root/arch/x86/kvm/svm
2024-06-03  KVM: SEV-ES: Delegate LBR virtualization to the processor  [Ravi Bangoria, 3 files, -7/+17]
As documented in the APM[1], LBR Virtualization must be enabled for SEV-ES guests. Although KVM currently enforces LBRV for SEV-ES guests, there are multiple issues with it:

o MSR_IA32_DEBUGCTLMSR is still intercepted. Since MSR_IA32_DEBUGCTLMSR interception is used to dynamically toggle LBRV for performance reasons, this can be fatal for SEV-ES guests. For example, an SEV-ES guest on Zen3:

[guest ~]# wrmsr 0x1d9 0x4
KVM: entry failed, hardware error 0xffffffff
EAX=00000004 EBX=00000000 ECX=000001d9 EDX=00000000

Fix this by never intercepting MSR_IA32_DEBUGCTLMSR for SEV-ES guests. No additional save/restore logic is required since MSR_IA32_DEBUGCTLMSR is of swap type A.

o KVM will disable LBRV if userspace sets MSR_IA32_DEBUGCTLMSR before the VMSA is encrypted. Fix this by moving the LBRV enablement code to after VMSA encryption.

[1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June 2023, Vol 2, 15.35.2 Enabling SEV-ES. https://bugzilla.kernel.org/attachment.cgi?id=304653

Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading") Co-developed-by: Nikunj A Dadhania <[email protected]> Signed-off-by: Nikunj A Dadhania <[email protected]> Signed-off-by: Ravi Bangoria <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
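A rough sketch of the first fix's shape, using KVM's existing set_msr_interception() helper in svm.c; the call site shown here is illustrative, not the committed diff:

/* Illustrative sketch: pass MSR_IA32_DEBUGCTLMSR through to SEV-ES guests.
 * DEBUGCTL is swap type A, i.e. hardware saves/restores it via the VMSA,
 * so no extra save/restore logic is needed, and intercepting it (to toggle
 * LBRV dynamically) would be fatal for the guest. */
static void sev_es_init_vmcb(struct vcpu_svm *svm)
{
    struct kvm_vcpu *vcpu = &svm->vcpu;

    /* ... existing SEV-ES VMCB/VMSA setup ... */
    set_msr_interception(vcpu, svm->msrpm, MSR_IA32_DEBUGCTLMSR, 1, 1);
}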
2024-06-03  KVM: SEV-ES: Disallow SEV-ES guests when X86_FEATURE_LBRV is absent  [Ravi Bangoria, 3 files, -9/+14]
As documented in APM[1], LBR Virtualization must be enabled for SEV-ES guests. So, prevent SEV-ES guests when LBRV support is missing. [1]: AMD64 Architecture Programmer's Manual Pub. 40332, Rev. 4.07 - June 2023, Vol 2, 15.35.2 Enabling SEV-ES. https://bugzilla.kernel.org/attachment.cgi?id=304653 Fixes: 376c6d285017 ("KVM: SVM: Provide support for SEV-ES vCPU creation/loading") Signed-off-by: Ravi Bangoria <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-06-03  KVM: SEV-ES: Prevent MSR access post VMSA encryption  [Nikunj A Dadhania, 1 file, -0/+18]
KVM currently allows userspace to read/write MSRs even after the VMSA is encrypted. This can cause unintentional issues if the MSR access has side effects. For example, while migrating a guest, userspace could attempt to migrate MSR_IA32_DEBUGCTLMSR and end up unintentionally disabling LBRV on the target. Fix this by preventing access to those MSRs which are context switched via the VMSA, once the VMSA is encrypted. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Nikunj A Dadhania <[email protected]> Signed-off-by: Ravi Bangoria <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
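A sketch of the guard this implies, using KVM's existing sev_es_guest(), guest_state_protected, and MSR-permission-map helpers; only MSRs that are not intercepted (i.e. VMSA-resident) are blocked. Illustrative, not the committed code:

/* Illustrative sketch: once the VMSA is encrypted, reject host accesses to
 * any MSR that is context switched via the VMSA (i.e. not intercepted),
 * since the host-visible value is meaningless and a write could corrupt
 * attested guest state such as DEBUGCTL/LBRV. */
static bool sev_es_prevent_msr_access(struct kvm_vcpu *vcpu,
                                      struct msr_data *msr_info)
{
    return sev_es_guest(vcpu->kvm) &&
           vcpu->arch.guest_state_protected &&
           svm_msrpm_offset(msr_info->index) != MSR_INVALID &&
           !msr_write_intercepted(vcpu, msr_info->index);
}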
2024-06-03  KVM: SVM: Remove the need to trigger an UNBLOCK event on AP creation  [Tom Lendacky, 3 files, -23/+1]
All SNP APs are initially started using the APIC INIT/SIPI sequence in the guest. This sequence moves the AP MP state from KVM_MP_STATE_UNINITIALIZED to KVM_MP_STATE_RUNNABLE, so there is no need to attempt the UNBLOCK. As it is, the UNBLOCK support in SVM is only enabled when AVIC is enabled. When AVIC is disabled, AP creation is still successful. Remove the KVM_REQ_UNBLOCK request from the AP creation code and revert the changes to the vcpu_unblocking() kvm_x86_ops path. Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-06-03  KVM: SEV: Don't WARN() if RMP lookup fails when invalidating gmem pages  [Paolo Bonzini, 1 file, -5/+4]
The hook only handles cleanup work specific to SNP, e.g. RMP table entries and flushing caches for encrypted guest memory. When run on a non-SNP-enabled host (currently only possible using KVM_X86_SW_PROTECTED_VM, e.g. via KVM selftests), the callback is a noop and will WARN due to the RMP table not being present. It's actually expected in this case that the RMP table wouldn't be present and that the hook should be a noop, so drop the WARN_ONCE(). Reported-by: Sean Christopherson <[email protected]> Closes: https://lore.kernel.org/kvm/[email protected]/ Fixes: 8eb01900b018 ("KVM: SEV: Implement gmem hook for invalidating private pages") Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-06-03  KVM: SEV: Automatically switch reclaimed pages to shared  [Michael Roth, 1 file, -24/+31]
Currently there's a consistent pattern of always calling host_rmp_make_shared() immediately after snp_page_reclaim(), so go ahead and handle it automatically as part of snp_page_reclaim(). Also rename it to kvm_rmp_make_shared() to more easily distinguish it as a KVM-specific variant of the more generic rmp_make_shared() helper. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-06-03  KVM: SVM: Use KVM's snapshot of the host's XCR0 for SEV-ES host state  [Sean Christopherson, 1 file, -1/+1]
Use KVM's snapshot of the host's XCR0 when stuffing SEV-ES host state instead of reading XCR0 from hardware. XCR0 is only written during boot, i.e. won't change while KVM is running (and KVM at large is hosed if that doesn't hold true). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-06-03  KVM: x86: Add a struct to consolidate host values, e.g. EFER, XCR0, etc...  [Sean Christopherson, 1 file, -1/+1]
Add "struct kvm_host_values kvm_host" to hold the various host values that KVM snapshots during initialization. Bundling the host values into a single struct simplifies adding new MSRs and other features with host state/values that KVM cares about, and provides a one-stop shop. E.g. adding a new value requires one line, whereas tracking each value individually often requires three: declaration, definition, and export. No functional change intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-05-23  KVM: SVM: WARN on vNMI + NMI window iff NMIs are outright masked  [Sean Christopherson, 1 file, -8/+19]
When requesting an NMI window, WARN on vNMI support being enabled if and only if NMIs are actually masked, i.e. if the vCPU is already handling an NMI. KVM's ABI for NMIs that arrive simultaneously (from KVM's point of view) is to inject one NMI and pend the other. When using vNMI, KVM pends the second NMI simply by setting V_NMI_PENDING, and lets the CPU do the rest (hardware automatically sets V_NMI_BLOCKING when an NMI is injected). However, if KVM can't immediately inject an NMI, e.g. because the vCPU is in an STI shadow or is running with GIF=0, then KVM will request an NMI window and trigger the WARN (but still function correctly).

Whether or not the GIF=0 case makes sense is debatable, as the intent of KVM's behavior is to provide functionality that is as close to real hardware as possible. E.g. if two NMIs are sent in quick succession, the probability of both NMIs arriving in an STI shadow is infinitesimally low on real hardware, but significantly larger in a virtual environment, e.g. if the vCPU is preempted in the STI shadow. For GIF=0, the argument isn't as clear cut, because the window where two NMIs can collide is much larger in bare metal (though still small).

That said, KVM should not have divergent behavior for the GIF=0 case based on whether or not vNMI support is enabled. And KVM has allowed simultaneous NMIs with GIF=0 for over a decade, since commit 7460fb4a3400 ("KVM: Fix simultaneous NMIs"). I.e. KVM's GIF=0 handling shouldn't be modified without a *really* good reason to do so, and if KVM's behavior were to be modified, it should be done irrespective of vNMI support.

Fixes: fa4c027a7956 ("KVM: x86: Add support for SVM's Virtual NMI") Cc: [email protected] Cc: Santosh Shukla <[email protected]> Cc: Maxim Levitsky <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
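A sketch of the narrowed assertion, using KVM's existing is_vnmi_enabled() and svm_get_nmi_mask() helpers; the placement and surrounding logic are illustrative:

static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
{
    struct vcpu_svm *svm = to_svm(vcpu);

    /*
     * Only WARN if vNMI is enabled *and* NMIs are actually masked, i.e. the
     * vCPU is mid-NMI; a window request due to an STI shadow or GIF=0 is
     * legitimate and must not trip the assertion.
     */
    WARN_ON_ONCE(is_vnmi_enabled(svm) && svm_get_nmi_mask(vcpu));

    /* ... existing single-step / IRET-intercept based NMI window setup ... */
}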
2024-05-12  KVM: SVM: Add module parameter to enable SEV-SNP  [Brijesh Singh, 1 file, -1/+2]
Add a module parameter that can be used to enable or disable the SEV-SNP feature. Now that KVM contains support for SNP, set the GHCB hypervisor feature flag to indicate that SNP is supported. Signed-off-by: Brijesh Singh <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
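A sketch of what such a knob looks like in kvm-amd; the parameter name, default value, and feature-flag macro are assumptions:

/* Illustrative sketch: module parameter gating SEV-SNP support. */
static bool sev_snp_enabled = true;
module_param_named(sev_snp, sev_snp_enabled, bool, 0444);

/* Advertise SNP to guests via the GHCB hypervisor-features bitmap only if
 * both the platform and the module parameter allow it. */
#define GHCB_HV_FT_SUPPORTED    GHCB_HV_FT_SNP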
2024-05-12  KVM: SEV: Avoid WBINVD for HVA-based MMU notifications for SNP  [Ashish Kalra, 1 file, -1/+7]
With SNP/guest_memfd, private/encrypted memory should not be mappable, and MMU notifications for HVA-mapped memory will only be relevant to unencrypted guest memory. Therefore, the rationale behind issuing a wbinvd_on_all_cpus() in sev_guest_memory_reclaimed() should not apply for SNP guests and can be ignored. Signed-off-by: Ashish Kalra <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> [mdr: Add some clarifications in commit] Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
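The shape of the resulting check, as a sketch (the exact predicate in the committed patch may differ slightly):

void sev_guest_memory_reclaimed(struct kvm *kvm)
{
    /*
     * With SNP+guest_memfd, private/encrypted memory is never mapped by host
     * virtual addresses, so HVA-based MMU notifications can only cover shared
     * memory; the cache flush is only needed for legacy SEV/SEV-ES guests.
     */
    if (!sev_guest(kvm) || sev_snp_guest(kvm))
        return;

    wbinvd_on_all_cpus();
}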
2024-05-12  KVM: x86: Implement hook for determining max NPT mapping level  [Michael Roth, 3 files, -0/+21]
In the case of SEV-SNP, whether or not a 2MB page can be mapped via a 2MB mapping in the guest's nested page table depends on whether or not any subpages within the range have already been initialized as private in the RMP table. The existing mixed-attribute tracking in KVM is insufficient here, for instance:

- gmem allocates 2MB page
- guest issues PVALIDATE on 2MB page
- guest later converts a subpage to shared
- SNP host code issues PSMASH to split 2MB RMP mapping to 4K
- KVM MMU splits NPT mapping to 4K
- guest later converts that shared page back to private

At this point there are no mixed attributes, and KVM would normally allow for 2MB NPT mappings again, but this is actually not allowed because the RMP table mappings are 4K and cannot be promoted on the hypervisor side, so the NPT mappings must still be limited to 4K to match this.

Implement a kvm_x86_ops.private_max_mapping_level() hook for SEV that checks for this condition and adjusts the mapping level accordingly. Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
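A sketch of what the hook checks, reusing the host snp_lookup_rmpentry() helper; the function name and return conventions here are assumptions:

/* Illustrative sketch: clamp the NPT level to 4K once the backing RMP entry
 * has been smashed, since RMP entries cannot be promoted back to 2MB by the
 * hypervisor. */
int sev_private_max_mapping_level(struct kvm *kvm, kvm_pfn_t pfn)
{
    int level, rc;
    bool assigned;

    if (!sev_snp_guest(kvm))
        return 0;

    rc = snp_lookup_rmpentry(pfn, &assigned, &level);
    if (rc || !assigned)
        return PG_LEVEL_4K;

    return level;
}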
2024-05-12  KVM: SEV: Implement gmem hook for invalidating private pages  [Michael Roth, 3 files, -0/+67]
Implement a platform hook to do the work of restoring the direct map entries of gmem-managed pages and transitioning the corresponding RMP table entries back to the default shared/hypervisor-owned state. Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Implement gmem hook for initializing private pages  [Michael Roth, 3 files, -0/+105]
This will handle the RMP table updates needed to put a page into a private state before mapping it into an SEV-SNP guest. Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Support SEV-SNP AP Creation NAE event  [Tom Lendacky, 3 files, -3/+248]
Add support for the SEV-SNP AP Creation NAE event. This allows SEV-SNP guests to alter the register state of the APs on their own. This allows the guest a way of simulating INIT-SIPI. A new event, KVM_REQ_UPDATE_PROTECTED_GUEST_STATE, is created and used so as to avoid updating the VMSA pointer while the vCPU is running.

For CREATE: The guest supplies the GPA of the VMSA to be used for the vCPU with the specified APIC ID. The GPA is saved in the svm struct of the target vCPU, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is added to the vCPU and then the vCPU is kicked.

For CREATE_ON_INIT: The guest supplies the GPA of the VMSA to be used for the vCPU with the specified APIC ID the next time an INIT is performed. The GPA is saved in the svm struct of the target vCPU.

For DESTROY: The guest indicates it wishes to stop the vCPU. The GPA is cleared from the svm struct, the KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event is added to the vCPU and then the vCPU is kicked.

The KVM_REQ_UPDATE_PROTECTED_GUEST_STATE event handler will be invoked as a result of the event or as a result of an INIT. If a new VMSA is to be installed, the VMSA guest page is set as the VMSA in the vCPU VMCB and the vCPU state is set to KVM_MP_STATE_RUNNABLE. If a new VMSA is not to be installed, the VMSA is cleared in the vCPU VMCB and the vCPU state is set to KVM_MP_STATE_HALTED to prevent it from being run.

Signed-off-by: Tom Lendacky <[email protected]> Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Add support to handle RMP nested page faults  [Brijesh Singh, 3 files, -4/+122]
When SEV-SNP is enabled in the guest, the hardware places restrictions on all memory accesses based on the contents of the RMP table. When hardware encounters an RMP check failure caused by a guest memory access, it raises a #NPF. The error code contains additional information on the access type. See the APM volume 2 for additional information. When using gmem, RMP faults resulting from mismatches between the state in the RMP table vs. what the guest expects via its page table result in KVM_EXIT_MEMORY_FAULTs being forwarded to userspace to handle. This means the only expected case that needs to be handled in the kernel is when the page size of the entry in the RMP table is larger than the mapping in the nested page table, in which case a PSMASH instruction needs to be issued to split the large RMP entry into individual 4K entries so that subsequent accesses can succeed. Signed-off-by: Brijesh Singh <[email protected]> Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
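A sketch of the one case the kernel resolves itself, the page-size mismatch, using the host snp_lookup_rmpentry() and psmash() helpers; pfn resolution, 2MB alignment, and TLB/retry details are elided and the structure is an assumption:

/* Illustrative sketch: split a 2MB RMP entry so a 4K-mapped guest access can
 * succeed when retried. */
static void sev_handle_rmp_fault(struct kvm_vcpu *vcpu, gpa_t gpa, kvm_pfn_t pfn)
{
    int rmp_level, rc;
    bool assigned;

    rc = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
    if (rc || !assigned)
        return; /* not an RMP-assigned page; nothing to do here */

    /* RMP entry is 2MB but the NPT mapping is 4K: PSMASH the RMP entry.
     * (2MB alignment handling of the pfn is elided in this sketch.) */
    if (rmp_level > PG_LEVEL_4K)
        psmash(pfn);
}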
2024-05-12  KVM: SEV: Add support to handle Page State Change VMGEXIT  [Michael Roth, 2 files, -0/+193]
SEV-SNP VMs can ask the hypervisor to change the page state in the RMP table to be private or shared using the Page State Change NAE event as defined in the GHCB specification version 2. Forward these requests to userspace as KVM_EXIT_VMGEXITs, similar to how it is done for requests that don't use a GHCB page. As with the MSR-based page-state changes, use the existing KVM_HC_MAP_GPA_RANGE hypercall format to deliver these requests to userspace via KVM_EXIT_HYPERCALL. Signed-off-by: Michael Roth <[email protected]> Co-developed-by: Brijesh Singh <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Co-developed-by: Paolo Bonzini <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Add support to handle MSR based Page State Change VMGEXIT  [Michael Roth, 1 file, -0/+48]
SEV-SNP VMs can ask the hypervisor to change the page state in the RMP table to be private or shared using the Page State Change MSR protocol as defined in the GHCB specification. When using gmem, private/shared memory is allocated through separate pools, and KVM relies on userspace issuing a KVM_SET_MEMORY_ATTRIBUTES KVM ioctl to tell the KVM MMU whether or not a particular GFN should be backed by private memory. Forward these page state change requests to userspace so that it can issue the expected KVM ioctls. The KVM MMU will handle updating the RMP entries when it is ready to map a private page into a guest. Use the existing KVM_HC_MAP_GPA_RANGE hypercall format to deliver these requests to userspace via KVM_EXIT_HYPERCALL. Signed-off-by: Michael Roth <[email protected]> Co-developed-by: Brijesh Singh <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
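A sketch of the VMM side once such a request surfaces as KVM_EXIT_HYPERCALL with KVM_HC_MAP_GPA_RANGE; error handling and page-size bits are omitted, and the attribute plumbing shown is an assumption:

#include <linux/kvm.h>
#include <linux/kvm_para.h>
#include <sys/ioctl.h>

/* Illustrative userspace sketch: convert a forwarded page-state change into a
 * memory-attributes update so the KVM MMU maps the requested backing. */
static void handle_map_gpa_range(int vm_fd, struct kvm_run *run)
{
    __u64 gpa    = run->hypercall.args[0];
    __u64 npages = run->hypercall.args[1];
    __u64 attrs  = run->hypercall.args[2];

    struct kvm_memory_attributes ma = {
        .address    = gpa,
        .size       = npages * 4096,
        .attributes = (attrs & KVM_MAP_GPA_RANGE_ENCRYPTED) ?
                      KVM_MEMORY_ATTRIBUTE_PRIVATE : 0,
    };

    ioctl(vm_fd, KVM_SET_MEMORY_ATTRIBUTES, &ma);
    run->hypercall.ret = 0; /* report success back to the guest */
}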
2024-05-12  KVM: SEV: Add support to handle GHCB GPA register VMGEXIT  [Brijesh Singh, 2 files, -6/+49]
SEV-SNP guests are required to perform a GHCB GPA registration. Before using a GHCB GPA for a vCPU the first time, a guest must register the vCPU GHCB GPA. If the hypervisor can work with the guest-requested GPA then it must respond back with the same GPA; otherwise, return -1. On VMEXIT, verify that the GHCB GPA matches the registered value. If a mismatch is detected, abort the guest. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Add KVM_SEV_SNP_LAUNCH_FINISH command  [Brijesh Singh, 1 file, -0/+127]
Add a KVM_SEV_SNP_LAUNCH_FINISH command to finalize the cryptographic launch digest which stores the measurement of the guest at launch time. Also extend the existing SNP firmware data structures to support disabling the use of Versioned Chip Endorsement Keys (VCEK) by guests as part of this command. While finalizing the launch flow, the code also issues the LAUNCH_UPDATE SNP firmware commands to encrypt/measure the initial VMSA pages for each configured vCPU, which requires setting the RMP entries for those pages to private, so also add handling to clean up the RMP entries for these pages when freeing vCPUs during shutdown. Signed-off-by: Brijesh Singh <[email protected]> Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Harald Hoyer <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command  [Brijesh Singh, 1 file, -0/+230]
A key aspect of launching an SNP guest is initializing it with a known/measured payload which is then encrypted into guest memory as pre-validated private pages and then measured into the cryptographic launch context created with KVM_SEV_SNP_LAUNCH_START so that the guest can attest itself after booting. Since all private pages are provided by guest_memfd, make use of the kvm_gmem_populate() interface to handle this. The general flow is that guest_memfd will handle allocating the pages associated with the GPA ranges being initialized by each particular call of KVM_SEV_SNP_LAUNCH_UPDATE, copying data from userspace into those pages, and then the post_populate callback will do the work of setting the RMP entries for these pages to private and issuing the SNP firmware calls to encrypt/measure them. For more information see the SEV-SNP specification. Signed-off-by: Brijesh Singh <[email protected]> Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-12  KVM: SEV: Add KVM_SEV_SNP_LAUNCH_START command  [Brijesh Singh, 2 files, -3/+174]
KVM_SEV_SNP_LAUNCH_START begins the launch process for an SEV-SNP guest. The command initializes a cryptographic digest context used to construct the measurement of the guest. Other commands can then at that point be used to load/encrypt data into the guest's initial launch image. For more information see the SEV-SNP specification. Signed-off-by: Brijesh Singh <[email protected]> Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
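A hedged userspace sketch of issuing the command through KVM_MEMORY_ENCRYPT_OP; the struct layout and policy value here are assumptions, not the final UAPI:

#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Illustrative sketch: begin the SNP launch flow for an SNP-type VM. */
static int snp_launch_start(int vm_fd, __u64 policy)
{
    struct kvm_sev_snp_launch_start start;
    struct kvm_sev_cmd cmd;

    memset(&start, 0, sizeof(start));
    start.policy = policy; /* guest policy, e.g. minimum ABI version, SMT */

    memset(&cmd, 0, sizeof(cmd));
    cmd.id   = KVM_SEV_SNP_LAUNCH_START;
    cmd.data = (__u64)(unsigned long)&start;

    /* Initializes the firmware's cryptographic launch digest context. */
    return ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
}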
2024-05-12  KVM: SEV: Add initial SEV-SNP support  [Brijesh Singh, 3 files, -2/+39]
SEV-SNP builds upon existing SEV and SEV-ES functionality while adding new hardware-based security protection. SEV-SNP adds strong memory encryption and integrity protection to help prevent malicious hypervisor-based attacks such as data replay, memory re-mapping, and more, to create an isolated execution environment. Define a new KVM_X86_SNP_VM type which makes use of these capabilities and extend the KVM_SEV_INIT2 ioctl to support it. Also add a basic helper to check whether SNP is enabled and set PFERR_PRIVATE_ACCESS for private #NPFs so they are handled appropriately by KVM MMU. Signed-off-by: Brijesh Singh <[email protected]> Co-developed-by: Michael Roth <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
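A sketch of how a VMM would pick the new type and run SEV initialization; error handling is omitted, and KVM_X86_SNP_VM, KVM_SEV_INIT2 and struct kvm_sev_init are the interfaces described above:

#include <fcntl.h>
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* Illustrative sketch: create an SNP-type VM and initialize SEV state. */
static int create_snp_vm(void)
{
    int kvm_fd = open("/dev/kvm", O_RDWR);
    int vm_fd  = ioctl(kvm_fd, KVM_CREATE_VM, KVM_X86_SNP_VM);
    struct kvm_sev_init init;
    struct kvm_sev_cmd cmd;

    memset(&init, 0, sizeof(init)); /* zero flags, default VMSA features */
    memset(&cmd, 0, sizeof(cmd));
    cmd.id   = KVM_SEV_INIT2;
    cmd.data = (__u64)(unsigned long)&init;

    ioctl(vm_fd, KVM_MEMORY_ENCRYPT_OP, &cmd);
    return vm_fd;
}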
2024-05-10  Merge tag 'loongarch-kvm-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD  [Paolo Bonzini, 4 files, -62/+57]
LoongArch KVM changes for v6.10

1. Add ParaVirt IPI support.
2. Add software breakpoint support.
3. Add mmio trace events support.
2024-05-10  Merge branch 'kvm-sev-es-ghcbv2' into HEAD  [Paolo Bonzini, 2 files, -11/+102]
While the main additions from GHCB protocol version 1 to version 2 revolve mostly around SEV-SNP support, there are a number of changes applicable to SEV-ES guests as well. Pluck a handful of patches from the SNP hypervisor patchset for GHCB-related changes that are also applicable to SEV-ES. A KVM_SEV_INIT2 field lets userspace control the maximum GHCB protocol version advertised to guests and manage compatibility across kernels/versions.
2024-05-07  KVM: SEV: Allow per-guest configuration of GHCB protocol version  [Michael Roth, 2 files, -3/+30]
The GHCB protocol version may be different from one guest to the next. Add a field to track it for each KVM instance and extend KVM_SEV_INIT2 to allow it to be configured by userspace. Now that all SEV-ES support for GHCB protocol version 2 is in place, go ahead and default to it when creating SEV-ES guests through the new KVM_SEV_INIT2 interface. Keep the older KVM_SEV_ES_INIT interface restricted to GHCB protocol version 1. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-07  KVM: SEV: Add GHCB handling for termination requests  [Michael Roth, 1 file, -0/+9]
GHCB version 2 adds support for a GHCB-based termination request that a guest can issue when it reaches an error state and wishes to inform the hypervisor that it should be terminated. Implement support for that similarly to GHCB MSR-based termination requests that are already available to SEV-ES guests via earlier versions of the GHCB protocol. See 'Termination Request' in the 'Invoking VMGEXIT' section of the GHCB specification for more details. Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-07  KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests  [Brijesh Singh, 1 file, -0/+14]
Version 2 of the GHCB specification introduced advertisement of features that are supported by the Hypervisor. Now that KVM supports version 2 of the GHCB specification, bump the maximum supported protocol version. Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-07  KVM: SEV: Add support to handle AP reset MSR protocol  [Tom Lendacky, 2 files, -8/+49]
Add support for AP Reset Hold being invoked using the GHCB MSR protocol, available in version 2 of the GHCB specification. Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Signed-off-by: Michael Roth <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-05-07  KVM: x86: Move synthetic PFERR_* sanity checks to SVM's #NPF handler  [Sean Christopherson, 1 file, -0/+9]
Move the sanity check that hardware never sets bits that collide with KVM-defined synthetic bits from kvm_mmu_page_fault() to npf_interception(), i.e. make the sanity check #NPF specific. The legacy #PF path already WARNs if _any_ of bits 63:32 are set, and the error code that comes from VMX's EPT Violation and Misconfig is 100% synthesized (KVM morphs VMX's EXIT_QUALIFICATION into error code flags). Add a compile-time assert in the legacy #PF handler to make sure that KVM-defined flags are covered by its existing sanity check on the upper bits. Opportunistically add a description of PFERR_IMPLICIT_ACCESS, since we are removing the comment that defined it. Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Kai Huang <[email protected]> Reviewed-by: Binbin Wu <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
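A sketch of the relocated check; the exact mask of KVM-defined synthetic bits and the handler's other details are assumptions:

/* Illustrative sketch: hardware must never set error-code bits that KVM
 * reserves for synthetic use (e.g. PFERR_IMPLICIT_ACCESS). The real handler
 * also passes decoded instruction bytes and SNP-specific flags. */
static int npf_interception(struct kvm_vcpu *vcpu)
{
    struct vcpu_svm *svm = to_svm(vcpu);
    u64 fault_address = svm->vmcb->control.exit_info_2;
    u64 error_code    = svm->vmcb->control.exit_info_1;

    if (WARN_ON_ONCE(error_code & PFERR_IMPLICIT_ACCESS))
        error_code &= ~PFERR_IMPLICIT_ACCESS;

    return kvm_mmu_page_fault(vcpu, fault_address, error_code, NULL, 0);
}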
2024-04-19  Merge x86 bugfixes from Linux 6.9-rc3  [Paolo Bonzini, 1 file, -1/+1]
Pull fix for SEV-SNP late disable bugs. Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-12  KVM: SEV: use u64_to_user_ptr throughout  [Paolo Bonzini, 1 file, -22/+22]
Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SEV: allow SEV-ES DebugSwap again  [Paolo Bonzini, 1 file, -1/+1]
The DebugSwap feature of SEV-ES provides a way for confidential guests to use data breakpoints. Its status is recorded in the VMSA, and therefore attestation signatures depend on whether it is enabled or not. In order to avoid invalidating the signatures depending on the host machine, it was disabled by default (see commit 5abf6dceb066, "SEV: disable SEV-ES DebugSwap by default", 2024-03-09). However, we now have a new API to create SEV VMs that allows enabling DebugSwap based on what the user tells KVM to do, and we also changed the legacy KVM_SEV_ES_INIT API to never enable DebugSwap. It is therefore possible to re-enable the feature without breaking compatibility with kernels that pre-date the introduction of DebugSwap, so go ahead. Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SEV: introduce KVM_SEV_INIT2 operation  [Paolo Bonzini, 1 file, -7/+46]
The idea that no parameter would ever be necessary when enabling SEV or SEV-ES for a VM was decidedly optimistic. In fact, in some sense it's already a parameter whether SEV or SEV-ES is desired. Another possible source of variability is the desired set of VMSA features, as that affects the measurement of the VM's initial state and cannot be changed arbitrarily by the hypervisor. Create a new sub-operation for KVM_MEMORY_ENCRYPT_OP that can take a struct, and put the new op to work by including the VMSA features as a field of the struct. The existing KVM_SEV_INIT and KVM_SEV_ES_INIT use the full set of supported VMSA features for backwards compatibility. The struct also includes the usual bells and whistles for future extensibility: a flags field that must be zero for now, and some padding at the end. Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
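A sketch of the argument struct as described; field names follow the text above, while the exact padding and layout are assumptions:

/* Illustrative sketch of the KVM_SEV_INIT2 argument. */
struct kvm_sev_init {
    __u64 vmsa_features;  /* VMSA feature bits baked into the measurement */
    __u32 flags;          /* must be zero for now */
    __u32 pad[9];         /* room for future extensibility */
};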
2024-04-11  KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA time  [Paolo Bonzini, 2 files, -8/+50]
SEV-ES allows passing custom contents for x87, SSE and AVX state into the VMSA. Allow userspace to do that with the usual KVM_SET_XSAVE API and only mark FPU contents as confidential after it has been copied and encrypted into the VMSA. Since the XSAVE state for AVX is the first, it does not need the compacted-state handling of get_xsave_addr(). However, there are other parts of XSAVE state in the VMSA that currently are not handled, and the validation logic of get_xsave_addr() is pointless to duplicate in KVM, so move get_xsave_addr() to public FPU API; it is really just a facility to operate on XSAVE state and does not expose any internal details of arch/x86/kernel/fpu. Acked-by: Dave Hansen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SEV: define VM types for SEV and SEV-ES  [Paolo Bonzini, 3 files, -3/+25]
Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SEV: introduce to_kvm_sev_info  [Paolo Bonzini, 2 files, -2/+7]
Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SEV: store VMSA features in kvm_sev_info  [Paolo Bonzini, 3 files, -10/+24]
Right now, the set of features that are stored in the VMSA upon initialization is fixed and depends on the module parameters for kvm-amd.ko. However, the hypervisor cannot really change it at will because the feature word has to match between the hypervisor and whatever computes a measurement of the VMSA for attestation purposes. Add a field to kvm_sev_info that holds the set of features to be stored in the VMSA; and query it instead of referring to the module parameters. Because KVM_SEV_INIT and KVM_SEV_ES_INIT accept no parameters, this does not yet introduce any functional change, but it paves the way for an API that allows customization of the features per-VM. Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Reviewed-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SEV: publish supported VMSA features  [Paolo Bonzini, 3 files, -2/+25]
Compute the set of features to be stored in the VMSA when KVM is initialized; move it from there into kvm_sev_info when SEV is initialized, and then into the initial VMSA. The new variable can then be used to return the set of supported features to userspace, via the KVM_GET_DEVICE_ATTR ioctl. Signed-off-by: Paolo Bonzini <[email protected]> Reviewed-by: Isaku Yamahata <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-11  KVM: SVM: Compile sev.c if and only if CONFIG_KVM_AMD_SEV=y  [Paolo Bonzini, 3 files, -40/+34]
Stop compiling sev.c when CONFIG_KVM_AMD_SEV=n, as the number of #ifdefs in sev.c is getting ridiculous, and having #ifdefs inside of SEV helpers is quite confusing. To minimize #ifdefs in code flows, #ifdef away only the kvm_x86_ops hooks and the #VMGEXIT handler. Stubs are also restricted to functions that check sev_enabled and to the destruction functions sev_free_cpu() and sev_vm_destroy(), where the style of their callers is to leave checks to the callers. Most call sites instead rely on dead code elimination to take care of functions that are guarded with sev_guest() or sev_es_guest(). Signed-off-by: Sean Christopherson <[email protected]> Co-developed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
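A sketch of the stub pattern this describes for the SVM header; the exact set of stubbed functions is an assumption:

/* Illustrative sketch: SEV functions exist only when sev.c is compiled; stubs
 * are limited to teardown paths whose callers don't themselves check
 * sev_guest()/sev_es_guest(). */
#ifdef CONFIG_KVM_AMD_SEV
int sev_mem_enc_ioctl(struct kvm *kvm, void __user *argp);
void sev_vm_destroy(struct kvm *kvm);
void sev_free_vcpu(struct kvm_vcpu *vcpu);
#else
static inline void sev_vm_destroy(struct kvm *kvm) {}
static inline void sev_free_vcpu(struct kvm_vcpu *vcpu) {}
#endif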
2024-04-11  KVM: SVM: Invert handling of SEV and SEV_ES feature flags  [Sean Christopherson, 1 file, -4/+4]
Leave SEV and SEV_ES '0' in kvm_cpu_caps by default, and instead set them in sev_set_cpu_caps() if SEV and SEV-ES support are fully enabled. Aside from the fact that sev_set_cpu_caps() is wildly misleading when it *clears* capabilities, this will allow compiling out sev.c without falsely advertising SEV/SEV-ES support in KVM_GET_SUPPORTED_CPUID. Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Michael Roth <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-ID: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2024-04-09  KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run()  [Sean Christopherson, 1 file, -0/+4]
Now that KVM uses the host save area to context switch RBP, i.e. preserves RBP for the entirety of __svm_sev_es_vcpu_run(), create a stack frame using the standard FRAME_{BEGIN,END} macros. Note, __svm_sev_es_vcpu_run() is subtly not a leaf function as it can call into ibpb_feature() via UNTRAIN_RET_VM. Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area  [Sean Christopherson, 1 file, -16/+13]
Use the host save area to preserve volatile registers that are used in __svm_sev_es_vcpu_run() to access function parameters after #VMEXIT. Like saving/restoring non-volatile registers, there's no reason not to take advantage of hardware restoring registers on #VMEXIT, as doing so shaves a few instructions and the save area is going to be accessed no matter what. Converting all register save/restore code to use the host save area also make it easier to follow the SEV-ES VMRUN flow in its entirety, as opposed to having a mix of stack-based versus host save area save/restore. Add a parameter to RESTORE_HOST_SPEC_CTRL_BODY so that the SEV-ES path doesn't need to write @spec_ctrl_intercepted to memory just to play nice with the common macro. Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area  [Sean Christopherson, 3 files, -26/+35]
Use the host save area to save/restore non-volatile (callee-saved) registers in __svm_sev_es_vcpu_run() to take advantage of hardware loading all registers from the save area on #VMEXIT. KVM still needs to save the registers it wants restored, but the loads are handled automatically by hardware. Aside from less assembly code, letting hardware do the restoration means stack frames are preserved for the entirety of __svm_sev_es_vcpu_run(). Opportunistically add a comment to call out why @svm needs to be saved across VMRUN->#VMEXIT, as it's not easy to decipher that from the macro hell. Cc: Tom Lendacky <[email protected]> Cc: Michael Roth <[email protected]> Cc: Alexey Kardashevskiy <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted  [Sean Christopherson, 1 file, -2/+2]
POP @spec_ctrl_intercepted into RAX instead of RBX when discarding it from the stack so that __svm_sev_es_vcpu_run() doesn't modify any non-volatile registers. __svm_sev_es_vcpu_run() doesn't return a value, and RAX is already clobbered multiple times in the #VMEXIT path. This will allow using the host save area to save/restore non-volatile registers in __svm_sev_es_vcpu_run(). Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run()  [Sean Christopherson, 1 file, -31/+13]
Drop 32-bit "support" from __svm_sev_es_vcpu_run(), as SEV/SEV-ES is firmly 64-bit only. The "support" was purely the result of bad copy+paste from __svm_vcpu_run(), which in turn was slightly less bad copy+paste from __vmx_vcpu_run(). Opportunistically convert to unadulterated register accesses so that it's easier (but still not easy) to follow which registers hold what arguments, and when. Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV  [Sean Christopherson, 1 file, -0/+2]
Compile (and link) __svm_sev_es_vcpu_run() if and only if SEV support is actually enabled. This will allow dropping non-existent 32-bit "support" from __svm_sev_es_vcpu_run() without causing undue confusion. Intentionally don't provide a stub (but keep the declaration), as any sane compiler, even with things like KASAN enabled, should eliminate the call to __svm_sev_es_vcpu_run() since sev_es_guest() unconditionally returns "false" if CONFIG_KVM_AMD_SEV=n. Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding  [Sean Christopherson, 1 file, -0/+1]
Unconditionally create a stack frame in __svm_vcpu_run() to play nice with unwinding via frame pointers, at least until the point where RBP is loaded with the guest's value. Don't bother conditioning the code on CONFIG_FRAME_POINTER=y, as RBP needs to be saved and restored anyways (due to it being clobbered with the guest's value); omitting the "MOV RSP, RBP" is not worth the extra #ifdef. Creating a stack frame will allow removing the OBJECT_FILES_NON_STANDARD tag from vmenter.S once __svm_sev_es_vcpu_run() is fixed to not stomp all over RBP for no reason. Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-09  KVM: SVM: Remove a useless zeroing of allocated memory  [Christophe JAILLET, 1 file, -1/+1]
Remove KVM's unnecessary zeroing of memory when allocating the pages array in sev_pin_memory() via __vmalloc(), as the array is only used to hold kernel pointers. The kmalloc() path for "small" regions doesn't zero the array, and if KVM leaks state and/or accesses uninitialized data, then the kernel has bigger problems. Signed-off-by: Christophe JAILLET <[email protected]> Link: https://lore.kernel.org/r/c7619a3d3cbb36463531a7c73ccbde9db587986c.1710004509.git.christophe.jaillet@wanadoo.fr [sean: massage changelog] Signed-off-by: Sean Christopherson <[email protected]>
2024-04-06  Merge branch 'linus' into x86/urgent, to pick up dependent commit  [Ingo Molnar, 1 file, -24/+34]
We want to fix: 0e110732473e ("x86/retpoline: Do the necessary fixup to the Zen3/4 srso return thunk for !SRSO") So merge in Linus's latest into x86/urgent to have it available. Signed-off-by: Ingo Molnar <[email protected]>