aboutsummaryrefslogtreecommitdiff
path: root/arch/s390/kernel/uv.c
AgeCommit message (Collapse)AuthorFilesLines
2024-10-29s390/uv: Retrieve UV secrets sysfs supportSteffen Eiden1-1/+23
Reflect the updated content in the query information UVC to the sysfs at /sys/firmware/query * new UV-query sysfs entry for the maximum number of retrievable secrets the UV can store for one secure guest. * new UV-query sysfs entry for the maximum number of association secrets the UV can store for one secure guest. * max_secrets contains the sum of max association and max retrievable secrets. Reviewed-by: Christoph Schlameuss <[email protected]> Signed-off-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-10-29s390/uv: Retrieve UV secrets supportSteffen Eiden1-1/+128
Provide a kernel API to retrieve secrets from the UV secret store. Add two new functions: * `uv_get_secret_metadata` - get metadata for a given secret identifier * `uv_retrieve_secret` - get the secret value for the secret index With those two functions one can extract the secret for a given secret id, if the secret is retrievable. Reviewed-by: Christoph Schlameuss <[email protected]> Signed-off-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-10-29s390/uv: Provide host-key hashes in sysfsSteffen Eiden1-0/+71
Utilize the new Query Ultravisor Keys UVC to give user space the information which host-keys are installed on the system. Create a new sysfs directory 'firmware/uv/keys' that contains the hash of the host-key and the backup host-key of that system. Additionally, the file 'all' contains the response from the UVC possibly containing more key-hashes than currently known. Reviewed-by: Janosch Frank <[email protected]> Signed-off-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-10-29s390/uv: Refactor uv-sysfs creationSteffen Eiden1-10/+22
Streamline the sysfs generation to make it more extensible. Add a function to create a sysfs entry in the uv-sysfs dir. Use this function for the query directory. Reviewed-by: Janosch Frank <[email protected]> Reviewed-by: Christoph Schlameuss <[email protected]> Signed-off-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-09-01s390/uv: convert gmap_destroy_page() from follow_page() to folio_walkDavid Hildenbrand1-6/+12
Let's get rid of another follow_page() user and perform the UV calls under PTL -- which likely should be fine. No need for an additional reference while holding the PTL: uv_destroy_folio() and uv_convert_from_secure_folio() raise the refcount, so any concurrent make_folio_secure() would see an unexpted reference and cannot set PG_arch_1 concurrently. Do we really need a writable PTE? Likely yes, because the "destroy" part is, in comparison to the export, a destructive operation. So we'll keep the writability check for now. We'll lose the secretmem check from follow_page(). Likely we don't care about that here. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-01s390/uv: drop arch_make_page_accessible()David Hildenbrand1-5/+0
All code was converted to using arch_make_folio_accessible(), let's drop arch_make_page_accessible(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Vishal Moola (Oracle) <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Janosch Frank <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-07-23s390: Remove protvirt and kvm config guards for uv codeJanosch Frank1-20/+15
Removing the CONFIG_PROTECTED_VIRTUALIZATION_GUEST ifdefs and config option as well as CONFIG_KVM ifdefs in uv files. Having this configurable has been more of a pain than a help. It's time to remove the ifdefs and the config option. Signed-off-by: Janosch Frank <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2024-06-05s390/uv: Implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLEDavid Hildenbrand1-7/+11
Let's also implement HAVE_ARCH_MAKE_FOLIO_ACCESSIBLE, so we can convert arch_make_page_accessible() to be a simple wrapper around arch_make_folio_accessible(). Unfortunately, we cannot do that in the header. There are only two arch_make_page_accessible() calls remaining in gup.c. We can now drop HAVE_ARCH_MAKE_PAGE_ACCESSIBLE completely form core-MM. We'll handle that separately, once the s390x part landed. Suggested-by: Matthew Wilcox <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Convert uv_convert_owned_from_secure() to ↵David Hildenbrand1-5/+13
uv_convert_from_secure_(folio|pte)() Let's do the same as we did for uv_destroy_(folio|pte)() and have the following variants: (1) uv_convert_from_secure(): "low level" helper that operates on paddr and does not mess with folios. (2) uv_convert_from_secure_folio(): Consumes a folio to which we hold a reference. (3) uv_convert_from_secure_pte(): Consumes a PTE that holds a reference through the mapping. Unfortunately we need uv_convert_from_secure_pte(), because pfn_folio() and friends are not available in pgtable.h. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Convert uv_destroy_owned_page() to uv_destroy_(folio|pte)()David Hildenbrand1-7/+17
Let's have the following variants for destroying pages: (1) uv_destroy(): Like uv_pin_shared() and uv_convert_from_secure(), "low level" helper that operates on paddr and doesn't mess with folios. (2) uv_destroy_folio(): Consumes a folio to which we hold a reference. (3) uv_destroy_pte(): Consumes a PTE that holds a reference through the mapping. Unfortunately we need uv_destroy_pte(), because pfn_folio() and friends are not available in pgtable.h. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Make uv_convert_from_secure() a static functionDavid Hildenbrand1-1/+1
It's not used outside of uv.c, so let's make it a static function. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Update PG_arch_1 commentDavid Hildenbrand1-5/+4
We removed the usage of PG_arch_1 for page tables in commit a51324c430db ("s390/cmma: rework no-dat handling"). Let's update the comment in UV to reflect that. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Convert PG_arch_1 users to only work on small foliosDavid Hildenbrand1-16/+25
Now that make_folio_secure() may only set PG_arch_1 for small folios, let's convert relevant remaining UV code to only work on (small) folios and simply reject large folios early. This way, we'll never end up touching PG_arch_1 on tail pages of a large folio in UV code. The folio_get()/folio_put() for functions that are documented to already hold a folio reference look weird; likely they are required to make concurrent gmap_make_secure() back off because the caller might only hold an implicit reference due to the page mapping. So leave that alone for now. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Split large folios in gmap_make_secure()David Hildenbrand1-6/+25
While s390x makes sure to never have PMD-mapped THP in processes that use KVM -- by remapping them using PTEs in thp_split_walk_pmd_entry()->split_huge_pmd() -- there is still the possibility of having PTE-mapped THPs (large folios) mapped into guest memory. This would happen if user space allocates memory before calling KVM_CREATE_VM (which would call s390_enable_sie()). With upstream QEMU, this currently doesn't happen, because guest memory is setup and conditionally preallocated after KVM_CREATE_VM. Could it happen with shmem/file-backed memory when another process allocated memory in the pagecache? Likely, although currently not a common setup. Trying to split any PTE-mapped large folios sounds like the right and future-proof thing to do here. So let's call split_folio() and handle the return values accordingly. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: gmap_make_secure() cleanups for further changesDavid Hildenbrand1-26/+40
Let's factor out handling of LRU cache draining and convert the if-else chain to a switch-case. Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-06-05s390/uv: Don't call folio_wait_writeback() without a folio referenceDavid Hildenbrand1-0/+8
folio_wait_writeback() requires that no spinlocks are held and that a folio reference is held, as documented. After we dropped the PTL, the folio could get freed concurrently. So grab a temporary reference. Fixes: 214d9bbcd3a6 ("s390/mm: provide memory management functions for protected KVM guests") Reviewed-by: Claudio Imbrenda <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2024-04-09s390/mm: Convert gmap_make_secure to use a folioMatthew Wilcox (Oracle)1-13/+14
Remove uses of deprecated page APIs, and move the check for large folios to here to avoid taking the folio lock if the folio is too large. We could do better here by attempting to split the large folio, but I'll leave that improvement for someone who can test it. Acked-by: Claudio Imbrenda <[email protected]> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexander Gordeev <[email protected]>
2024-04-09s390/mm: Convert make_page_secure to use a folioMatthew Wilcox (Oracle)1-13/+16
These page APIs are deprecated, so convert the incoming page to a folio and use the folio APIs instead. The ultravisor API cannot handle large folios, so return -EINVAL if one has slipped through. Acked-by: Claudio Imbrenda <[email protected]> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexander Gordeev <[email protected]>
2024-04-09s390/uv: export prot_virt_guest symbol in uvHolger Dengler1-0/+1
The inline function is_prot_virt_guest() in asm/uv.h makes use of the prot_virt_guest symbol. As this inline function can be called by other parts of the kernel (modules and built-in), the symbol should be exported, similar to the prot_virt_host symbol. One consumer of is_prot_virt_guest() will be the ap bus code. Cc: Janosch Frank <[email protected]> Cc: Claudio Imbrenda <[email protected]> Signed-off-by: Holger Dengler <[email protected]> Reviewed-by: Harald Freudenberger <[email protected]> Acked-by: Claudio Imbrenda <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]>
2023-08-28s390/uv: UV feature check utilitySteffen Eiden1-1/+1
Introduces a function to check the existence of an UV feature. Refactor feature bit checks to use the new function. Signed-off-by: Steffen Eiden <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Signed-off-by: Janosch Frank <[email protected]> Reviewed-by: Michael Mueller <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]>
2023-08-18s390/uv: export uv_pin_shared for direct usageJanosch Frank1-1/+2
Export the uv_pin_shared function so that it can be called from other modules that carry a GPL-compatible license. Signed-off-by: Janosch Frank <[email protected]> Signed-off-by: Tony Krowiak <[email protected]> Tested-by: Viktor Mihajlovski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2023-07-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-33/+75
Pull kvm updates from Paolo Bonzini: "ARM64: - Eager page splitting optimization for dirty logging, optionally allowing for a VM to avoid the cost of hugepage splitting in the stage-2 fault path. - Arm FF-A proxy for pKVM, allowing a pKVM host to safely interact with services that live in the Secure world. pKVM intervenes on FF-A calls to guarantee the host doesn't misuse memory donated to the hyp or a pKVM guest. - Support for running the split hypervisor with VHE enabled, known as 'hVHE' mode. This is extremely useful for testing the split hypervisor on VHE-only systems, and paves the way for new use cases that depend on having two TTBRs available at EL2. - Generalized framework for configurable ID registers from userspace. KVM/arm64 currently prevents arbitrary CPU feature set configuration from userspace, but the intent is to relax this limitation and allow userspace to select a feature set consistent with the CPU. - Enable the use of Branch Target Identification (FEAT_BTI) in the hypervisor. - Use a separate set of pointer authentication keys for the hypervisor when running in protected mode, as the host is untrusted at runtime. - Ensure timer IRQs are consistently released in the init failure paths. - Avoid trapping CTR_EL0 on systems with Enhanced Virtualization Traps (FEAT_EVT), as it is a register commonly read from userspace. - Erratum workaround for the upcoming AmpereOne part, which has broken hardware A/D state management. RISC-V: - Redirect AMO load/store misaligned traps to KVM guest - Trap-n-emulate AIA in-kernel irqchip for KVM guest - Svnapot support for KVM Guest s390: - New uvdevice secret API - CMM selftest and fixes - fix racy access to target CPU for diag 9c x86: - Fix missing/incorrect #GP checks on ENCLS - Use standard mmu_notifier hooks for handling APIC access page - Drop now unnecessary TR/TSS load after VM-Exit on AMD - Print more descriptive information about the status of SEV and SEV-ES during module load - Add a test for splitting and reconstituting hugepages during and after dirty logging - Add support for CPU pinning in demand paging test - Add support for AMD PerfMonV2, with a variety of cleanups and minor fixes included along the way - Add a "nx_huge_pages=never" option to effectively avoid creating NX hugepage recovery threads (because nx_huge_pages=off can be toggled at runtime) - Move handling of PAT out of MTRR code and dedup SVM+VMX code - Fix output of PIC poll command emulation when there's an interrupt - Add a maintainer's handbook to document KVM x86 processes, preferred coding style, testing expectations, etc. - Misc cleanups, fixes and comments Generic: - Miscellaneous bugfixes and cleanups Selftests: - Generate dependency files so that partial rebuilds work as expected" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (153 commits) Documentation/process: Add a maintainer handbook for KVM x86 Documentation/process: Add a label for the tip tree handbook's coding style KVM: arm64: Fix misuse of KVM_ARM_VCPU_POWER_OFF bit index RISC-V: KVM: Remove unneeded semicolon RISC-V: KVM: Allow Svnapot extension for Guest/VM riscv: kvm: define vcpu_sbi_ext_pmu in header RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC RISC-V: KVM: Expose APLIC registers as attributes of AIA irqchip RISC-V: KVM: Add in-kernel emulation of AIA APLIC RISC-V: KVM: Implement device interface for AIA irqchip RISC-V: KVM: Skeletal in-kernel AIA irqchip support RISC-V: KVM: Set kvm_riscv_aia_nr_hgei to zero RISC-V: KVM: Add APLIC related defines RISC-V: KVM: Add IMSIC related defines RISC-V: KVM: Implement guest external interrupt line management KVM: x86: Remove PRIx* definitions as they are solely for user space s390/uv: Update query for secret-UVCs s390/uv: replace scnprintf with sysfs_emit s390/uvdevice: Add 'Lock Secret Store' UVC ...
2023-06-19s390: allow pte_offset_map_lock() to failHugh Dickins1-0/+2
In rare transient cases, not yet made possible, pte_offset_map() and pte_offset_map_lock() may not find a page table: handle appropriately. Add comment on mm's contract with s390 above __zap_zero_pages(), and fix old comment there: must be called after THP was disabled. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Cc: Alexander Gordeev <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Aneesh Kumar K.V <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Chris Zankel <[email protected]> Cc: Claudio Imbrenda <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: John David Anglin <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Qi Zheng <[email protected]> Cc: Russell King <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-06-16s390/uv: Update query for secret-UVCsSteffen Eiden1-0/+40
Update the query struct such that secret-UVC related information can be parsed. Add sysfs files for these new values. 'supp_add_secret_req_ver' notes the supported versions for the Add Secret UVC. Bit 0 indicates that version 0x100 is supported, bit 1 indicates 0x200, and so on. 'supp_add_secret_pcf' notes the supported plaintext flags for the Add Secret UVC. 'supp_secret_types' notes the supported types of secrets. Bit 0 indicates secret type 1, bit 1 indicates type 2, and so on. 'max_secrets' notes the maximum amount of secrets the secret store can store per pv guest. Signed-off-by: Steffen Eiden <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Message-Id: <[email protected]>
2023-06-16s390/uv: replace scnprintf with sysfs_emitSteffen Eiden1-32/+26
Replace scnprintf(page, PAGE_SIZE, ...) with the page size aware sysfs_emit(buf, ...) which adds some sanity checks. Signed-off-by: Steffen Eiden <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Message-Id: <[email protected]>
2023-06-16s390/uv: Always export uv_infoSteffen Eiden1-1/+9
KVM needs the struct's values to be able to provide PV support. The uvdevice is currently guest only and will need the struct's values for call support checking and potential future expansions. As uv.c is only compiled with CONFIG_PGSTE or CONFIG_PROTECTED_VIRTUALIZATION_GUEST we don't need a second check in the code. Users of uv_info will need to fence for these two config options for the time being. Signed-off-by: Steffen Eiden <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Janosch Frank <[email protected]> Message-Id: <[email protected]>
2023-05-04KVM: s390: fix race in gmap_make_secure()Claudio Imbrenda1-21/+11
Fix a potential race in gmap_make_secure() and remove the last user of follow_page() without FOLL_GET. The old code is locking something it doesn't have a reference to, and as explained by Jason and David in this discussion: https://lore.kernel.org/linux-mm/Y9J4P%[email protected]/ it can lead to all kind of bad things, including the page getting unmapped (MADV_DONTNEED), freed, reallocated as a larger folio and the unlock_page() would target the wrong bit. There is also another race with the FOLL_WRITE, which could race between the follow_page() and the get_locked_pte(). The main point is to remove the last use of follow_page() without FOLL_GET or FOLL_PIN, removing the races can be considered a nice bonus. Link: https://lore.kernel.org/linux-mm/Y9J4P%[email protected]/ Suggested-by: Jason Gunthorpe <[email protected]> Fixes: 214d9bbcd3a6 ("s390/mm: provide memory management functions for protected KVM guests") Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Claudio Imbrenda <[email protected]> Message-Id: <[email protected]>
2022-11-23KVM: s390: pv: avoid export before import if possibleClaudio Imbrenda1-0/+7
If the appropriate UV feature bit is set, there is no need to perform an export before import. The misc feature indicates, among other things, that importing a shared page from a different protected VM will automatically also transfer its ownership. Signed-off-by: Claudio Imbrenda <[email protected]> Reviewed-by: Nico Boehr <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Reviewed-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
2022-07-13KVM: s390: pv: add export before importClaudio Imbrenda1-0/+28
Due to upcoming changes, it will be possible to temporarily have multiple protected VMs in the same address space, although only one will be actually active. In that scenario, it is necessary to perform an export of every page that is to be imported, since the hardware does not allow a page belonging to a protected guest to be imported into a different protected guest. This also applies to pages that are shared, and thus accessible by the host. Signed-off-by: Claudio Imbrenda <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
2022-07-13KVM: s390: pv: handle secure storage violations for protected guestsClaudio Imbrenda1-0/+55
A secure storage violation is triggered when a protected guest tries to access secure memory that has been mapped erroneously, or that belongs to a different protected guest or to the ultravisor. With upcoming patches, protected guests will be able to trigger secure storage violations in normal operation. This happens for example if a protected guest is rebooted with deferred destroy enabled and the new guest is also protected. When the new protected guest touches pages that have not yet been destroyed, and thus are accounted to the previous protected guest, a secure storage violation is raised. This patch adds handling of secure storage violations for protected guests. This exception is handled by first trying to destroy the page, because it is expected to belong to a defunct protected guest where a destroy should be possible. Note that a secure page can only be destroyed if its protected VM does not have any CPUs, which only happens when the protected VM is being terminated. If that fails, a normal export of the page is attempted. This means that pages that trigger the exception will be made non-secure (in one way or another) before attempting to use them again for a different secure guest. Signed-off-by: Claudio Imbrenda <[email protected]> Acked-by: Janosch Frank <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Janosch Frank <[email protected]>
2022-07-11s390: Add attestation query informationSteffen Eiden1-0/+20
We have information about the supported attestation header version and plaintext attestation flag bits. Let's expose it via the sysfs files. Signed-off-by: Steffen Eiden <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Janosch Frank <[email protected]>
2022-06-01s390/uv: Add dump fields to queryJanosch Frank1-0/+33
The new dump feature requires us to know how much memory is needed for the "dump storage state" and "dump finalize" ultravisor call. These values are reported via the UV query call. Signed-off-by: Janosch Frank <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Reviewed-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2022-06-01s390/uv: Add SE hdr query informationJanosch Frank1-0/+20
We have information about the supported se header version and pcf bits so let's expose it via the sysfs files. Signed-off-by: Janosch Frank <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Reviewed-by: Steffen Eiden <[email protected]> Link: https://lore.kernel.org/r/[email protected] Message-Id: <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]>
2021-12-16s390/uv: fix memblock virtual vs physical address confusionHeiko Carstens1-5/+5
memblock_alloc_try_nid() returns a virtual address, however in error case the allocated memory is incorrectly freed with memblock_phys_free(). Properly use memblock_free() instead, and pass a physical address to uv_init() to fix this. Note: this doesn't fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2021-11-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-1/+1
Merge misc updates from Andrew Morton: "257 patches. Subsystems affected by this patch series: scripts, ocfs2, vfs, and mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache, gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc, pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools, memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm, vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram, cleanups, kfence, and damon)" * emailed patches from Andrew Morton <[email protected]>: (257 commits) mm/damon: remove return value from before_terminate callback mm/damon: fix a few spelling mistakes in comments and a pr_debug message mm/damon: simplify stop mechanism Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Docs/admin-guide/mm/damon/start: simplify the content Docs/admin-guide/mm/damon/start: fix a wrong link Docs/admin-guide/mm/damon/start: fix wrong example commands mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on mm/damon: remove unnecessary variable initialization Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) selftests/damon: support watermarks mm/damon/dbgfs: support watermarks mm/damon/schemes: activate schemes based on a watermarks mechanism tools/selftests/damon: update for regions prioritization of schemes mm/damon/dbgfs: support prioritization weights mm/damon/vaddr,paddr: support pageout prioritization mm/damon/schemes: prioritize regions within the quotas mm/damon/selftests: support schemes quotas mm/damon/dbgfs: support quotas of schemes ...
2021-11-06memblock: rename memblock_free to memblock_phys_freeMike Rapoport1-1/+1
Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Shahab Vahedi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-10-27KVM: s390: pv: properly handle page flags for protected guestsClaudio Imbrenda1-1/+33
Introduce variants of the convert and destroy page functions that also clear the PG_arch_1 bit used to mark them as secure pages. The PG_arch_1 flag is always allowed to overindicate; using the new functions introduced here allows to reduce the extent of overindication and thus improve performance. These new functions can only be called on pages for which a reference is already being held. Signed-off-by: Claudio Imbrenda <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Reviewed-by: Christian Borntraeger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]>
2021-10-25KVM: s390: pv: avoid stalls when making pages secureClaudio Imbrenda1-6/+23
Improve make_secure_pte to avoid stalls when the system is heavily overcommitted. This was especially problematic in kvm_s390_pv_unpack, because of the loop over all pages that needed unpacking. Due to the locks being held, it was not possible to simply replace uv_call with uv_call_sched. A more complex approach was needed, in which uv_call is replaced with __uv_call, which does not loop. When the UVC needs to be executed again, -EAGAIN is returned, and the caller (or its caller) will try again. When -EAGAIN is returned, the path is the same as when the page is in writeback (and the writeback check is also performed, which is harmless). Fixes: 214d9bbcd3a672 ("s390/mm: provide memory management functions for protected KVM guests") Signed-off-by: Claudio Imbrenda <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Reviewed-by: Christian Borntraeger <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]>
2021-10-25s390/uv: fully validate the VMA before calling follow_page()David Hildenbrand1-1/+1
We should not walk/touch page tables outside of VMA boundaries when holding only the mmap sem in read mode. Evil user space can modify the VMA layout just before this function runs and e.g., trigger races with page table removal code since commit dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap"). find_vma() does not check if the address is >= the VMA start address; use vma_lookup() instead. Fixes: 214d9bbcd3a6 ("s390/mm: provide memory management functions for protected KVM guests") Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Acked-by: Heiko Carstens <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]>
2021-07-27s390/uv: de-duplicate checks for Protected Host VirtualizationAlexander Egorenkov1-15/+0
De-duplicate checks for Protected Host Virtualization in decompressor and kernel. Set prot_virt_host=0 in the decompressor in *any* of the following cases and hand it over to the decompressed kernel: * No explicit prot_virt=1 is given on the kernel command-line * Protected Guest Virtualization is enabled * Hardware support not present * kdump or stand-alone dump The decompressed kernel needs to use only is_prot_virt_host() instead of performing again all checks done by the decompressor. Signed-off-by: Alexander Egorenkov <[email protected]> Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2021-07-05s390: mm: Fix secure storage access exception handlingJanosch Frank1-0/+10
Turns out that the bit 61 in the TEID is not always 1 and if that's the case the address space ID and the address are unpredictable. Without an address and its address space ID we can't export memory and hence we can only send a SIGSEGV to the process or panic the kernel depending on who caused the exception. Unfortunately bit 61 is only reliable if we have the "misc" UV feature bit. Signed-off-by: Janosch Frank <[email protected]> Reviewed-by: Christian Borntraeger <[email protected]> Fixes: 084ea4d611a3d ("s390/mm: add (non)secure page access exceptions handlers") Cc: [email protected] Signed-off-by: Vasily Gorbik <[email protected]>
2021-06-18s390: setup kernel memory layout earlyVasily Gorbik1-7/+1
Currently there are two separate places where kernel memory layout has to be known and adjusted: 1. early kasan setup. 2. paging setup later. Those 2 places had to be kept in sync and adjusted to reflect peculiar technical details of one another. With additional factors which influence kernel memory layout like ultravisor secure storage limit, complexity of keeping two things in sync grew up even more. Besides that if we look forward towards creating identity mapping and enabling DAT before jumping into uncompressed kernel - that would also require full knowledge of and control over kernel memory layout. So, de-duplicate and move kernel memory layout setup logic into the decompressor. Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2021-04-12s390/protvirt: fix error return code in uv_info_init()zhongbaisong1-1/+3
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Baisong Zhong <[email protected]> Fixes: 37564ed834ac ("s390/uv: add prot virt guest/host indication files") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2021-03-24s390/uv: fix prot virt host indication compilationJanosch Frank1-1/+7
prot_virt_host is only available if CONFIG_KVM is enabled. So lets use a variable initialized to zero and overwrite it when that config option is set with prot_virt_host. Signed-off-by: Janosch Frank <[email protected]> Fixes: 37564ed834ac ("s390/uv: add prot virt guest/host indication files") Reported-by: kernel test robot <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2021-03-22s390/uv: add prot virt guest/host indication filesJanosch Frank1-1/+36
Let's export the prot_virt_guest and prot_virt_host variables into the UV sysfs firmware interface to make them easily consumable by administrators. prot_virt_host being 1 indicates that we did the UV initialization (opt-in) prot_virt_guest being 1 indicates that the UV indicates the share and unshare ultravisor calls which is an indication that we are running as a protected guest. Signed-off-by: Janosch Frank <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2021-01-27s390: uv: Fix sysfs max number of VCPUs reportingJanosch Frank1-1/+1
The number reported by the query is N-1 and I think people reading the sysfs file would expect N instead. For users creating VMs there's no actual difference because KVM's limit is currently below the UV's limit. Signed-off-by: Janosch Frank <[email protected]> Fixes: a0f60f8431999 ("s390/protvirt: Add sysfs firmware interface for Ultravisor information") Cc: [email protected] Reviewed-by: Claudio Imbrenda <[email protected]> Acked-by: Cornelia Huck <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2020-11-18s390/uv: handle destroy page legacy interfaceChristian Borntraeger1-1/+8
Older firmware can return rc=0x107 rrc=0xd for destroy page if the page is already non-secure. This should be handled like a success as already done by newer firmware. Signed-off-by: Christian Borntraeger <[email protected]> Fixes: 1a80b54d1ce1 ("s390/uv: add destroy page call") Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Cornelia Huck <[email protected]> Reviewed-by: Janosch Frank <[email protected]>
2020-09-16s390/kasan: support protvirt with 4-level pagingVasily Gorbik1-0/+3
Currently the kernel crashes in Kasan instrumentation code if CONFIG_KASAN_S390_4_LEVEL_PAGING is used on protected virtualization capable machine where the ultravisor imposes addressing limitations on the host and those limitations are lower then KASAN_SHADOW_OFFSET. The problem is that Kasan has to know in advance where vmalloc/modules areas would be. With protected virtualization enabled vmalloc/modules areas are moved down to the ultravisor secure storage limit while kasan still expects them at the very end of 4-level paging address space. To fix that make Kasan recognize when protected virtualization is enabled and predefine vmalloc/modules areas position which are compliant with ultravisor secure storage limit. Kasan shadow itself stays in place and might reside above that ultravisor secure storage limit. One slight difference compaired to a kernel without Kasan enabled is that vmalloc/modules areas position is not reverted to default if ultravisor initialization fails. It would still be below the ultravisor secure storage limit. Kernel layout with kasan, 4-level paging and protected virtualization enabled (ultravisor secure storage limit is at 0x0000800000000000): ---[ vmemmap Area Start ]--- 0x0000400000000000-0x0000400080000000 ---[ vmemmap Area End ]--- ---[ vmalloc Area Start ]--- 0x00007fe000000000-0x00007fff80000000 ---[ vmalloc Area End ]--- ---[ Modules Area Start ]--- 0x00007fff80000000-0x0000800000000000 ---[ Modules Area End ]--- ---[ Kasan Shadow Start ]--- 0x0018000000000000-0x001c000000000000 ---[ Kasan Shadow End ]--- 0x001c000000000000-0x0020000000000000 1P PGD I Kernel layout with kasan, 4-level paging and protected virtualization disabled/unsupported: ---[ vmemmap Area Start ]--- 0x0000400000000000-0x0000400060000000 ---[ vmemmap Area End ]--- ---[ Kasan Shadow Start ]--- 0x0018000000000000-0x001c000000000000 ---[ Kasan Shadow End ]--- ---[ vmalloc Area Start ]--- 0x001fffe000000000-0x001fffff80000000 ---[ vmalloc Area End ]--- ---[ Modules Area Start ]--- 0x001fffff80000000-0x0020000000000000 ---[ Modules Area End ]--- Signed-off-by: Vasily Gorbik <[email protected]>
2020-09-16s390/protvirt: support ultravisor without secure storage limitVasily Gorbik1-1/+2
Avoid potential crash due to lack of secure storage limit. Check that max_sec_stor_addr is not 0 before adjusting vmalloc position. Signed-off-by: Vasily Gorbik <[email protected]>
2020-09-16s390/protvirt: parse prot_virt option in the decompressorVasily Gorbik1-24/+16
To make early kernel address space layout definition possible parse prot_virt option in the decompressor and pass it to the uncompressed kernel. This enables kasan to take ultravisor secure storage limit into consideration and pre-define vmalloc position correctly. Signed-off-by: Vasily Gorbik <[email protected]>