aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/boot/compressed
AgeCommit message (Collapse)AuthorFilesLines
2022-12-13Merge tag 'x86_boot_for_v6.2' of ↵Linus Torvalds6-495/+506
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Borislav Petkov: "A of early boot cleanups and fixes. - Do some spring cleaning to the compressed boot code by moving the EFI mixed-mode code to a separate compilation unit, the AMD memory encryption early code where it belongs and fixing up build dependencies. Make the deprecated EFI handover protocol optional with the goal of removing it at some point (Ard Biesheuvel) - Skip realmode init code on Xen PV guests as it is not needed there - Remove an old 32-bit PIC code compiler workaround" * tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Remove x86_32 PIC using %ebx workaround x86/boot: Skip realmode init code when running as Xen PV guest x86/efi: Make the deprecated EFI handover protocol optional x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=y x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit() x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.S x86/boot/compressed: Move startup32_check_sev_cbit() into .text x86/boot/compressed: Move startup32_load_idt() out of head_64.S x86/boot/compressed: Move startup32_load_idt() into .text section x86/boot/compressed: Pull global variable reference into startup32_load_idt() x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry() x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunk x86/boot/compressed, efi: Merge multiple definitions of image_offset into one x86/boot/compressed: Move efi32_pe_entry() out of head_64.S x86/boot/compressed: Move efi32_entry out of head_64.S x86/boot/compressed: Move efi32_pe_entry into .text section x86/boot/compressed: Move bootargs parsing out of 32-bit startup code x86/boot/compressed: Move 32-bit entrypoint code into .text section x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S
2022-12-13Merge tag 'efi-next-for-v6.2' of ↵Linus Torvalds1-6/+0
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: "Another fairly sizable pull request, by EFI subsystem standards. Most of the work was done by me, some of it in collaboration with the distro and bootloader folks (GRUB, systemd-boot), where the main focus has been on removing pointless per-arch differences in the way EFI boots a Linux kernel. - Refactor the zboot code so that it incorporates all the EFI stub logic, rather than calling the decompressed kernel as a EFI app. - Add support for initrd= command line option to x86 mixed mode. - Allow initrd= to be used with arbitrary EFI accessible file systems instead of just the one the kernel itself was loaded from. - Move some x86-only handling and manipulation of the EFI memory map into arch/x86, as it is not used anywhere else. - More flexible handling of any random seeds provided by the boot environment (i.e., systemd-boot) so that it becomes available much earlier during the boot. - Allow improved arch-agnostic EFI support in loaders, by setting a uniform baseline of supported features, and adding a generic magic number to the DOS/PE header. This should allow loaders such as GRUB or systemd-boot to reduce the amount of arch-specific handling substantially. - (arm64) Run EFI runtime services from a dedicated stack, and use it to recover from synchronous exceptions that might occur in the firmware code. - (arm64) Ensure that we don't allocate memory outside of the 48-bit addressable physical range. - Make EFI pstore record size configurable - Add support for decoding CXL specific CPER records" * tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (43 commits) arm64: efi: Recover from synchronous exceptions occurring in firmware arm64: efi: Execute runtime services from a dedicated stack arm64: efi: Limit allocations to 48-bit addressable physical region efi: Put Linux specific magic number in the DOS header efi: libstub: Always enable initrd command line loader and bump version efi: stub: use random seed from EFI variable efi: vars: prohibit reading random seed variables efi: random: combine bootloader provided RNG seed with RNG protocol output efi/cper, cxl: Decode CXL Error Log efi/cper, cxl: Decode CXL Protocol Error Section efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment efi: x86: Move EFI runtime map sysfs code to arch/x86 efi: runtime-maps: Clarify purpose and enable by default for kexec efi: pstore: Add module parameter for setting the record size efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures efi: memmap: Move manipulation routines into x86 arch tree efi: memmap: Move EFI fake memmap support into x86 arch tree efi: libstub: Undeprecate the command line initrd loader efi: libstub: Add mixed mode support to command line initrd loader efi: libstub: Permit mixed mode return types other than efi_status_t ...
2022-11-24x86/efi: Make the deprecated EFI handover protocol optionalArd Biesheuvel1-1/+3
The EFI handover protocol permits a bootloader to invoke the kernel as a EFI PE/COFF application, while passing a bootparams struct as a third argument to the entrypoint function call. This has no basis in the UEFI specification, and there are better ways to pass additional data to a UEFI application (UEFI configuration tables, UEFI variables, UEFI protocols) than going around the StartImage() boot service and jumping to a fixed offset in the loaded image, just to call a different function that takes a third parameter. The reason for handling struct bootparams in the bootloader was that the EFI stub could only load initrd images from the EFI system partition, and so passing it via struct bootparams was needed for loaders like GRUB, which pass the initrd in memory, and may load it from anywhere, including from the network. Another motivation was EFI mixed mode, which could not use the initrd loader in the EFI stub at all due to 32/64 bit incompatibilities (which will be fixed shortly [0]), and could not invoke the ordinary PE/COFF entry point either, for the same reasons. Given that loaders such as GRUB already carried the bootparams handling in order to implement non-EFI boot, retaining that code and just passing bootparams to the EFI stub was a reasonable choice (although defining an alternate entrypoint could have been avoided.) However, the GRUB side changes never made it upstream, and are only shipped by some of the distros in their downstream versions. In the meantime, EFI support has been added to other Linux architecture ports, as well as to U-boot and systemd, including arch-agnostic methods for passing initrd images in memory [1], and for doing mixed mode boot [2], none of them requiring anything like the EFI handover protocol. So given that only out-of-tree distro GRUB relies on this, let's permit it to be omitted from the build, in preparation for retiring it completely at a later date. (Note that systemd-boot does have an implementation as well, but only uses it as a fallback for booting images that do not implement the LoadFile2 based initrd loading method, i.e., v5.8 or older) [0] https://lore.kernel.org/all/[email protected]/ [1] ec93fc371f01 ("efi/libstub: Add support for loading the initrd from a device path") [2] 97aa276579b2 ("efi/x86: Add true mixed mode entry point into .compat section") Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=yArd Biesheuvel2-3/+1
Avoid building the mem_encrypt.o object if memory encryption support is not enabled to begin with. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit()Ard Biesheuvel2-12/+3
Make get_sev_encryption_bit() follow the ordinary i386 calling convention, and only call it if CONFIG_AMD_MEM_ENCRYPT is actually enabled. This clarifies the calling code, and makes it more maintainable. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.SArd Biesheuvel2-71/+68
Now that the startup32_check_sev_cbit() routine can execute from anywhere and behaves like an ordinary function, it can be moved where it belongs. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Move startup32_check_sev_cbit() into .textArd Biesheuvel1-16/+19
Move startup32_check_sev_cbit() into the .text section and turn it into an ordinary function using the ordinary 32-bit calling convention, instead of saving/restoring the registers that are known to be live at the only call site. This improves maintainability, and makes it possible to move this function out of head_64.S and into a separate compilation unit that is specific to memory encryption. Note that this requires the call site to be moved before the mixed mode check, as %eax will be live otherwise. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Move startup32_load_idt() out of head_64.SArd Biesheuvel2-73/+71
Now that startup32_load_idt() has been refactored into an ordinary callable function, move it into mem-encrypt.S where it belongs. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Move startup32_load_idt() into .text sectionArd Biesheuvel1-11/+20
Convert startup32_load_idt() into an ordinary function and move it into the .text section. This involves turning the rva() immediates into ones derived from a local label, and preserving/restoring the %ebp and %ebx as per the calling convention. Also move the #ifdef to the only existing call site. This makes it clear that the function call does nothing if support for memory encryption is not compiled in. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Pull global variable reference into startup32_load_idt()Ard Biesheuvel1-12/+8
In preparation for moving startup32_load_idt() out of head_64.S and turning it into an ordinary function using the ordinary 32-bit calling convention, pull the global variable reference to boot32_idt up into startup32_load_idt() so that startup32_set_idt_entry() does not need to discover its own runtime physical address, which will no longer be correlated with startup_32 once this code is moved into .text. While at it, give startup32_set_idt_entry() static linkage. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry()Ard Biesheuvel1-6/+2
Avoid touching register %ecx in startup32_set_idt_entry(), by folding the MOV, SHL and ORL instructions into a single ORL which no longer requires a temp register. This permits ECX to be used as a function argument in a subsequent patch. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunkArd Biesheuvel1-13/+7
Tweak the asm and remove some redundant instructions. While at it, fix the associated comment for style and correctness. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-24x86/boot/compressed, efi: Merge multiple definitions of image_offset into oneArd Biesheuvel2-8/+0
There is no need for head_32.S and head_64.S both declaring a copy of the global 'image_offset' variable, so drop those and make the extern C declaration the definition. When image_offset is moved to the .c file, it needs to be placed particularly in the .data section because it lands by default in the .bss section which is cleared too late, in .Lrelocated, before the first access to it and thus garbage gets read, leading to SEV guests exploding in early boot. This happens only when the SEV guest kernel is loaded through grub. If supplied with qemu's -kernel command line option, that memory is always cleared upfront by qemu and all is fine there. [ bp: Expand commit message with SEV aspect. ] Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-22x86/boot/compressed: Move efi32_pe_entry() out of head_64.SArd Biesheuvel2-86/+83
Move the implementation of efi32_pe_entry() into efi-mixed.S, which is a more suitable location that only gets built if EFI mixed mode is actually enabled. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-22x86/boot/compressed: Move efi32_entry out of head_64.SArd Biesheuvel2-55/+47
Move the efi32_entry() routine out of head_64.S and into efi-mixed.S, which reduces clutter in the complicated startup routines. It also permits linkage of some symbols used by code to be made local. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-22x86/boot/compressed: Move efi32_pe_entry into .text sectionArd Biesheuvel1-6/+5
Move efi32_pe_entry() into the .text section, so that it can be moved out of head_64.S and into a separate compilation unit in a subsequent patch. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-22x86/boot/compressed: Move bootargs parsing out of 32-bit startup codeArd Biesheuvel2-20/+47
Move the logic that chooses between the different EFI entrypoints out of the 32-bit boot path, and into a 64-bit helper that can perform the same task much more cleanly. While at it, document the mixed mode boot flow in a code comment. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-22x86/boot/compressed: Move 32-bit entrypoint code into .text sectionArd Biesheuvel1-14/+34
Move the code that stores the arguments passed to the EFI entrypoint into the .text section, so that it can be moved into a separate compilation unit in a subsequent patch. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-22x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.SArd Biesheuvel2-3/+3
In preparation for moving the mixed mode specific code out of head_64.S, rename the existing file to clarify that it contains more than just the mixed mode thunk. While at it, clean up the Makefile rules that add it to the build. Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-19x86/kaslr: Fix process_mem_region()'s return valueJiapeng Chong1-1/+1
Fix the following coccicheck warning: ./arch/x86/boot/compressed/kaslr.c:670:8-9: WARNING: return of 0/1 in function 'process_mem_region' with return type bool. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-18efi: libstub: Permit mixed mode return types other than efi_status_tArd Biesheuvel1-6/+0
Rework the EFI stub macro wrappers around protocol method calls and other indirect calls in order to allow return types other than efi_status_t. This means the widening should be conditional on whether or not the return type is efi_status_t, and should be omitted otherwise. Also, switch to _Generic() to implement the type based compile time conditionals, which is more concise, and distinguishes between efi_status_t and u64 properly. Signed-off-by: Ard Biesheuvel <[email protected]>
2022-11-01kbuild: upgrade the orphan section warning to an error if CONFIG_WERROR is setXin Li1-1/+1
Andrew Cooper suggested upgrading the orphan section warning to a hard link error. However Nathan Chancellor said outright turning the warning into an error with no escape hatch might be too aggressive, as we have had these warnings triggered by new compiler generated sections, and suggested turning orphan sections into an error only if CONFIG_WERROR is set. Kees Cook echoed and emphasized that the mandate from Linus is that we should avoid breaking builds. It wrecks bisection, it causes problems across compiler versions, etc. Thus upgrade the orphan section warning to a hard link error only if CONFIG_WERROR is set. Suggested-by: Andrew Cooper <[email protected]> Suggested-by: Nathan Chancellor <[email protected]> Signed-off-by: Xin Li <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-17arch: Introduce CONFIG_FUNCTION_ALIGNMENTPeter Zijlstra1-0/+8
Generic function-alignment infrastructure. Architectures can select FUNCTION_ALIGNMENT_xxB symbols; the FUNCTION_ALIGNMENT symbol is then set to the largest such selected size, 0 otherwise. From this the -falign-functions compiler argument and __ALIGN macro are set. This incorporates the DEBUG_FORCE_FUNCTION_ALIGN_64B knob and future alignment requirements for x86_64 (later in this series) into a single place. NOTE: also removes the 0x90 filler byte from the generic __ALIGN primitive, that value makes no sense outside of x86. NOTE: .balign 0 reverts to a no-op. Requested-by: Linus Torvalds <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-10-10Merge tag 'mm-stable-2022-10-08' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in linux-next for a couple of months without, to my knowledge, any negative reports (or any positive ones, come to that). - Also the Maple Tree from Liam Howlett. An overlapping range-based tree for vmas. It it apparently slightly more efficient in its own right, but is mainly targeted at enabling work to reduce mmap_lock contention. Liam has identified a number of other tree users in the kernel which could be beneficially onverted to mapletrees. Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat at [1]. This has yet to be addressed due to Liam's unfortunately timed vacation. He is now back and we'll get this fixed up. - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses clang-generated instrumentation to detect used-unintialized bugs down to the single bit level. KMSAN keeps finding bugs. New ones, as well as the legacy ones. - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of memory into THPs. - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support file/shmem-backed pages. - userfaultfd updates from Axel Rasmussen - zsmalloc cleanups from Alexey Romanov - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure - Huang Ying adds enhancements to NUMA balancing memory tiering mode's page promotion, with a new way of detecting hot pages. - memcg updates from Shakeel Butt: charging optimizations and reduced memory consumption. - memcg cleanups from Kairui Song. - memcg fixes and cleanups from Johannes Weiner. - Vishal Moola provides more folio conversions - Zhang Yi removed ll_rw_block() :( - migration enhancements from Peter Xu - migration error-path bugfixes from Huang Ying - Aneesh Kumar added ability for a device driver to alter the memory tiering promotion paths. For optimizations by PMEM drivers, DRM drivers, etc. - vma merging improvements from Jakub Matěn. - NUMA hinting cleanups from David Hildenbrand. - xu xin added aditional userspace visibility into KSM merging activity. - THP & KSM code consolidation from Qi Zheng. - more folio work from Matthew Wilcox. - KASAN updates from Andrey Konovalov. - DAMON cleanups from Kaixu Xia. - DAMON work from SeongJae Park: fixes, cleanups. - hugetlb sysfs cleanups from Muchun Song. - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core. Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1] * tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits) hugetlb: allocate vma lock for all sharable vmas hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer hugetlb: fix vma lock handling during split vma and range unmapping mglru: mm/vmscan.c: fix imprecise comments mm/mglru: don't sync disk for each aging cycle mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol mm: memcontrol: use do_memsw_account() in a few more places mm: memcontrol: deprecate swapaccounting=0 mode mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled mm/secretmem: remove reduntant return value mm/hugetlb: add available_huge_pages() func mm: remove unused inline functions from include/linux/mm_inline.h selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd selftests/vm: add thp collapse shmem testing selftests/vm: add thp collapse file and tmpfs testing selftests/vm: modularize thp collapse memory operations selftests/vm: dedup THP helpers mm/khugepaged: add tracepoint to hpage_collapse_scan_file() mm/madvise: add file and shmem support to MADV_COLLAPSE ...
2022-10-03x86: kmsan: disable instrumentation of unsupported codeAlexander Potapenko1-0/+1
Instrumenting some files with KMSAN will result in kernel being unable to link, boot or crashing at runtime for various reasons (e.g. infinite recursion caused by instrumentation hooks calling instrumented code again). Completely omit KMSAN instrumentation in the following places: - arch/x86/boot and arch/x86/realmode/rm, as KMSAN doesn't work for i386; - arch/x86/entry/vdso, which isn't linked with KMSAN runtime; - three files in arch/x86/kernel - boot problems; - arch/x86/mm/cpu_entry_area.c - recursion. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexander Potapenko <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Ilya Leoshkevich <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Kees Cook <[email protected]> Cc: Marco Elver <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vegard Nossum <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-09-29kbuild: build init/built-in.a just onceMasahiro Yamada1-0/+1
Kbuild builds init/built-in.a twice; first during the ordinary directory descending, second from scripts/link-vmlinux.sh. We do this because UTS_VERSION contains the build version and the timestamp. We cannot update it during the normal directory traversal since we do not yet know if we need to update vmlinux. UTS_VERSION is temporarily calculated, but omitted from the update check. Otherwise, vmlinux would be rebuilt every time. When Kbuild results in running link-vmlinux.sh, it increments the version number in the .version file and takes the timestamp at that time to really fix UTS_VERSION. However, updating the same file twice is a footgun. To avoid nasty timestamp issues, all build artifacts that depend on init/built-in.a are atomically generated in link-vmlinux.sh, where some of them do not need rebuilding. To fix this issue, this commit changes as follows: [1] Split UTS_VERSION out to include/generated/utsversion.h from include/generated/compile.h include/generated/utsversion.h is generated just before the vmlinux link. It is generated under include/generated/ because some decompressors (s390, x86) use UTS_VERSION. [2] Split init_uts_ns and linux_banner out to init/version-timestamp.c from init/version.c init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary directory descending, they are compiled with __weak and used to determine if vmlinux needs relinking. Just before the vmlinux link, they are compiled without __weak to embed the real version and timestamp. Signed-off-by: Masahiro Yamada <[email protected]>
2022-08-24x86/boot: Don't propagate uninitialized boot_params->cc_blob_addressMichael Roth2-1/+19
In some cases, bootloaders will leave boot_params->cc_blob_address uninitialized rather than zeroing it out. This field is only meant to be set by the boot/compressed kernel in order to pass information to the uncompressed kernel when SEV-SNP support is enabled. Therefore, there are no cases where the bootloader-provided values should be treated as anything other than garbage. Otherwise, the uncompressed kernel may attempt to access this bogus address, leading to a crash during early boot. Normally, sanitize_boot_params() would be used to clear out such fields but that happens too late: sev_enable() may have already initialized it to a valid value that should not be zeroed out. Instead, have sev_enable() zero it out unconditionally beforehand. Also ensure this happens for !CONFIG_AMD_MEM_ENCRYPT as well by also including this handling in the sev_enable() stub function. [ bp: Massage commit message and comments. ] Fixes: b190a043c49a ("x86/sev: Add SEV-SNP feature detection/setup") Reported-by: Jeremi Piotrowski <[email protected]> Reported-by: [email protected] Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Link: https://bugzilla.kernel.org/show_bug.cgi?id=216387 Link: https://lore.kernel.org/r/[email protected]
2022-08-10x86: link vdso and boot with -z noexecstack --no-warn-rwx-segmentsNick Desaulniers1-0/+4
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple instances of a new warning when linking kernels in the form: ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions Generally, we would like to avoid the stack being executable. Because there could be a need for the stack to be executable, assembler sources have to opt-in to this security feature via explicit creation of the .note.GNU-stack feature (which compilers create by default) or command line flag --noexecstack. Or we can simply tell the linker the production of such sections is irrelevant and to link the stack as --noexecstack. LLVM's LLD linker defaults to -z noexecstack, so this flag isn't strictly necessary when linking with LLD, only BFD, but it doesn't hurt to be explicit here for all linkers IMO. --no-warn-rwx-segments is currently BFD specific and only available in the current latest release, so it's wrapped in an ld-option check. While the kernel makes extensive usage of ELF sections, it doesn't use permissions from ELF segments. Link: https://lore.kernel.org/linux-block/[email protected]/ Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107 Link: https://github.com/llvm/llvm-project/issues/57009 Reported-and-tested-by: Jens Axboe <[email protected]> Suggested-by: Fangrui Song <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-07-06x86/compressed/64: Add identity mappings for setup_data entriesMichael Roth1-0/+13
The decompressed kernel initially relies on the identity map set up by the boot/compressed kernel for accessing things like boot_params. With the recent introduction of SEV-SNP support, the decompressed kernel also needs to access the setup_data entries pointed to by boot_params->hdr.setup_data. This can lead to a crash in the kexec kernel during early boot due to these entries not currently being included in the initial identity map, see thread at Link below. Include mappings for the setup_data entries in the initial identity map. [ bp: Massage commit message and use a helper var for better readability. ] Fixes: b190a043c49a ("x86/sev: Add SEV-SNP feature detection/setup") Reported-by: Jun'ichi Nomura <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/TYCPR01MB694815CD815E98945F63C99183B49@TYCPR01MB6948.jpnprd01.prod.outlook.com
2022-05-23Merge tag 'x86_tdx_for_v5.19_rc1' of ↵Linus Torvalds8-7/+132
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull Intel TDX support from Borislav Petkov: "Intel Trust Domain Extensions (TDX) support. This is the Intel version of a confidential computing solution called Trust Domain Extensions (TDX). This series adds support to run the kernel as part of a TDX guest. It provides similar guest protections to AMD's SEV-SNP like guest memory and register state encryption, memory integrity protection and a lot more. Design-wise, it differs from AMD's solution considerably: it uses a software module which runs in a special CPU mode called (Secure Arbitration Mode) SEAM. As the name suggests, this module serves as sort of an arbiter which the confidential guest calls for services it needs during its lifetime. Just like AMD's SNP set, this series reworks and streamlines certain parts of x86 arch code so that this feature can be properly accomodated" * tag 'x86_tdx_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) x86/tdx: Fix RETs in TDX asm x86/tdx: Annotate a noreturn function x86/mm: Fix spacing within memory encryption features message x86/kaslr: Fix build warning in KASLR code in boot stub Documentation/x86: Document TDX kernel architecture ACPICA: Avoid cache flush inside virtual machines x86/tdx/ioapic: Add shared bit for IOAPIC base address x86/mm: Make DMA memory shared for TD guest x86/mm/cpa: Add support for TDX shared memory x86/tdx: Make pages shared in ioremap() x86/topology: Disable CPU online/offline control for TDX guests x86/boot: Avoid #VE during boot for TDX platforms x86/boot: Set CR0.NE early and keep it set during the boot x86/acpi/x86/boot: Add multiprocessor wake-up support x86/boot: Add a trampoline for booting APs via firmware handoff x86/tdx: Wire up KVM hypercalls x86/tdx: Port I/O: Add early boot support x86/tdx: Port I/O: Add runtime hypercalls x86/boot: Port I/O: Add decompression-time support for TDX x86/boot: Port I/O: Allow to hook up alternative helpers ...
2022-04-20x86/boot: Put globals that are accessed early into the .data sectionMichael Roth2-2/+6
The helpers in arch/x86/boot/compressed/efi.c might be used during early boot to access the EFI system/config tables, and in some cases these EFI helpers might attempt to print debug/error messages, before console_init() has been called. __putstr() checks some variables to avoid printing anything before the console has been initialized, but this isn't enough since those variables live in .bss, which may not have been cleared yet. This can lead to a triple-fault occurring, primarily when booting in legacy/CSM mode (where EFI helpers will attempt to print some debug messages). Fix this by declaring these globals in .data section instead so there is no dependency on .bss being cleared before accessing them. Fixes: c01fce9cef849 ("x86/compressed: Add SEV-SNP feature detection/setup") Reported-by: Borislav Petkov <[email protected]> Suggested-by: Thomas Lendacky <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-17x86/boot: Add an efi.h header for the decompressorBorislav Petkov6-9/+131
Copy the needed symbols only and remove the kernel proper includes. No functional changes. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/boot: Avoid #VE during boot for TDX platformsSean Christopherson2-3/+19
There are a few MSRs and control register bits that the kernel normally needs to modify during boot. But, TDX disallows modification of these registers to help provide consistent security guarantees. Fortunately, TDX ensures that these are all in the correct state before the kernel loads, which means the kernel does not need to modify them. The conditions to avoid are: * Any writes to the EFER MSR * Clearing CR4.MCE This theoretically makes the guest boot more fragile. If, for instance, EFER was set up incorrectly and a WRMSR was performed, it will trigger early exception panic or a triple fault, if it's before early exceptions are set up. However, this is likely to trip up the guest BIOS long before control reaches the kernel. In any case, these kinds of problems are unlikely to occur in production environments, and developers have good debug tools to fix them quickly. Change the common boot code to work on TDX and non-TDX systems. This should have no functional effect on non-TDX systems. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]> Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-07x86/boot: Set CR0.NE early and keep it set during the bootKirill A. Shutemov1-3/+4
TDX guest requires CR0.NE to be set. Clearing the bit triggers #GP(0). If CR0.NE is 0, the MS-DOS compatibility mode for handling floating-point exceptions is selected. In this mode, the software exception handler for floating-point exceptions is invoked externally using the processor’s FERR#, INTR, and IGNNE# pins. Using FERR# and IGNNE# to handle floating-point exception is deprecated. CR0.NE=0 also limits newer processors to operate with one logical processor active. Kernel uses CR0_STATE constant to initialize CR0. It has NE bit set. But during early boot kernel has more ad-hoc approach to setting bit in the register. During some of this ad-hoc manipulation, CR0.NE is cleared. This causes a #GP in TDX guests and makes it die in early boot. Make CR0 initialization consistent, deriving the initial value of CR0 from CR0_STATE. Since CR0_STATE always has CR0.NE=1, this ensures that CR0.NE is never 0 and avoids the #GP. Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-07x86/boot: Port I/O: Add decompression-time support for TDXKirill A. Shutemov3-1/+65
Port I/O instructions trigger #VE in the TDX environment. In response to the exception, kernel emulates these instructions using hypercalls. But during early boot, on the decompression stage, it is cumbersome to deal with #VE. It is cleaner to go to hypercalls directly, bypassing #VE handling. Hook up TDX-specific port I/O helpers if booting in TDX environment. Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-07x86/boot: Port I/O: Allow to hook up alternative helpersKirill A. Shutemov2-1/+5
Port I/O instructions trigger #VE in the TDX environment. In response to the exception, kernel emulates these instructions using hypercalls. But during early boot, on the decompression stage, it is cumbersome to deal with #VE. It is cleaner to go to hypercalls directly, bypassing #VE handling. Add a way to hook up alternative port I/O helpers in the boot stub with a new pio_ops structure. For now, set the ops structure to just call the normal I/O operation functions. out*()/in*() macros redefined to use pio_ops callbacks. It eliminates need in changing call sites. io_delay() changed to use port I/O helper instead of inline assembly. Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-07x86: Consolidate port I/O helpersKirill A. Shutemov1-1/+1
There are two implementations of port I/O helpers: one in the kernel and one in the boot stub. Move the helpers required for both to <asm/shared/io.h> and use the one implementation everywhere. Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-07x86/tdx: Detect TDX at early kernel decompression timeKuppuswamy Sathyanarayanan5-0/+40
The early decompression code does port I/O for its console output. But, handling the decompression-time port I/O demands a different approach from normal runtime because the IDT required to support #VE based port I/O emulation is not yet set up. Paravirtualizing I/O calls during the decompression step is acceptable because the decompression code doesn't have a lot of call sites to IO instruction. To support port I/O in decompression code, TDX must be detected before the decompression code might do port I/O. Detect whether the kernel runs in a TDX guest. Add an early_is_tdx_guest() interface to query the cached TDX guest status in the decompression code. TDX is detected with CPUID. Make cpuid_count() accessible outside boot/cpuflags.c. TDX detection in the main kernel is very similar. Move common bits into <asm/shared/tdx.h>. The actual port I/O paravirtualization will come later in the series. Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]> Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-04-07x86/sev: Use firmware-validated CPUID for SEV-SNP guestsMichael Roth1-37/+0
SEV-SNP guests will be provided the location of special 'secrets' and 'CPUID' pages via the Confidential Computing blob. This blob is provided to the run-time kernel either through a boot_params field that was initialized by the boot/compressed kernel, or via a setup_data structure as defined by the Linux Boot Protocol. Locate the Confidential Computing blob from these sources and, if found, use the provided CPUID page/table address to create a copy that the run-time kernel will use when servicing CPUID instructions via a #VC handler. Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/sev: Add SEV-SNP feature detection/setupMichael Roth1-27/+0
Initial/preliminary detection of SEV-SNP is done via the Confidential Computing blob. Check for it prior to the normal SEV/SME feature initialization, and add some sanity checks to confirm it agrees with SEV-SNP CPUID/MSR bits. Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/compressed/64: Add identity mapping for Confidential Computing blobMichael Roth3-1/+25
The run-time kernel will need to access the Confidential Computing blob very early during boot to access the CPUID table it points to. At that stage, it will be relying on the identity-mapped page table set up by the boot/compressed kernel, so make sure the blob and the CPUID table it points to are mapped in advance. [ bp: Massage. ] Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/compressed: Export and rename add_identity_map()Michael Roth2-9/+10
SEV-specific code will need to add some additional mappings, but doing this within ident_map_64.c requires some SEV-specific helpers to be exported and some SEV-specific struct definitions to be pulled into ident_map_64.c. Instead, export add_identity_map() so SEV-specific (and other subsystem-specific) code can be better contained outside of ident_map_64.c. While at it, rename the function to kernel_add_identity_map(), similar to the kernel_ident_mapping_init() function it relies upon. No functional changes. Suggested-by: Borislav Petkov <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/compressed: Use firmware-validated CPUID leaves for SEV-SNP guestsMichael Roth1-0/+46
SEV-SNP guests will be provided the location of special 'secrets' 'CPUID' pages via the Confidential Computing blob. This blob is provided to the boot kernel either through an EFI config table entry, or via a setup_data structure as defined by the Linux Boot Protocol. Locate the Confidential Computing from these sources and, if found, use the provided CPUID page/table address to create a copy that the boot kernel will use when servicing CPUID instructions via a #VC CPUID handler. [ bp: s/cpuid/CPUID/ ] Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/compressed: Add SEV-SNP feature detection/setupMichael Roth1-1/+111
Initial/preliminary detection of SEV-SNP is done via the Confidential Computing blob. Check for it prior to the normal SEV/SME feature initialization, and add some sanity checks to confirm it agrees with SEV-SNP CPUID/MSR bits. Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-07x86/sev: Move MSR-based VMGEXITs for CPUID to helperMichael Roth1-0/+1
This code will also be used later for SEV-SNP-validated CPUID code in some cases, so move it to a common helper. While here, also add a check to terminate in cases where the CPUID function/subfunction is indexed and the subfunction is non-zero, since the GHCB MSR protocol does not support non-zero subfunctions. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-06x86/compressed/acpi: Move EFI kexec handling into common codeMichael Roth2-60/+45
Future patches for SEV-SNP-validated CPUID will also require early parsing of the EFI configuration. Incrementally move the related code into a set of helpers that can be re-used for that purpose. In this instance, the current acpi.c kexec handling is mainly used to get the alternative EFI config table address provided by kexec via a setup_data entry of type SETUP_EFI. If not present, the code then falls back to normal EFI config table address provided by EFI system table. This would need to be done by all call-sites attempting to access the EFI config table, so just have efi_get_conf_table() handle that automatically. Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-06x86/compressed/acpi: Move EFI vendor table lookup to helperMichael Roth3-45/+106
Future patches for SEV-SNP-validated CPUID will also require early parsing of the EFI configuration. Incrementally move the related code into a set of helpers that can be re-used for that purpose. [ bp: Unbreak unnecessarily broken lines. ] Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-06x86/compressed/acpi: Move EFI config table lookup to helperMichael Roth3-17/+60
Future patches for SEV-SNP-validated CPUID will also require early parsing of the EFI configuration. Incrementally move the related code into a set of helpers that can be re-used for that purpose. [ bp: Remove superfluous zeroing of a stack variable. ] Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-06x86/compressed/acpi: Move EFI system table lookup to helperMichael Roth3-14/+42
Future patches for SEV-SNP-validated CPUID will also require early parsing of the EFI configuration. Incrementally move the related code into a set of helpers that can be re-used for that purpose. Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-06x86/compressed/acpi: Move EFI detection to helperMichael Roth4-18/+77
Future patches for SEV-SNP-validated CPUID will also require early parsing of the EFI configuration. Incrementally move the related code into a set of helpers that can be re-used for that purpose. First, carve out the functionality which determines the EFI environment type the machine is booting on. [ bp: Massage commit message. ] Signed-off-by: Michael Roth <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]