aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/boot/compressed/kaslr.c
AgeCommit message (Collapse)AuthorFilesLines
2024-07-02x86/efi: Drop support for fake EFI memory mapsArd Biesheuvel1-34/+9
Between kexec and confidential VM support, handling the EFI memory maps correctly on x86 is already proving to be rather difficult (as opposed to other EFI architectures which manage to never modify the EFI memory map to begin with) EFI fake memory map support is essentially a development hack (for testing new support for the 'special purpose' and 'more reliable' EFI memory attributes) that leaked into production code. The regions marked in this manner are not actually recognized as such by the firmware itself or the EFI stub (and never have), and marking memory as 'more reliable' seems rather futile if the underlying memory is just ordinary RAM. Marking memory as 'special purpose' in this way is also dubious, but may be in use in production code nonetheless. However, the same should be achievable by using the memmap= command line option with the ! operator. EFI fake memmap support is not enabled by any of the major distros (Debian, Fedora, SUSE, Ubuntu) and does not exist on other architectures, so let's drop support for it. Acked-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Dan Williams <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
2023-10-18x86/boot: Rename conflicting 'boot_params' pointer to 'boot_params_ptr'Ard Biesheuvel1-13/+13
The x86 decompressor is built and linked as a separate executable, but it shares components with the kernel proper, which are either #include'd as C files, or linked into the decompresor as a static library (e.g, the EFI stub) Both the kernel itself and the decompressor define a global symbol 'boot_params' to refer to the boot_params struct, but in the former case, it refers to the struct directly, whereas in the decompressor, it refers to a global pointer variable referring to the struct boot_params passed by the bootloader or constructed from scratch. This ambiguity is unfortunate, and makes it impossible to assign this decompressor variable from the x86 EFI stub, given that declaring it as extern results in a clash. So rename the decompressor version (whose scope is limited) to boot_params_ptr. [ mingo: Renamed 'boot_params_p' to 'boot_params_ptr' for clarity ] Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: [email protected]
2023-06-06x86/boot/compressed: Handle unaccepted memoryKirill A. Shutemov1-12/+28
The firmware will pre-accept the memory used to run the stub. But, the stub is responsible for accepting the memory into which it decompresses the main kernel. Accept memory just before decompression starts. The stub is also responsible for choosing a physical address in which to place the decompressed kernel image. The KASLR mechanism will randomize this physical address. Since the accepted memory region is relatively small, KASLR would be quite ineffective if it only used the pre-accepted area (EFI_CONVENTIONAL_MEMORY). Ensure that KASLR randomizes among the entire physical address space by also including EFI_UNACCEPTED_MEMORY. Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Liam Merwick <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-19x86/kaslr: Fix process_mem_region()'s return valueJiapeng Chong1-1/+1
Fix the following coccicheck warning: ./arch/x86/boot/compressed/kaslr.c:670:8-9: WARNING: return of 0/1 in function 'process_mem_region' with return type bool. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-09-29kbuild: build init/built-in.a just onceMasahiro Yamada1-0/+1
Kbuild builds init/built-in.a twice; first during the ordinary directory descending, second from scripts/link-vmlinux.sh. We do this because UTS_VERSION contains the build version and the timestamp. We cannot update it during the normal directory traversal since we do not yet know if we need to update vmlinux. UTS_VERSION is temporarily calculated, but omitted from the update check. Otherwise, vmlinux would be rebuilt every time. When Kbuild results in running link-vmlinux.sh, it increments the version number in the .version file and takes the timestamp at that time to really fix UTS_VERSION. However, updating the same file twice is a footgun. To avoid nasty timestamp issues, all build artifacts that depend on init/built-in.a are atomically generated in link-vmlinux.sh, where some of them do not need rebuilding. To fix this issue, this commit changes as follows: [1] Split UTS_VERSION out to include/generated/utsversion.h from include/generated/compile.h include/generated/utsversion.h is generated just before the vmlinux link. It is generated under include/generated/ because some decompressors (s390, x86) use UTS_VERSION. [2] Split init_uts_ns and linux_banner out to init/version-timestamp.c from init/version.c init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary directory descending, they are compiled with __weak and used to determine if vmlinux needs relinking. Just before the vmlinux link, they are compiled without __weak to embed the real version and timestamp. Signed-off-by: Masahiro Yamada <[email protected]>
2022-04-17x86/boot: Add an efi.h header for the decompressorBorislav Petkov1-2/+1
Copy the needed symbols only and remove the kernel proper includes. No functional changes. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-27x86/boot/compressed: Avoid duplicate malloc() implementationsKees Cook1-4/+0
The early malloc() and free() implementation in include/linux/decompress/mm.h (which is also included by the static decompressors) is static. This is fine when the only thing interested in using malloc() is the decompression code, but the x86 early boot environment may use malloc() in a couple places, leading to a potential collision when the static copies of the available memory region ("malloc_ptr") gets reset to the global "free_mem_ptr" value. As it happened, the existing usage pattern was accidentally safe because each user did 1 malloc() and 1 free() before returning and were not nested: extract_kernel() (misc.c) choose_random_location() (kaslr.c) mem_avoid_init() handle_mem_options() malloc() ... free() ... parse_elf() (misc.c) malloc() ... free() Once the future FGKASLR series is added, however, it will insert additional malloc() calls local to fgkaslr.c in the middle of parse_elf()'s malloc()/free() pair: parse_elf() (misc.c) malloc() if (...) { layout_randomized_image(output, &ehdr, phdrs); malloc() <- boom ... else layout_image(output, &ehdr, phdrs); free() To avoid collisions, there must be a single implementation of malloc(). Adjust include/linux/decompress/mm.h so that visibility can be controlled, provide prototypes in misc.h, and implement the functions in misc.c. This also results in a small size savings: $ size vmlinux.before vmlinux.after text data bss dec hex filename 8842314 468 178320 9021102 89a6ae vmlinux.before 8842240 468 178320 9021028 89a664 vmlinux.after Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-08-24x86/kaslr: Have process_mem_region() return a booleanJing Yangyang1-1/+1
Fix the following coccicheck warning: ./arch/x86/boot/compressed/kaslr.c:671:10-11:WARNING:return of 0/1 in function 'process_mem_region' with return type bool Generated by: scripts/coccinelle/misc/boolreturn.cocci Reported-by: Zeal Robot <[email protected]> Signed-off-by: Jing Yangyang <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-03-19x86/kaslr: Return boolean values from a function returning boolJiapeng Chong1-2/+2
Fix the following coccicheck warnings: ./arch/x86/boot/compressed/kaslr.c:642:10-11: WARNING: return of 0/1 in function 'process_mem_region' with return type bool. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/1615283963-67277-1-git-send-email-jiapeng.chong@linux.alibaba.com
2020-10-19x86/boot/64: Initialize 5-level paging variables earlierArvind Sankar1-8/+0
Commit ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table") started using a new set of pagetables even without KASLR. After that commit, initialize_identity_maps() is called before the 5-level paging variables are setup in choose_random_location(), which will not work if 5-level paging is actually enabled. Fix this by moving the initialization of __pgtable_l5_enabled, pgdir_shift and ptrs_per_p4d into cleanup_trampoline(), which is called immediately after the finalization of whether the kernel is executing with 4- or 5-level paging. This will be earlier than anything that might require those variables, and keeps the 4- vs 5-level paging code all in one place. Fixes: ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table") Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Joerg Roedel <[email protected]> Tested-by: Joerg Roedel <[email protected]> Tested-by: Kirill A. Shutemov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-07x86/boot/compressed/64: Don't pre-map memory in KASLR codeJoerg Roedel1-23/+1
With the page-fault handler in place, he identity mapping can be built on-demand. So remove the code which manually creates the mappings and unexport/remove the functions used for it. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-07x86/boot/compressed/64: Always switch to own page tableJoerg Roedel1-3/+0
When booted through startup_64(), the kernel keeps running on the EFI page table until the KASLR code sets up its own page table. Without KASLR, the pre-decompression boot code never switches off the EFI page table. Change that by unconditionally switching to a kernel-controlled page table after relocation. This makes sure the kernel can make changes to the mapping when necessary, for example map pages unencrypted in SEV and SEV-ES guests. Also, remove the debug_putstr() calls in initialize_identity_maps() because the function now runs before console_init() is called. [ bp: Massage commit message. ] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-07x86/boot/compressed/64: Rename kaslr_64.c to ident_map_64.cJoerg Roedel1-9/+0
The file contains only code related to identity-mapped page tables. Rename the file and compile it always in. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-07Merge 'x86/kaslr' to pick up dependent bitsBorislav Petkov1-133/+105
Signed-off-by: Borislav Petkov <[email protected]>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva1-1/+1
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <[email protected]>
2020-08-06x86/kaslr: Replace strlen() with strnlen()Arvind Sankar1-2/+6
strnlen is safer in case the command line is not NUL-terminated. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86: Add support for ZSTD compressed kernelNick Terrell1-7/+0
- Add support for zstd compressed kernel - Define __DISABLE_EXPORTS in Makefile - Remove __DISABLE_EXPORTS definition from kaslr.c - Bump the heap size for zstd. - Update the documentation. Integrates the ZSTD decompression code to the x86 pre-boot code. Zstandard requires slightly more memory during the kernel decompression on x86 (192 KB vs 64 KB), and the memory usage is independent of the window size. __DISABLE_EXPORTS is now defined in the Makefile, which covers both the existing use in kaslr.c, and the use needed by the zstd decompressor in misc.c. This patch has been boot tested with both a zstd and gzip compressed kernel on i386 and x86_64 using buildroot and QEMU. Additionally, this has been tested in production on x86_64 devices. We saw a 2 second boot time reduction by switching kernel compression from xz to zstd. Signed-off-by: Nick Terrell <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Tested-by: Sedat Dilek <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Add a check that the random address is in rangeArvind Sankar1-1/+11
Check in find_random_phys_addr() that the chosen address is inside the range that was required. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Make local variables 64-bitArvind Sankar1-6/+7
Change the type of local variables/fields that store mem_vector addresses to u64 to make it less likely that 32-bit overflow will cause issues on 32-bit. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Replace 'unsigned long long' with 'u64'Arvind Sankar1-7/+6
No functional change. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Make minimum/image_size 'unsigned long'Arvind Sankar1-2/+2
Change type of minimum/image_size arguments in process_mem_region to 'unsigned long'. These actually can never be above 4G (even on x86_64), and they're 'unsigned long' in every other function except this one. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Small cleanup of find_random_phys_addr()Arvind Sankar1-3/+2
Just a trivial rearrangement to do all the processing together, and only have one call to slots_fetch_random() in the source. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Drop unnecessary alignment in find_random_virt_addr()Arvind Sankar1-5/+1
Drop unnecessary alignment of image_size to CONFIG_PHYSICAL_ALIGN in find_random_virt_addr, it cannot change the result: the largest valid slot is the largest n that satisfies minimum + n * CONFIG_PHYSICAL_ALIGN + image_size <= KERNEL_IMAGE_SIZE (since minimum is already aligned) and so n is equal to (KERNEL_IMAGE_SIZE - minimum - image_size) / CONFIG_PHYSICAL_ALIGN even if image_size is not aligned to CONFIG_PHYSICAL_ALIGN. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Drop redundant check in store_slot_info()Arvind Sankar1-6/+3
Drop unnecessary check that number of slots is not zero in store_slot_info, it's guaranteed to be at least 1 by the calculation. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Make the type of number of slots/slot areas consistentArvind Sankar1-5/+3
The number of slots can be 'unsigned int', since on 64-bit, the maximum amount of memory is 2^52, the minimum alignment is 2^21, so the slot number cannot be greater than 2^31. But in case future processors have more than 52 physical address bits, make it 'unsigned long'. The slot areas are limited by MAX_SLOT_AREA, currently 100. It is indexed by an int, but the number of areas is stored as 'unsigned long'. Change both to 'unsigned int' for consistency. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Drop test for command-line parameters before parsingArvind Sankar1-4/+0
This check doesn't save anything. In the case when none of the parameters are present, each strstr will scan args twice (once to find the length and then for searching), six scans in total. Just going ahead and parsing the arguments only requires three scans: strlen, memcpy, and parsing. This will be the first malloc, so free will actually free up the memory, so the check doesn't save heap space either. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Simplify process_gb_huge_pages()Arvind Sankar1-26/+21
Replace the loop to determine the number of 1Gb pages with arithmetic. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Short-circuit gb_huge_pages on x86-32Arvind Sankar1-2/+2
32-bit does not have GB pages, so don't bother checking for them. Using the IS_ENABLED() macro allows the compiler to completely remove the gb_huge_pages code. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Fix off-by-one error in process_gb_huge_pages()Arvind Sankar1-1/+1
If the remaining size of the region is exactly 1Gb, there is still one hugepage that can be reserved. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Drop some redundant checks from __process_mem_region()Arvind Sankar1-21/+6
Clip the start and end of the region to minimum and mem_limit prior to the loop. region.start can only increase during the loop, so raising it to minimum before the loop is enough. A region that becomes empty due to this will get checked in the first iteration of the loop. Drop the check for overlap extending beyond the end of the region. This will get checked in the next loop iteration anyway. Rename end to region_end for symmetry with region.start. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Drop redundant variable in __process_mem_region()Arvind Sankar1-5/+2
region.size can be trimmed to store the portion of the region before the overlap, instead of a separate mem_vector variable. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Eliminate 'start_orig' local variable from __process_mem_region()Arvind Sankar1-6/+2
Set the region.size within the loop, which removes the need for start_orig. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Drop redundant cur_entry from __process_mem_region()Arvind Sankar1-6/+3
cur_entry is only used as cur_entry.start + cur_entry.size, which is always equal to end. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Fix off-by-one error in __process_mem_region()Arvind Sankar1-1/+1
In case of an overlap, the beginning of the region should be used even if it is exactly image_size, not just strictly larger. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Initialize mem_limit to the real maximum addressArvind Sankar1-19/+22
On 64-bit, the kernel must be placed below MAXMEM (64TiB with 4-level paging or 4PiB with 5-level paging). This is currently not enforced by KASLR, which thus implicitly relies on physical memory being limited to less than 64TiB. On 32-bit, the limit is KERNEL_IMAGE_SIZE (512MiB). This is enforced by special checks in __process_mem_region(). Initialize mem_limit to the maximum (depending on architecture), instead of ULLONG_MAX, and make sure the command-line arguments can only decrease it. This makes the enforcement explicit on 64-bit, and eliminates the 32-bit specific checks to keep the kernel below 512M. Check upfront to make sure the minimum address is below the limit before doing any work. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Fix process_efi_entries commentArvind Sankar1-2/+2
Since commit: 0982adc74673 ("x86/boot/KASLR: Work around firmware bugs by excluding EFI_BOOT_SERVICES_* and EFI_LOADER_* from KASLR's choice") process_efi_entries() will return true if we have an EFI memmap, not just if it contained EFI_MEMORY_MORE_RELIABLE regions. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Remove bogus warning and unnecessary gotoArvind Sankar1-6/+3
Drop the warning on seeing "--" in handle_mem_options(). This will trigger whenever one of the memory options is present in the command line together with "--", but there's no problem if that is the case. Replace goto with break. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-31x86/kaslr: Make command line handling saferArvind Sankar1-12/+14
Handle the possibility that the command line is NULL. Replace open-coded strlen with a function call. Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2019-11-26Merge tag 'acpi-5.5-rc1' of ↵Linus Torvalds1-7/+39
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to upstream revision 20191018, add support for EFI specific purpose memory, update the ACPI EC driver to make it work on systems with hardware-reduced ACPI, improve ACPI-based device enumeration for some platforms, rework the lid blacklist handling in the button driver and add more lid quirks to it, unify ACPI _HID/_UID matching, fix assorted issues and clean up the code and documentation. Specifics: - Update the ACPICA code in the kernel to upstream revision 20191018 including: * Fixes for Clang warnings (Bob Moore) * Fix for possible overflow in get_tick_count() (Bob Moore) * Introduction of acpi_unload_table() (Bob Moore) * Debugger and utilities updates (Erik Schmauss) * Fix for unloading tables loaded via configfs (Nikolaus Voss) - Add support for EFI specific purpose memory to optionally allow either application-exclusive or core-kernel-mm managed access to differentiated memory (Dan Williams) - Fix and clean up processing of the HMAT table (Brice Goglin, Qian Cai, Tao Xu) - Update the ACPI EC driver to make it work on systems with hardware-reduced ACPI (Daniel Drake) - Always build in support for the Generic Event Device (GED) to allow one kernel binary to work both on systems with full hardware ACPI and hardware-reduced ACPI (Arjan van de Ven) - Fix the table unload mechanism to unregister platform devices created when the given table was loaded (Andy Shevchenko) - Rework the lid blacklist handling in the button driver and add more lid quirks to it (Hans de Goede) - Improve ACPI-based device enumeration for some platforms based on Intel BayTrail SoCs (Hans de Goede) - Add an OpRegion driver for the Cherry Trail Crystal Cove PMIC and prevent handlers from being registered for unhandled PMIC OpRegions (Hans de Goede) - Unify ACPI _HID/_UID matching (Andy Shevchenko) - Clean up documentation and comments (Cao jin, James Pack, Kacper Piwiński)" * tag 'acpi-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits) ACPI: OSI: Shoot duplicate word ACPI: HMAT: use %u instead of %d to print u32 values ACPI: NUMA: HMAT: fix a section mismatch ACPI: HMAT: don't mix pxm and nid when setting memory target processor_pxm ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device ACPI: NUMA: HMAT: Register HMAT at device_initcall level device-dax: Add a driver for "hmem" devices dax: Fix alloc_dax_region() compile warning lib: Uplevel the pmem "region" ida to a global allocator x86/efi: Add efi_fake_mem support for EFI_MEMORY_SP arm/efi: EFI soft reservation to memblock x86/efi: EFI soft reservation to E820 enumeration efi: Common enable/disable infrastructure for EFI soft reservation x86/efi: Push EFI_MEMMAP check into leaf routines efi: Enumerate EFI_MEMORY_SP ACPI: NUMA: Establish a new drivers/acpi/numa/ directory ACPICA: Update version to 20191018 ACPICA: debugger: remove leading whitespaces when converting a string to a buffer ACPICA: acpiexec: initialize all simple types and field units from user input ACPICA: debugger: add field unit support for acpi_db_get_next_token ...
2019-11-12x86/boot: Introduce setup_indirectDaniel Kiper1-0/+12
The setup_data is a bit awkward to use for extremely large data objects, both because the setup_data header has to be adjacent to the data object and because it has a 32-bit length field. However, it is important that intermediate stages of the boot process have a way to identify which chunks of memory are occupied by kernel data. Thus introduce an uniform way to specify such indirect data as setup_indirect struct and SETUP_INDIRECT type. And finally bump setup_header version in arch/x86/boot/header.S. Suggested-by: H. Peter Anvin (Intel) <[email protected]> Signed-off-by: Daniel Kiper <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Ross Philipson <[email protected]> Reviewed-by: H. Peter Anvin (Intel) <[email protected]> Acked-by: Konrad Rzeszutek Wilk <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: Boris Ostrovsky <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Ingo Molnar <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Juergen Gross <[email protected]> Cc: [email protected] Cc: [email protected] Cc: linux-efi <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Thomas Gleixner <[email protected]> Cc: x86-ml <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-11-07x86/efi: Add efi_fake_mem support for EFI_MEMORY_SPDan Williams1-7/+35
Given that EFI_MEMORY_SP is platform BIOS policy decision for marking memory ranges as "reserved for a specific purpose" there will inevitably be scenarios where the BIOS omits the attribute in situations where it is desired. Unlike other attributes if the OS wants to reserve this memory from the kernel the reservation needs to happen early in init. So early, in fact, that it needs to happen before e820__memblock_setup() which is a pre-requisite for efi_fake_memmap() that wants to allocate memory for the updated table. Introduce an x86 specific efi_fake_memmap_early() that can search for attempts to set EFI_MEMORY_SP via efi_fake_mem and update the e820 table accordingly. The KASLR code that scans the command line looking for user-directed memory reservations also needs to be updated to consider "efi_fake_mem=nn@ss:0x40000" requests. Acked-by: Ard Biesheuvel <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Signed-off-by: Dan Williams <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-11-07x86/efi: EFI soft reservation to E820 enumerationDan Williams1-0/+4
UEFI 2.8 defines an EFI_MEMORY_SP attribute bit to augment the interpretation of the EFI Memory Types as "reserved for a specific purpose". The proposed Linux behavior for specific purpose memory is that it is reserved for direct-access (device-dax) by default and not available for any kernel usage, not even as an OOM fallback. Later, through udev scripts or another init mechanism, these device-dax claimed ranges can be reconfigured and hot-added to the available System-RAM with a unique node identifier. This device-dax management scheme implements "soft" in the "soft reserved" designation by allowing some or all of the reservation to be recovered as typical memory. This policy can be disabled at compile-time with CONFIG_EFI_SOFT_RESERVE=n, or runtime with efi=nosoftreserve. This patch introduces 2 new concepts at once given the entanglement between early boot enumeration relative to memory that can optionally be reserved from the kernel page allocator by default. The new concepts are: - E820_TYPE_SOFT_RESERVED: Upon detecting the EFI_MEMORY_SP attribute on EFI_CONVENTIONAL memory, update the E820 map with this new type. Only perform this classification if the CONFIG_EFI_SOFT_RESERVE=y policy is enabled, otherwise treat it as typical ram. - IORES_DESC_SOFT_RESERVED: Add a new I/O resource descriptor for a device driver to search iomem resources for application specific memory. Teach the iomem code to identify such ranges as "Soft Reserved". Note that the comment for do_add_efi_memmap() needed refreshing since it seemed to imply that the efi map might overflow the e820 table, but that is not an issue as of commit 7b6e4ba3cb1f "x86/boot/e820: Clean up the E820_X_MAX definition" that removed the 128 entry limit for e820__range_add(). A follow-on change integrates parsing of the ACPI HMAT to identify the node and sub-range boundaries of EFI_MEMORY_SP designated memory. For now, just identify and reserve memory of this type. Acked-by: Ard Biesheuvel <[email protected]> Reported-by: kbuild test robot <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Signed-off-by: Dan Williams <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-03-06x86/boot/KASLR: Always return a value from process_mem_regionLouis Taylor1-1/+1
When compiling with -Wreturn-type, clang warns: arch/x86/boot/compressed/kaslr.c:704:1: warning: control may reach end of non-void function [-Wreturn-type] This function's return statement should have been placed outside the ifdeffed region. Move it there. Fixes: 690eaa532057 ("x86/boot/KASLR: Limit KASLR to extract the kernel in immovable memory only") Signed-off-by: Louis Taylor <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-02-06x86/boot: Fix randconfig build error due to MEMORY_HOTREMOVEBorislav Petkov1-1/+1
When building randconfigs, one of the failures is: ld: arch/x86/boot/compressed/kaslr.o: in function `choose_random_location': kaslr.c:(.text+0xbf7): undefined reference to `count_immovable_mem_regions' ld: kaslr.c:(.text+0xcbe): undefined reference to `immovable_mem' make[2]: *** [arch/x86/boot/compressed/vmlinux] Error 1 because CONFIG_ACPI is not enabled in this particular .config but CONFIG_MEMORY_HOTREMOVE is and count_immovable_mem_regions() is unresolvable because it is defined in compressed/acpi.c which is the compilation unit that depends on CONFIG_ACPI. Add CONFIG_ACPI to the explicit dependencies for MEMORY_HOTREMOVE. Signed-off-by: Borislav Petkov <[email protected]> Cc: Chao Fan <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2019-02-01x86/boot/KASLR: Limit KASLR to extract the kernel in immovable memory onlyChao Fan1-11/+60
KASLR may randomly choose a range which is located in movable memory regions. As a result, this will break memory hotplug and make the movable memory chosen by KASLR immovable. Therefore, limit KASLR to choose memory regions in the immovable range after consulting the SRAT table. [ bp: - Rewrite commit message. - Trim comments. ] Signed-off-by: Chao Fan <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Baoquan He <[email protected]> Cc: [email protected] Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: Ingo Molnar <[email protected]> Cc: Juergen Gross <[email protected]> Cc: [email protected] Cc: Kees Cook <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: [email protected] Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-02-01x86/boot: Parse SRAT table and count immovable memory regionsChao Fan1-4/+0
Parse SRAT for the immovable memory regions and use that information to control which offset KASLR selects so that it doesn't overlap with any movable region. [ bp: - Move struct mem_vector where it is visible so that it builds. - Correct comments. - Rewrite commit message. ] Signed-off-by: Chao Fan <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Baoquan He <[email protected]> Cc: <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juergen Gross <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-09-10x86/boot/KASLR: Remove return value from handle_mem_options()Chao Fan1-10/+8
It's not used by its sole user, so remove this unused functionality. Also remove a stray unused variable that GCC didn't warn about for some reason. Suggested-by: Dou Liyang <[email protected]> Signed-off-by: Chao Fan <[email protected]> Cc: Baoquan He <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-08-22module: allow symbol exports to be disabledArd Biesheuvel1-4/+1
To allow existing C code to be incorporated into the decompressor or the UEFI stub, introduce a CPP macro that turns all EXPORT_SYMBOL_xxx declarations into nops, and #define it in places where such exports are undesirable. Note that this gets rid of a rather dodgy redefine of linux/export.h's header guard. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ard Biesheuvel <[email protected]> Acked-by: Nicolas Pitre <[email protected]> Acked-by: Michael Ellerman <[email protected]> Reviewed-by: Will Deacon <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: James Morris <[email protected]> Cc: James Morris <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kees Cook <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Russell King <[email protected]> Cc: "Serge E. Hallyn" <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Garnier <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-07-30x86/boot/KASLR: Make local variable mem_limit staticzhong jiang1-1/+1
Fix the following sparse warning: arch/x86/boot/compressed/kaslr.c:102:20: warning: symbol 'mem_limit' was not declared. Should it be static? Signed-off-by: zhong jiang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-07-03x86/boot/KASLR: Skip specified number of 1GB huge pages when doing physical ↵Baoquan He1-5/+8
randomization (KASLR) When KASLR is enabled then 1GB huge pages allocations might regress sporadically. To reproduce on a KVM guest with 4GB RAM: - add the following options to the kernel command-line: 'default_hugepagesz=1G hugepagesz=1G hugepages=1' - boot the guest and check number of 1GB pages reserved: # grep HugePages_Total /proc/meminfo - sporadically, every couple of bootups the output of this command shows that when booting with "nokaslr" HugePages_Total is always 1, while booting without "nokaslr" sometimes HugePages_Total is set as 0 (that is, reserving the 1GB page failed). Note that you may need to boot a few times to trigger the issue, because it's somewhat non-deterministic. The root cause is that kernel may be put into the only good 1GB huge page in the [0x40000000, 0x7fffffff] physical range randomly. Below is the dmesg output snippet from the KVM guest. We can see that only [0x40000000, 0x7fffffff] region is good 1GB huge page, [0x100000000, 0x13fffffff] will be touched by the memblock top-down allocation: [...] e820: BIOS-provided physical RAM map: [...] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable [...] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved [...] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved [...] BIOS-e820: [mem 0x0000000000100000-0x00000000bffdffff] usable [...] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved [...] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved [...] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved [...] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable Besides, on bare-metal machines with larger memory, one less 1GB huge page might be available with KASLR enabled. That too is because the kernel image might be randomized into those "good" 1GB huge pages. To fix this, firstly parse the kernel command-line to get how many 1GB huge pages are specified. Then try to skip the specified number of 1GB huge pages when decide which memory region kernel can be randomized into. Also change the name of handle_mem_memmap() as handle_mem_options() since it handles not only 'mem=' and 'memmap=', but also 'hugepagesxxx' now. Signed-off-by: Baoquan He <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Rewrote the changelog, fixed style problems in the code. ] Signed-off-by: Ingo Molnar <[email protected]>