path: root/arch/x86/boot
Age | Commit message | Author | Files | Lines (-/+)

2020-09-14 | x86/boot/compressed: Disable relocation relaxation | Arvind Sankar | 1 | -0/+2

The x86-64 psABI [0] specifies special relocation types (R_X86_64_[REX_]GOTPCRELX) for indirection through the Global Offset Table, semantically equivalent to R_X86_64_GOTPCREL, which the linker can take advantage of for optimization (relaxation) at link time. This is supported by LLD and binutils versions 2.26 onwards.

The compressed kernel is position-independent code; however, when using LLD or binutils versions before 2.27, it must be linked without the -pie option. In this case, the linker may optimize certain instructions into a non-position-independent form, by converting foo@GOTPCREL(%rip) to $foo.

This potential issue has been present with LLD and binutils-2.26 for a long time, but it has never manifested itself before now:

- LLD and binutils-2.26 only relax

	movq	foo@GOTPCREL(%rip), %reg

  to

	leaq	foo(%rip), %reg

  which is still position-independent, rather than

	mov	$foo, %reg

  which is permitted by the psABI when -pie is not enabled.

- GCC happens to only generate GOTPCREL relocations on mov instructions.

- Clang does generate GOTPCREL relocations on non-mov instructions, but when building the compressed kernel, it uses its integrated assembler (due to the redefinition of KBUILD_CFLAGS dropping -no-integrated-as), which has so far defaulted to not generating the GOTPCRELX relocations.

Nick Desaulniers reports [1,2]:

  "A recent change [3] to a default value of configuration variable (ENABLE_X86_RELAX_RELOCATIONS OFF -> ON) in LLVM now causes Clang's integrated assembler to emit R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX relocations. LLD will relax instructions with these relocations based on whether the image is being linked as position independent or not. When not, then LLD will relax these instructions to use absolute addressing mode (R_RELAX_GOT_PC_NOPIC). This causes kernels built with Clang and linked with LLD to fail to boot."

Patch series [4] is a solution to allow the compressed kernel to be linked with -pie unconditionally, but even if merged it is unlikely to be backported. As a simple solution that can be applied to stable as well, prevent the assembler from generating the relaxed relocation types using the -mrelax-relocations=no option. For ease of backporting, do this unconditionally.

[0] https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/linker-optimization.tex#L65
[1] https://lore.kernel.org/lkml/20200807194100.3570838-1-ndesaulniers@google.com/
[2] https://github.com/ClangBuiltLinux/linux/issues/1121
[3] https://reviews.llvm.org/rGc41a18cf61790fc898dcda1055c3efbf442c14c0
[4] https://lore.kernel.org/lkml/20200731202738.2577854-1-nivedita@alum.mit.edu/

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20200812004308.1448603-1-nivedita@alum.mit.edu
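
To illustrate what relaxation does, here is a hedged sketch (not code from the patch; kernel_base is a made-up symbol):

  /* Illustration only: a global reached through the GOT in PIC code. */
  extern unsigned long kernel_base;     /* hypothetical symbol */

  unsigned long get_base(void)
  {
          /*
           * With -fPIE the compiler can emit a GOT load tagged
           * R_X86_64_REX_GOTPCRELX:
           *     movq  kernel_base@GOTPCREL(%rip), %rax
           * On a non-PIE link, a relaxing linker may rewrite that to
           * the absolute form
           *     movq  $kernel_base, %rax
           * which breaks code that runs before it sits at its
           * link-time address. Assembling with -mrelax-relocations=no
           * keeps the plain R_X86_64_GOTPCREL type, which the linker
           * never rewrites this way.
           */
          return kernel_base;
  }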

2020-09-10 | x86/sev-es: Check required CPU features for SEV-ES | Martin Radev | 3 | -6/+6

Make sure the machine supports RDRAND; otherwise there is no trusted source of randomness in the system. To also check this in the pre-decompression stage, make has_cpuflag() not depend on CONFIG_RANDOMIZE_BASE anymore.

Signed-off-by: Martin Radev <martin.b.radev@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-73-joro@8bytes.org
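
As a rough sketch of such a feature check, using the compiler's cpuid.h wrapper rather than the boot stub's own helpers (which this log does not show); RDRAND is advertised in CPUID leaf 1, ECX bit 30:

  #include <stdbool.h>
  #include <cpuid.h>                  /* __get_cpuid(), GCC/Clang */

  #define RDRAND_BIT (1u << 30)       /* CPUID.01H:ECX[30] */

  static bool cpu_has_rdrand(void)
  {
          unsigned int eax, ebx, ecx, edx;

          /* Leaf 1 reports the basic feature flags in ECX/EDX */
          if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
                  return false;

          return (ecx & RDRAND_BIT) != 0;
  }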

2020-09-10 | x86/efi: Add GHCB mappings when SEV-ES is active | Tom Lendacky | 1 | -0/+1

Calling down to EFI runtime services can result in the firmware performing VMGEXIT calls. The firmware is likely to use the GHCB of the OS (e.g., for setting EFI variables), so each GHCB in the system needs to be identity-mapped in the EFI page tables, as unencrypted, to avoid page faults.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Moved GHCB mapping loop to sev-es.c ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-72-joro@8bytes.org

2020-09-09 | x86/sev-es: Handle RDTSC(P) Events | Tom Lendacky | 1 | -0/+4

Implement a handler for #VC exceptions caused by RDTSC and RDTSCP instructions. Also make it available in the pre-decompression stage because the KASLR code uses RDTSC/RDTSCP to gather entropy and some hypervisors intercept these instructions.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: - Adapt to #VC handling infrastructure
                   - Make it available early ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-55-joro@8bytes.org
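
The handler follows the common GHCB pattern: hand the exit code to the hypervisor, then copy back the result registers. A simplified sketch (helper names as in the SEV-ES series; details trimmed):

  static enum es_result vc_handle_rdtsc(struct ghcb *ghcb,
                                        struct es_em_ctxt *ctxt,
                                        unsigned long exit_code)
  {
          bool rdtscp = (exit_code == SVM_EXIT_RDTSCP);
          enum es_result ret;

          /* Ask the hypervisor to execute the intercepted instruction */
          ret = sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, 0, 0);
          if (ret != ES_OK)
                  return ret;

          /* The hypervisor must have filled in all result registers */
          if (!(ghcb_rax_is_valid(ghcb) && ghcb_rdx_is_valid(ghcb) &&
                (!rdtscp || ghcb_rcx_is_valid(ghcb))))
                  return ES_VMM_ERROR;

          /* Copy the TSC value (and TSC_AUX for RDTSCP) to the guest */
          ctxt->regs->ax = ghcb->save.rax;
          ctxt->regs->dx = ghcb->save.rdx;
          if (rdtscp)
                  ctxt->regs->cx = ghcb->save.rcx;

          return ES_OK;
  }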

2020-09-07 | x86/sev-es: Add CPUID handling to #VC handler | Tom Lendacky | 1 | -0/+4

Handle #VC exceptions caused by CPUID instructions. These happen in early boot code when the KASLR code checks for RDTSC.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
[ jroedel@suse.de: Adapt to #VC handling framework ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-28-joro@8bytes.org

2020-09-07 | x86/sev-es: Add support for handling IOIO exceptions | Tom Lendacky | 1 | -0/+32

Add support for decoding and handling #VC exceptions for IOIO events.

[ jroedel@suse.de: Adapted code to #VC handling framework ]
Co-developed-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-26-joro@8bytes.org
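
The exit-information word for SVM_EXIT_IOIO encodes everything needed to emulate the port access. A sketch of the decode step (bit layout per the AMD APM; constant and struct names are illustrative):

  #include <stdbool.h>

  #define IOIO_TYPE_IN  (1UL << 0)    /* 1 = IN/INS, 0 = OUT/OUTS */
  #define IOIO_STR      (1UL << 2)    /* string instruction (INS/OUTS) */
  #define IOIO_REP      (1UL << 3)    /* REP prefix present */
  #define IOIO_DATA_8   (1UL << 4)
  #define IOIO_DATA_16  (1UL << 5)
  #define IOIO_DATA_32  (1UL << 6)

  struct ioio_info {
          unsigned int port;
          unsigned int size;          /* bytes per transfer */
          bool in, string, rep;
  };

  static void ioio_decode(unsigned long exit_info_1, struct ioio_info *io)
  {
          io->port   = (exit_info_1 >> 16) & 0xffff;
          io->size   = (exit_info_1 & IOIO_DATA_8)  ? 1 :
                       (exit_info_1 & IOIO_DATA_16) ? 2 : 4;
          io->in     = exit_info_1 & IOIO_TYPE_IN;
          io->string = exit_info_1 & IOIO_STR;
          io->rep    = exit_info_1 & IOIO_REP;
  }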

2020-09-07 | x86/boot/compressed/64: Unmap GHCB page before booting the kernel | Joerg Roedel | 3 | -2/+35

Force a page-fault on any further accesses to the GHCB page when they shouldn't happen anymore. This will catch any bugs where a #VC exception is raised even though none is expected anymore.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-25-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Setup a GHCB-based VC Exception handler | Joerg Roedel | 6 | -1/+136

Install an exception handler for #VC exceptions that uses a GHCB. Also add infrastructure for handling different exit codes by decoding the instruction that caused the exception, along with error handling.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-24-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Add set_page_en/decrypted() helpers | Joerg Roedel | 2 | -0/+135

The functions are needed to map the GHCB for SEV-ES guests. The GHCB is used for communication with the hypervisor, so its contents must not be encrypted. Once the GHCB is no longer needed, it must be mapped encrypted again so that the running kernel image can safely re-use the memory.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-23-joro@8bytes.org
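
Conceptually, the helpers walk the identity page table, toggle the encryption bit in the PTE, and flush the stale TLB entry. A minimal sketch, assuming a hypothetical lookup_pte() walker and the kernel's sme_me_mask C-bit mask:

  static int set_page_decrypted_sketch(unsigned long vaddr)
  {
          /* hypothetical walker over the identity-mapped page table */
          pte_t *pte = lookup_pte(vaddr);

          if (!pte)
                  return -EFAULT;

          /* Clear the C-bit: the page becomes shared with the hypervisor */
          set_pte(pte, __pte(pte_val(*pte) & ~sme_me_mask));

          /* The old, encrypted translation may still be cached in the TLB */
          asm volatile("invlpg (%0)" :: "r" (vaddr) : "memory");

          return 0;
  }

set_page_encrypted() would be the mirror image, OR-ing the mask back in.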

2020-09-07 | x86/boot/compressed/64: Check return value of kernel_ident_mapping_init() | Joerg Roedel | 1 | -2/+5

The function can fail to create an identity mapping; check for that and bail out if it happens.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-22-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Call set_sev_encryption_mask() earlier | Joerg Roedel | 2 | -4/+8

Call set_sev_encryption_mask() while still running on the stage 1 #VC handler, because the stage 2 handler needs the kernel's own page tables to be set up, and setting those up requires set_sev_encryption_mask() to have been called first.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-21-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Add stage1 #VC handler | Joerg Roedel | 5 | -0/+55

Add the first handler for #VC exceptions. At stage 1 there is no GHCB yet because the kernel might still be running on the EFI page table. The stage 1 handler is limited to the MSR-based protocol to talk to the hypervisor and can only support CPUID exit-codes, but that is enough to get to stage 2.

[ bp: Zap superfluous newlines after rd/wrmsr instruction mnemonics. ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-20-joro@8bytes.org
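
With no GHCB page mapped yet, the stage 1 handler exchanges one CPUID register at a time through the GHCB MSR and VMGEXIT. A sketch of that protocol (encoding per the GHCB specification; helper names are illustrative):

  #define MSR_SEV_ES_GHCB   0xc0010130
  #define GHCB_CPUID_REQ    0x004ULL
  #define GHCB_CPUID_RESP   0x005ULL

  static inline void ghcb_msr_write(u64 val)
  {
          asm volatile("wrmsr" :: "c" (MSR_SEV_ES_GHCB),
                       "a" ((u32)val), "d" ((u32)(val >> 32)));
  }

  static inline u64 ghcb_msr_read(void)
  {
          u32 lo, hi;

          asm volatile("rdmsr" : "=a" (lo), "=d" (hi)
                               : "c" (MSR_SEV_ES_GHCB));
          return lo | ((u64)hi << 32);
  }

  /* Fetch one register (0=EAX .. 3=EDX) of one CPUID leaf from the HV */
  static int ghcb_msr_cpuid(u32 fn, u32 reg_idx, u32 *result)
  {
          u64 val = GHCB_CPUID_REQ | ((u64)reg_idx << 30) | ((u64)fn << 32);

          ghcb_msr_write(val);
          asm volatile("rep; vmmcall");   /* VMGEXIT */
          val = ghcb_msr_read();

          /* Low 12 bits carry the response code, high 32 the value */
          if ((val & 0xfffULL) != GHCB_CPUID_RESP)
                  return -1;

          *result = (u32)(val >> 32);
          return 0;
  }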

2020-09-07 | x86/boot/compressed/64: Change add_identity_map() to take start and end | Joerg Roedel | 1 | -10/+5

Changing the function to take start and end as parameters instead of start and size simplifies the callers, which don't need to calculate the size if they already have start and end.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-19-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Don't pre-map memory in KASLR code | Joerg Roedel | 3 | -37/+3

With the page-fault handler in place, the identity mapping can be built on-demand. So remove the code which manually creates the mappings and unexport/remove the functions used for it.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-18-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Always switch to own page table | Joerg Roedel | 3 | -25/+32

When booted through startup_64(), the kernel keeps running on the EFI page table until the KASLR code sets up its own page table. Without KASLR, the pre-decompression boot code never switches off the EFI page table. Change that by unconditionally switching to a kernel-controlled page table after relocation.

This makes sure the kernel can make changes to the mapping when necessary, for example map pages unencrypted in SEV and SEV-ES guests.

Also, remove the debug_putstr() calls in initialize_identity_maps() because the function now runs before console_init() is called.

[ bp: Massage commit message. ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-17-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Add page-fault handler | Joerg Roedel | 4 | -0/+49

Install a page-fault handler to add an identity mapping to addresses not yet mapped. Also check that the error code is sane.

This makes non-SEV-ES machines use the exception handling infrastructure in the pre-decompression boot code too, making it less likely to break in the future.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-16-joro@8bytes.org
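
The handler's shape, as a sketch (the real do_boot_page_fault() in later patches also distinguishes GHCB faults for SEV-ES; the error-handling helper here is hypothetical):

  void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
  {
          unsigned long address = native_read_cr2();
          unsigned long end;

          /* Map a whole 2M-aligned region around the faulting address */
          address &= PMD_MASK;
          end      = address + PMD_SIZE;

          /*
           * Sanity-check the error code: a fault on a present page, a
           * user-mode fault, or set reserved bits mean corruption, not
           * a missing mapping - there is nothing sensible to map.
           */
          if (error_code & (X86_PF_PROT | X86_PF_USER | X86_PF_RSVD))
                  pf_error(error_code, address, regs);  /* does not return */

          /* Build the identity mapping on demand, then retry the access */
          add_identity_map(address, end);
  }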

2020-09-07 | x86/boot/compressed/64: Rename kaslr_64.c to ident_map_64.c | Joerg Roedel | 4 | -10/+18

The file contains only code related to identity-mapped page tables. Rename the file and always compile it in.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-15-joro@8bytes.org

2020-09-07 | x86/boot/compressed/64: Add IDT Infrastructure | Joerg Roedel | 5 | -1/+144

Add the code needed to set up an IDT in the early pre-decompression boot code. The IDT is first loaded in startup_64, which runs after EfiExitBootServices() has been called, and is reloaded later, when the kernel image has been relocated to the end of the decompression area. This allows different IDT handlers to be set up before and after the relocation.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200907131613.12703-14-joro@8bytes.org
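
A sketch of the minimal machinery involved: 16-byte 64-bit interrupt gates plus an lidt load (gate layout per the Intel SDM; the selector value and names are assumptions, not the patch's code):

  struct idt_gate {
          u16 offset_low;
          u16 segment;        /* code segment selector */
          u16 bits;           /* IST, gate type, DPL, present */
          u16 offset_mid;
          u32 offset_high;
          u32 reserved;
  } __attribute__((packed));

  struct idt_desc {
          u16 size;
          u64 address;
  } __attribute__((packed));

  static struct idt_gate boot_idt[32];    /* exception vectors only */

  static void set_idt_entry(int vector, void *handler)
  {
          unsigned long addr = (unsigned long)handler;
          struct idt_gate *g = &boot_idt[vector];

          g->offset_low  = addr;
          g->segment     = 0x10;      /* boot code segment, assumed */
          g->bits        = 0x8e00;    /* present, DPL 0, 64-bit intr gate */
          g->offset_mid  = addr >> 16;
          g->offset_high = addr >> 32;
          g->reserved    = 0;
  }

  static void load_boot_idt(void)
  {
          struct idt_desc desc = {
                  .size    = sizeof(boot_idt) - 1,
                  .address = (unsigned long)boot_idt,
          };

          asm volatile("lidt %0" :: "m" (desc));
  }

Reloading the IDT after relocation is then just another load_boot_idt()-style call with the stage 2 handlers installed.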

2020-09-07 | x86/boot/compressed/64: Disable red-zone usage | Joerg Roedel | 1 | -1/+1

The x86-64 ABI defines a red zone on the stack:

  The 128-byte area beyond the location pointed to by %rsp is considered to be reserved and shall not be modified by signal or interrupt handlers. Therefore, functions may use this area for temporary data that is not needed across function calls. In particular, leaf functions may use this area for their entire stack frame, rather than adjusting the stack pointer in the prologue and epilogue. This area is known as the red zone.

This is not compatible with exception handling, because the IRET frame written by the hardware at the stack pointer, together with the functions called to handle the exception, will overwrite the temporary variables of the interrupted function, causing undefined behavior. So disable red-zone usage in the pre-decompression boot code.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lkml.kernel.org/r/20200907131613.12703-13-joro@8bytes.org

2020-09-07 | Merge 'x86/kaslr' to pick up dependent bits | Borislav Petkov | 2 | -135/+107

Signed-off-by: Borislav Petkov <bp@suse.de>

2020-09-03 | x86/boot/compressed: Warn on orphan section placement | Kees Cook | 1 | -0/+1

We don't want to depend on the linker's orphan section placement heuristics as these can vary between linkers, and may change between versions. All sections need to be explicitly handled in the linker script.

Now that all sections are explicitly handled, enable orphan section warnings.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20200902025347.2504702-6-keescook@chromium.org

2020-09-01 | x86/boot/compressed: Add missing debugging sections to output | Kees Cook | 1 | -0/+2

Include the missing DWARF and STABS sections in the compressed image, when they are present.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-29-keescook@chromium.org

2020-09-01 | x86/boot/compressed: Remove, discard, or assert for unwanted sections | Kees Cook | 2 | -2/+13

In preparation for warning on orphan sections, stop the linker from generating the .eh_frame* sections, discard unwanted non-zero-sized generated sections, and enforce other expected-to-be-zero-sized sections (since discarding them might hide problems with them suddenly gaining unexpected entries).

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-28-keescook@chromium.org

2020-09-01 | x86/boot/compressed: Reorganize zero-size section asserts | Kees Cook | 1 | -18/+26

For readability, move the zero-sized sections to the end after DISCARDS.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200821194310.3089815-27-keescook@chromium.org

2020-09-01 | vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG | Kees Cook | 1 | -0/+2

The .comment section doesn't belong in STABS_DEBUG. Split it out into a new macro named ELF_DETAILS. This will gain other non-debug sections that need to be accounted for when linking with --orphan-handling=warn.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20200821194310.3089815-5-keescook@chromium.org

2020-08-23 | treewide: Use fallthrough pseudo-keyword | Gustavo A. R. Silva | 2 | -3/+3

Replace the existing /* fall through */ comments and their variants with the new pseudo-keyword macro fallthrough [1]. Also, remove fall-through markings where they are unnecessary.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>

2020-08-20 | x86/build: Declutter the build output | Ingo Molnar | 1 | -4/+0

We have some really ancient debug printouts in the x86 boot image build code:

  Setup is 14108 bytes (padded to 14336 bytes).
  System is 8802 kB
  CRC 27e909d4

None of these ever helped debug any sort of breakage that I know of, and they clutter the build output. Remove them - if anyone needs to see the various interim stages of this to debug an obscure bug, they can add these printfs and more.

We still keep this one:

  Kernel: arch/x86/boot/bzImage is ready (#19)

As a sentimental leftover, plus the '#19' build count tag is mildly useful.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org

2020-08-19 | x86/boot/compressed: Use builtin mem functions for decompressor | Arvind Sankar | 2 | -9/+3

Since commits c041b5ad8640 ("x86, boot: Create a separate string.h file to provide standard string functions") and fb4cac573ef6 ("x86, boot: Move memcmp() into string.h and string.c"), the decompressor stub has been using the compiler's builtin memcpy, memset and memcmp functions, _except_ where it would likely have the largest impact, in the decompression code itself.

Remove the #undef's of memcpy and memset in misc.c so that the decompressor code also uses the compiler builtins.

The rationale given in the comment doesn't really apply: just because some functions use the out-of-line version is no reason to not use the builtin version in the rest. Replace the comment with an explanation of why memzero and memmove are being #define'd.

Drop the suggestion to #undef in boot/string.h as well: the out-of-line versions are not really optimized versions, they're generic code that's good enough for the preboot environment. The compiler will likely generate better code for constant-size memcpy/memset/memcmp if it is allowed to.

Most decompressors' performance is unchanged, with the exception of LZ4 and 64-bit ZSTD:

              Before   After
  LZ4  (32)    73ms     10ms
  LZ4  (64)   120ms     10ms
  ZSTD (64)    90ms     74ms

Measurements on QEMU on 2.2GHz Broadwell Xeon, using defconfig kernels.

Decompressor code size has small differences, with the largest being that 64-bit ZSTD decreases just over 2k. The largest code size increase was on 64-bit XZ, of about 400 bytes.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Suggested-by: Nick Terrell <nickrterrell@gmail.com>
Tested-by: Nick Terrell <nickrterrell@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
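
The resulting arrangement in misc.c looks roughly like this (a sketch of the idea, not the exact hunk):

  /*
   * memcpy/memset/memcmp now resolve to the compiler builtins. Only two
   * names still need help: the decompressors call memzero(), which the
   * C library doesn't have, and memmove() stays out of line because it
   * is used for overlapping copies during in-place decompression.
   */
  #define memzero(s, n)   memset((s), 0, (n))
  #define memmove         memmove

  /* Out-of-line version from boot/string.c, used by the decompressors */
  void *memmove(void *dest, const void *src, size_t n);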

2020-08-14 | x86/boot: Check that there are no run-time relocations | Arvind Sankar | 2 | -25/+11

Add a linker script check that there are no run-time relocations, and remove the old one that tries to check via looking for specially-named sections in the object files.

Drop the tests for the -fPIE compiler option and the -pie linker option, as they are available in all supported gcc and binutils versions (as well as clang and lld).

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Sedat Dilek <sedat.dilek@gmail.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-8-keescook@chromium.org

2020-08-14 | x86/boot: Remove run-time relocations from head_{32,64}.S | Arvind Sankar | 4 | -21/+18

The BFD linker generates run-time relocations for z_input_len and z_output_len, even though they are absolute symbols. This is fixed for binutils-2.35 [1]. Work around this for earlier versions by defining two variables input_len and output_len in addition to the symbols, and use them via position-independent references.

This eliminates the last two run-time relocations in the head code and allows us to drop the -z noreloc-overflow flag to the linker.

Move the -pie and --no-dynamic-linker LDFLAGS to LDFLAGS_vmlinux instead of KBUILD_LDFLAGS. There shouldn't be anything else getting linked, but this is the more logical location for these flags, and modversions might call the linker if an EXPORT_SYMBOL is left over accidentally in one of the decompressors.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=25754

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-7-keescook@chromium.org

2020-08-14 | x86/boot: Remove run-time relocations from .head.text code | Arvind Sankar | 2 | -78/+90

The assembly code in head_{32,64}.S, while meant to be position-independent, generates run-time relocations because it uses instructions such as

	leal	gdt(%edx), %eax

which make the assembler and linker think that the code is using %edx as an index into gdt, and hence gdt needs to be relocated to its run-time address.

On 32-bit, with LLD, Dmitry Golovin reports that this results in a link-time error with default options (i.e. unless -z notext is explicitly passed):

  LD      arch/x86/boot/compressed/vmlinux
  ld.lld: error: can't create dynamic relocation R_386_32 against local symbol in readonly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocations in the output

With the BFD linker, this generates a warning during the build, if --warn-shared-textrel is enabled, which at least Gentoo enables by default:

  LD      arch/x86/boot/compressed/vmlinux
  ld: arch/x86/boot/compressed/head_32.o: warning: relocation in read-only section `.head.text'
  ld: warning: creating a DT_TEXTREL in object

On 64-bit, it is not possible to link the kernel as -pie with lld, and it is only possible with a BFD linker that supports -z noreloc-overflow, i.e. versions >2.26. This is because these instructions cannot really be relocated: the displacement field is only 32-bits wide, and thus cannot be relocated for a 64-bit load address. The -z noreloc-overflow option simply overrides the linker error, and results in R_X86_64_RELATIVE relocations that apply a 64-bit relocation to a 32-bit field anyway. This happens to work because nothing will process these run-time relocations.

Start fixing this by removing relocations from .head.text:

- On 32-bit, use a base register that holds the address of the GOT and reference symbol addresses using @GOTOFF, i.e.

	leal	gdt@GOTOFF(%edx), %eax

- On 64-bit, most of the code can (and already does) use %rip-relative addressing, however the .code32 bits can't, and the 64-bit code also needs to reference symbol addresses as they will be after moving the compressed kernel to the end of the decompression buffer. For these cases, reference the symbols as an offset to startup_32 to avoid creating relocations, i.e.:

	leal	(gdt-startup_32)(%bp), %eax

This only works in .head.text as the subtraction cannot be represented as a PC-relative relocation unless startup_32 is in the same section as the code. Move efi32_pe_entry into .head.text so that it can use the same method to avoid relocations.

Reported-by: Dmitry Golovin <dima@golovin.in>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-6-keescook@chromium.org

2020-08-14 | x86/boot: Add .text.* to setup.ld | Arvind Sankar | 1 | -1/+1

GCC puts the main function into .text.startup when compiled with -Os (or -O2). This results in arch/x86/boot/main.c having a .text.startup section which is currently not included explicitly in the linker script setup.ld in the same directory.

The BFD linker places this orphan section immediately after .text, so this still works. However, LLD git, since [1], is choosing to place it immediately after the .bstext section instead (this is the first code section). This plays havoc with the section layout that setup.elf requires to create the setup header, for eg on 64-bit:

  LD      arch/x86/boot/setup.elf
  ld.lld: error: section .text.startup file range overlaps with .header
  >>> .text.startup range is [0x200040, 0x2001FE]
  >>> .header range is [0x2001EF, 0x20026B]
  ld.lld: error: section .header file range overlaps with .bsdata
  >>> .header range is [0x2001EF, 0x20026B]
  >>> .bsdata range is [0x2001FF, 0x200398]
  ld.lld: error: section .bsdata file range overlaps with .entrytext
  >>> .bsdata range is [0x2001FF, 0x200398]
  >>> .entrytext range is [0x20026C, 0x2002D3]
  ld.lld: error: section .text.startup virtual address range overlaps with .header
  >>> .text.startup range is [0x40, 0x1FE]
  >>> .header range is [0x1EF, 0x26B]
  ld.lld: error: section .header virtual address range overlaps with .bsdata
  >>> .header range is [0x1EF, 0x26B]
  >>> .bsdata range is [0x1FF, 0x398]
  ld.lld: error: section .bsdata virtual address range overlaps with .entrytext
  >>> .bsdata range is [0x1FF, 0x398]
  >>> .entrytext range is [0x26C, 0x2D3]
  ld.lld: error: section .text.startup load address range overlaps with .header
  >>> .text.startup range is [0x40, 0x1FE]
  >>> .header range is [0x1EF, 0x26B]
  ld.lld: error: section .header load address range overlaps with .bsdata
  >>> .header range is [0x1EF, 0x26B]
  >>> .bsdata range is [0x1FF, 0x398]
  ld.lld: error: section .bsdata load address range overlaps with .entrytext
  >>> .bsdata range is [0x1FF, 0x398]
  >>> .entrytext range is [0x26C, 0x2D3]

Add .text.* to the .text output section to fix this, and also prevent any future surprises if the compiler decides to create other such sections.

[1] https://reviews.llvm.org/D75225

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Link: https://lore.kernel.org/r/20200731230820.1742553-5-keescook@chromium.org

2020-08-14 | x86/boot/compressed: Get rid of GOT fixup code | Ard Biesheuvel | 3 | -80/+5

In a previous patch, we have eliminated GOT entries from the decompressor binary and added an assertion that the .got section is empty. This means that the GOT fixup routines that exist in both the 32-bit and 64-bit startup routines have become dead code, and can be removed.

While at it, drop the KEEP() from the linker script, as it has no effect on the contents of output sections that are created by the linker itself.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200731230820.1742553-4-keescook@chromium.org

2020-08-14 | x86/boot/compressed: Force hidden visibility for all symbol references | Ard Biesheuvel | 2 | -0/+2

Eliminate all GOT entries in the decompressor binary, by forcing hidden visibility for all symbol references, which informs the compiler that such references will be resolved at link time without the need for allocating GOT entries.

To ensure that no GOT entries will creep back in, add an assertion to the decompressor linker script that will fire if the .got section has a non-zero size.

[Arvind: move hidden.h to include/linux instead of making a copy]

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200731230820.1742553-3-keescook@chromium.org

2020-08-14 | x86/boot/compressed: Move .got.plt entries out of the .got section | Ard Biesheuvel | 1 | -1/+10

The .got.plt section contains the part of the GOT which is used by PLT entries, and which gets updated lazily by the dynamic loader when function calls are dispatched through those PLT entries.

On fully linked binaries such as the kernel proper or the decompressor, this never happens, and so in practice, the .got.plt section consists only of the first 3 magic entries that are meant to point at the _DYNAMIC section and at the fixup routine in the loader. However, since we don't use a dynamic loader, those entries are never populated or used.

This means that treating those entries like ordinary GOT entries, and updating their values based on the actual placement of the executable in memory is completely pointless, and we can just ignore the .got.plt section entirely, provided that it has no additional entries beyond the first 3 ones.

So add an assertion in the linker script to ensure that this assumption holds, and move the contents out of the [_got, _egot) memory range that is modified by the GOT fixup routines.

While at it, drop the KEEP(), since it has no effect on the contents of output sections that are created by the linker itself.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200731230820.1742553-2-keescook@chromium.org

2020-08-09 | Merge tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild | Linus Torvalds | 1 | -2/+2

Pull Kbuild updates from Masahiro Yamada:

 - run the checker (e.g. sparse) after the compiler
 - remove unneeded cc-option tests for old compiler flags
 - fix tar-pkg to install dtbs
 - introduce ccflags-remove-y and asflags-remove-y syntax
 - allow to trace functions in sub-directories of lib/
 - introduce hostprogs-always-y and userprogs-always-y syntax
 - various Makefile cleanups

* tag 'kbuild-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: stop filtering out $(GCC_PLUGINS_CFLAGS) from cc-option base
  kbuild: include scripts/Makefile.* only when relevant CONFIG is enabled
  kbuild: introduce hostprogs-always-y and userprogs-always-y
  kbuild: sort hostprogs before passing it to ifneq
  kbuild: move host .so build rules to scripts/gcc-plugins/Makefile
  kbuild: Replace HTTP links with HTTPS ones
  kbuild: trace functions in subdirectories of lib/
  kbuild: introduce ccflags-remove-y and asflags-remove-y
  kbuild: do not export LDFLAGS_vmlinux
  kbuild: always create directories of targets
  powerpc/boot: add DTB to 'targets'
  kbuild: buildtar: add dtbs support
  kbuild: remove cc-option test of -ffreestanding
  kbuild: remove cc-option test of -fno-stack-protector
  Revert "kbuild: Create directory for target DTB"
  kbuild: run the checker after the compiler

2020-08-06 | x86/kaslr: Replace strlen() with strnlen() | Arvind Sankar | 1 | -2/+6

strnlen is safer in case the command line is not NUL-terminated.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200803011534.730645-2-nivedita@alum.mit.edu
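
The pattern, sketched (the COMMAND_LINE_SIZE value here is illustrative):

  #include <string.h>

  #define COMMAND_LINE_SIZE 2048    /* x86 limit, shown for illustration */

  static size_t cmdline_len(const char *args)
  {
          /* Never scan past the buffer, even without a terminating NUL */
          return strnlen(args, COMMAND_LINE_SIZE - 1);
  }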

2020-07-31 | x86: Add support for ZSTD compressed kernel | Nick Terrell | 3 | -8/+9

 - Add support for zstd compressed kernel
 - Define __DISABLE_EXPORTS in Makefile
 - Remove __DISABLE_EXPORTS definition from kaslr.c
 - Bump the heap size for zstd.
 - Update the documentation.

Integrates the ZSTD decompression code into the x86 pre-boot code.

Zstandard requires slightly more memory during the kernel decompression on x86 (192 KB vs 64 KB), and the memory usage is independent of the window size.

__DISABLE_EXPORTS is now defined in the Makefile, which covers both the existing use in kaslr.c, and the use needed by the zstd decompressor in misc.c.

This patch has been boot-tested with both a zstd and gzip compressed kernel on i386 and x86_64 using buildroot and QEMU.

Additionally, this has been tested in production on x86_64 devices. We saw a 2 second boot time reduction by switching kernel compression from xz to zstd.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200730190841.2071656-7-nickrterrell@gmail.com

2020-07-31 | x86: Bump ZO_z_extra_bytes margin for zstd | Nick Terrell | 1 | -1/+7

Bump the ZO_z_extra_bytes margin for zstd.

Zstd needs 3 bytes per 128 KB, and has a 22-byte fixed overhead. Zstd needs to maintain 128 KB of space at all times, since that is the maximum block size. See the comments regarding in-place decompression added in lib/decompress_unzstd.c for details.

The existing code is written so that all the compression algorithms use the same ZO_z_extra_bytes. It is taken to be the maximum of the growth rate plus the maximum fixed overhead; the comments just above this diff document that calculation.

Signed-off-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20200730190841.2071656-6-nickrterrell@gmail.com

2020-07-31 | x86/kaslr: Add a check that the random address is in range | Arvind Sankar | 1 | -1/+11

Check in find_random_phys_addr() that the chosen address is inside the range that was required.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-22-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Make local variables 64-bit | Arvind Sankar | 1 | -6/+7

Change the type of local variables/fields that store mem_vector addresses to u64 to make it less likely that 32-bit overflow will cause issues on 32-bit.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-21-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Replace 'unsigned long long' with 'u64' | Arvind Sankar | 2 | -9/+8

No functional change.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-20-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Make minimum/image_size 'unsigned long' | Arvind Sankar | 1 | -2/+2

Change the type of the minimum/image_size arguments in process_mem_region() to 'unsigned long'. These can never actually be above 4G (even on x86_64), and they're 'unsigned long' in every other function except this one.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-19-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Small cleanup of find_random_phys_addr() | Arvind Sankar | 1 | -3/+2

Just a trivial rearrangement to do all the processing together, and only have one call to slots_fetch_random() in the source.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-18-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Drop unnecessary alignment in find_random_virt_addr() | Arvind Sankar | 1 | -5/+1

Drop the unnecessary alignment of image_size to CONFIG_PHYSICAL_ALIGN in find_random_virt_addr(); it cannot change the result: the largest valid slot is the largest n that satisfies

  minimum + n * CONFIG_PHYSICAL_ALIGN + image_size <= KERNEL_IMAGE_SIZE

(since minimum is already aligned), and so n is equal to

  (KERNEL_IMAGE_SIZE - minimum - image_size) / CONFIG_PHYSICAL_ALIGN

even if image_size is not aligned to CONFIG_PHYSICAL_ALIGN.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-17-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Drop redundant check in store_slot_info() | Arvind Sankar | 1 | -6/+3

Drop the unnecessary check that the number of slots is not zero in store_slot_info(); it's guaranteed to be at least 1 by the calculation.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-16-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Make the type of number of slots/slot areas consistent | Arvind Sankar | 1 | -5/+3

The number of slots can be 'unsigned int', since on 64-bit, the maximum amount of memory is 2^52, the minimum alignment is 2^21, so the slot number cannot be greater than 2^31. But in case future processors have more than 52 physical address bits, make it 'unsigned long'.

The slot areas are limited by MAX_SLOT_AREA, currently 100. It is indexed by an int, but the number of areas is stored as 'unsigned long'. Change both to 'unsigned int' for consistency.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-15-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Drop test for command-line parameters before parsing | Arvind Sankar | 1 | -4/+0

This check doesn't save anything. In the case when none of the parameters are present, each strstr will scan args twice (once to find the length and then for searching), six scans in total. Just going ahead and parsing the arguments only requires three scans: strlen, memcpy, and parsing.

This will be the first malloc, so free will actually free up the memory, so the check doesn't save heap space either.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-14-nivedita@alum.mit.edu

2020-07-31 | x86/kaslr: Simplify process_gb_huge_pages() | Arvind Sankar | 1 | -26/+21

Replace the loop that determines the number of 1GB pages with arithmetic.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-13-nivedita@alum.mit.edu
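
A sketch of the arithmetic: align the region start up to a 1GB boundary and divide the remainder (PUD_SIZE is 1GB on x86-64; struct mem_vector mirrors the KASLR code, and the helper name is illustrative):

  #define PUD_SIZE        (1ULL << 30)
  #define ALIGN_UP(x, a)  (((x) + (a) - 1) & ~((a) - 1))

  struct mem_vector {
          unsigned long long start;
          unsigned long long size;
  };

  static unsigned long nr_gb_huge_pages(const struct mem_vector *region)
  {
          unsigned long long aligned = ALIGN_UP(region->start, PUD_SIZE);

          /* Region too small to contain even one aligned 1GB page */
          if (aligned - region->start >= region->size)
                  return 0;

          return (region->size - (aligned - region->start)) / PUD_SIZE;
  }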

2020-07-31 | x86/kaslr: Short-circuit gb_huge_pages on x86-32 | Arvind Sankar | 1 | -2/+2

32-bit does not have GB pages, so don't bother checking for them. Using the IS_ENABLED() macro allows the compiler to completely remove the gb_huge_pages code.

Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20200728225722.67457-12-nivedita@alum.mit.edu
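
A sketch of the short-circuit pattern (the function body shown is a simplification of the real process_gb_huge_pages()):

  static void process_gb_huge_pages(struct mem_vector *region,
                                    unsigned long image_size)
  {
          /*
           * IS_ENABLED() expands to a compile-time constant: on 32-bit
           * builds the compiler sees an unconditional return and
           * eliminates everything below, including the references that
           * would otherwise keep the gb_huge_pages code alive.
           */
          if (!IS_ENABLED(CONFIG_X86_64))
                  return;

          /* ... 1GB-page carve-out for 64-bit, unchanged ... */
  }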