aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/setup.c
AgeCommit message (Collapse)AuthorFilesLines
2022-11-10x86/mtrr: Add a stop_machine() handler calling only cache_cpu_init()Juergen Gross1-1/+2
Instead of having a stop_machine() handler for either a specific MTRR register or all state at once, add a handler just for calling cache_cpu_init() if appropriate. Add functions for calling stop_machine() with this handler as well. Add a generic replacement for mtrr_bp_restore() and a wrapper for mtrr_bp_init(). Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2022-07-11x86/setup: Use rng seeds from setup_dataJason A. Donenfeld1-0/+10
Currently, the only way x86 can get an early boot RNG seed is via EFI, which is generally always used now for physical machines, but is very rarely used in VMs, especially VMs that are optimized for starting "instantaneously", such as Firecracker's MicroVM. For tiny fast booting VMs, EFI is not something you generally need or want. Rather, the image loader or firmware should be able to pass a single random seed, exactly as device tree platforms do with the "rng-seed" property. Additionally, this is something that bootloaders can append, with their own seed file management, which is something every other major OS ecosystem has that Linux does not (yet). Add SETUP_RNG_SEED, similar to the other eight setup_data entries that are parsed at boot. It also takes care to zero out the seed immediately after using, in order to retain forward secrecy. This all takes about 7 trivial lines of code. Then, on kexec_file_load(), a new fresh seed is generated and passed to the next kernel, just as is done on device tree architectures when using kexec. And, importantly, I've tested that QEMU is able to properly pass SETUP_RNG_SEED as well, making this work for every step of the way. This code too is pretty straight forward. Together these measures ensure that VMs and nested kexec()'d kernels always receive a proper boot time RNG seed at the earliest possible stage from their parents: - Host [already has strongly initialized RNG] - QEMU [passes fresh seed in SETUP_RNG_SEED field] - Linux [uses parent's seed and gathers entropy of its own] - kexec [passes this in SETUP_RNG_SEED field] - Linux [uses parent's seed and gathers entropy of its own] - kexec [passes this in SETUP_RNG_SEED field] - Linux [uses parent's seed and gathers entropy of its own] - kexec [passes this in SETUP_RNG_SEED field] - ... I've verified in several scenarios that this works quite well from a host kernel to QEMU and down inwards, mixing and matching loaders, with every layer providing a seed to the next. [ bp: Massage commit message. ] Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: H. Peter Anvin (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-07-01x86/kexec: Carry forward IMA measurement log on kexecJonathan McDowell1-0/+63
On kexec file load, the Integrity Measurement Architecture (IMA) subsystem may verify the IMA signature of the kernel and initramfs, and measure it. The command line parameters passed to the kernel in the kexec call may also be measured by IMA. A remote attestation service can verify a TPM quote based on the TPM event log, the IMA measurement list and the TPM PCR data. This can be achieved only if the IMA measurement log is carried over from the current kernel to the next kernel across the kexec call. PowerPC and ARM64 both achieve this using device tree with a "linux,ima-kexec-buffer" node. x86 platforms generally don't make use of device tree, so use the setup_data mechanism to pass the IMA buffer to the new kernel. Signed-off-by: Jonathan McDowell <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Mimi Zohar <[email protected]> # IMA function definitions Link: https://lore.kernel.org/r/YmKyvlF3my1yWTvK@noodles-fedora-PC23Y6EG
2022-06-13x86/mm: Fix RESERVE_BRK() for older binutilsJosh Poimboeuf1-5/+0
With binutils 2.26, RESERVE_BRK() causes a build failure: /tmp/ccnGOKZ5.s: Assembler messages: /tmp/ccnGOKZ5.s:98: Error: missing ')' /tmp/ccnGOKZ5.s:98: Error: missing ')' /tmp/ccnGOKZ5.s:98: Error: missing ')' /tmp/ccnGOKZ5.s:98: Error: junk at end of line, first unrecognized character is `U' The problem is this line: RESERVE_BRK(early_pgt_alloc, INIT_PGT_BUF_SIZE) Specifically, the INIT_PGT_BUF_SIZE macro which (via PAGE_SIZE's use _AC()) has a "1UL", which makes older versions of the assembler unhappy. Unfortunately the _AC() macro doesn't work for inline asm. Inline asm was only needed here to convince the toolchain to add the STT_NOBITS flag. However, if a C variable is placed in a section whose name is prefixed with ".bss", GCC and Clang automatically set STT_NOBITS. In fact, ".bss..page_aligned" already relies on this trick. So fix the build failure (and simplify the macro) by allocating the variable in C. Also, add NOLOAD to the ".brk" output section clause in the linker script. This is a failsafe in case the ".bss" prefix magic trick ever stops working somehow. If there's a section type mismatch, the GNU linker will force the ".brk" output section to be STT_NOBITS. The LLVM linker will fail with a "section type mismatch" error. Note this also changes the name of the variable from .brk.##name to __brk_##name. The variable names aren't actually used anywhere, so it's harmless. Fixes: a1e2c031ec39 ("x86/mm: Simplify RESERVE_BRK()") Reported-by: Joe Damato <[email protected]> Reported-by: Byungchul Park <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Joe Damato <[email protected]> Link: https://lore.kernel.org/r/22d07a44c80d8e8e1e82b9a806ddc8c6bbb2606e.1654759036.git.jpoimboe@kernel.org
2022-05-25x86/setup: Use strscpy() to replace deprecated strlcpy()XueBing Chen1-3/+3
strlcpy() is marked deprecated and should not be used, because it doesn't limit the source length. The preferred interface for when strlcpy()'s return value is not checked (truncation) is strscpy(). [ mingo: Tweaked the changelog ] Signed-off-by: XueBing Chen <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-04-04x86/cpu: Remove "noexec"Borislav Petkov1-3/+25
It doesn't make any sense to disable non-executable mappings - security-wise or else. So rip out that switch and move the remaining code into setup.c and delete setup_nx.c Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Lai Jiangshan <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-03-23x86/setup: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdefJisheng Zhang1-7/+3
Replace the conditional compilation using "#ifdef CONFIG_KEXEC_CORE" by a check for "IS_ENABLED(CONFIG_KEXEC_CORE)", to simplify the code and increase compile coverage. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jisheng Zhang <[email protected]> Acked-by: Baoquan He <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Russell King <[email protected]> Cc: Russell King (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-03-09x86/boot: Fix memremap of setup_indirect structuresRoss Philipson1-7/+27
As documented, the setup_indirect structure is nested inside the setup_data structures in the setup_data list. The code currently accesses the fields inside the setup_indirect structure but only the sizeof(struct setup_data) is being memremapped. No crash occurred but this is just due to how the area is remapped under the covers. Properly memremap both the setup_data and setup_indirect structures in these cases before accessing them. Fixes: b3c72fc9a78e ("x86/boot: Introduce setup_indirect") Signed-off-by: Ross Philipson <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Daniel Kiper <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-01-10Merge tag 'x86_misc_for_v5.17_rc1' of ↵Linus Torvalds1-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: "The pile which we cannot find the proper topic for so we stick it in x86/misc: - Add support for decoding instructions which do MMIO accesses in order to use it in SEV and TDX guests - An include fix and reorg to allow for removing set_fs in UML later" * tag 'x86_misc_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mtrr: Remove the mtrr_bp_init() stub x86/sev-es: Use insn_decode_mmio() for MMIO implementation x86/insn-eval: Introduce insn_decode_mmio() x86/insn-eval: Introduce insn_get_modrm_reg_ptr() x86/insn-eval: Handle insn_get_opcode() failure
2021-12-22x86/mtrr: Remove the mtrr_bp_init() stubChristoph Hellwig1-1/+6
Add an IS_ENABLED() check in setup_arch() and call pat_disable() directly if MTRRs are not supported. This allows to remove the <asm/memtype.h> include in <asm/mtrr.h>, which pull in lowlevel x86 headers that should not be included for UML builds and will cause build warnings with a following patch. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-15x86/boot: Move EFI range reservation after cmdline parsingMike Rapoport1-3/+3
The memory reservation in arch/x86/platform/efi/efi.c depends on at least two command line parameters. Put it back later in the boot process and move efi_memblock_x86_reserve_range() out of early_memory_reserve(). An attempt to fix this was done in 8d48bf8206f7 ("x86/boot: Pull up cmdline preparation and early param parsing") but that caused other troubles so it got reverted. The bug this is addressing is: Dan reports that Anjaneya Chagam can no longer use the efi=nosoftreserve kernel command line parameter to suppress "soft reservation" behavior. This is due to the fact that the following call-chain happens at boot: early_reserve_memory |-> efi_memblock_x86_reserve_range |-> efi_fake_memmap_early which does if (!efi_soft_reserve_enabled()) return; and that would have set EFI_MEM_NO_SOFT_RESERVE after having parsed "nosoftreserve". However, parse_early_param() gets called *after* it, leading to the boot cmdline not being taken into account. See also https://lore.kernel.org/r/[email protected] [ bp: Turn into a proper patch. ] Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-15Revert "x86/boot: Pull up cmdline preparation and early param parsing"Borislav Petkov1-39/+27
This reverts commit 8d48bf8206f77aa8687f0e241e901e5197e52423. It turned out to be a bad idea as it broke supplying mem= cmdline parameters due to parse_memopt() requiring preparatory work like setting up the e820 table in e820__memory_setup() in order to be able to exclude the range specified by mem=. Pulling that up would've broken Xen PV again, see threads at https://lkml.kernel.org/r/[email protected] due to xen_memory_setup() needing the first reservations in early_reserve_memory() - kernel and initrd - to have happened already. This could be fixed again by having Xen do those reservations itself... Long story short, revert this and do a simpler fix in a later patch. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-12-15Revert "x86/boot: Mark prepare_command_line() __init"Borislav Petkov1-1/+1
This reverts commit c0f2077baa4113f38f008b8e912b9fb3ff8d43df. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-11-24x86/boot: Mark prepare_command_line() __initBorislav Petkov1-1/+1
Fix: WARNING: modpost: vmlinux.o(.text.unlikely+0x64d0): Section mismatch in reference \ from the function prepare_command_line() to the variable .init.data:command_line The function prepare_command_line() references the variable __initdata command_line. This is often because prepare_command_line lacks a __initdata annotation or the annotation of command_line is wrong. Apparently some toolchains do different inlining decisions. Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-11-15x86/boot: Pull up cmdline preparation and early param parsingBorislav Petkov1-27/+39
Dan reports that Anjaneya Chagam can no longer use the efi=nosoftreserve kernel command line parameter to suppress "soft reservation" behavior. This is due to the fact that the following call-chain happens at boot: early_reserve_memory |-> efi_memblock_x86_reserve_range |-> efi_fake_memmap_early which does if (!efi_soft_reserve_enabled()) return; and that would have set EFI_MEM_NO_SOFT_RESERVE after having parsed "nosoftreserve". However, parse_early_param() gets called *after* it, leading to the boot cmdline not being taken into account. Therefore, carve out the command line preparation into a separate function which does the early param parsing too. So that it all goes together. And then call that function before early_reserve_memory() so that the params would have been parsed by then. Fixes: 8aa83e6395ce ("x86/setup: Call early_reserve_memory() earlier") Reported-by: Dan Williams <[email protected]> Reviewed-by: Dan Williams <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Anjaneya Chagam <[email protected]> Cc: <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-11-06memblock: rename memblock_free to memblock_phys_freeMike Rapoport1-2/+2
Since memblock_free() operates on a physical range, make its name reflect it and rename it to memblock_phys_free(), so it will be a logical counterpart to memblock_phys_alloc(). The callers are updated with the below semantic patch: @@ expression addr; expression size; @@ - memblock_free(addr, size); + memblock_phys_free(addr, size); Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Shahab Vahedi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-21x86/setup: Call early_reserve_memory() earlierJuergen Gross1-12/+14
Commit in Fixes introduced early_reserve_memory() to do all needed initial memblock_reserve() calls in one function. Unfortunately, the call of early_reserve_memory() is done too late for Xen dom0, as in some cases a Xen hook called by e820__memory_setup() will need those memory reservations to have happened already. Move the call of early_reserve_memory() before the call of e820__memory_setup() in order to avoid such problems. Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") Reported-by: Marek Marczykowski-Górecki <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Marek Marczykowski-Górecki <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-09-01x86/setup: Explicitly include acpi.hNathan Chancellor1-0/+1
After commit 342f43af70db ("iscsi_ibft: fix crash due to KASLR physical memory remapping") x86_64_defconfig shows the following errors: arch/x86/kernel/setup.c: In function ‘setup_arch’: arch/x86/kernel/setup.c:916:13: error: implicit declaration of function ‘acpi_mps_check’ [-Werror=implicit-function-declaration] 916 | if (acpi_mps_check()) { | ^~~~~~~~~~~~~~ arch/x86/kernel/setup.c:1110:9: error: implicit declaration of function ‘acpi_table_upgrade’ [-Werror=implicit-function-declaration] 1110 | acpi_table_upgrade(); | ^~~~~~~~~~~~~~~~~~ [... more acpi noise ...] acpi.h was being implicitly included from iscsi_ibft.h in this configuration so the removal of that header means these functions have no definition or declaration. In most other configurations, <linux/acpi.h> continued to be included through at least <linux/tboot.h> if CONFIG_INTEL_TXT was enabled, and there were probably other implicit include paths too. Add acpi.h explicitly so there is no more error, and so that we don't continue to depend on these unreliable implicit include paths. Tested-by: Matthieu Baerts <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Cc: Maurizio Lombardi <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-08-31Merge branch 'stable/for-linus-5.15' of ↵Linus Torvalds1-10/+0
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft Pull ibft updates from Konrad Rzeszutek Wilk: "A fix for iBFT parsing code badly interfacing when KASLR is enabled" * 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/ibft: iscsi_ibft: fix warning in reserve_ibft_region() iscsi_ibft: fix crash due to KASLR physical memory remapping
2021-07-31iscsi_ibft: fix crash due to KASLR physical memory remappingMaurizio Lombardi1-10/+0
Starting with commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") memory reservations have been moved earlier during the boot process, before the execution of the Kernel Address Space Layout Randomization code. setup_arch() calls the iscsi_ibft's find_ibft_region() function to find and reserve the memory dedicated to the iBFT and this function also saves a virtual pointer to the iBFT table for later use. The problem is that if KALSR is active, the physical memory gets remapped somewhere else in the virtual address space and the pointer is no longer valid, this will cause a kernel panic when the iscsi driver tries to dereference it. iBFT detected. BUG: unable to handle page fault for address: ffff888000099fd8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI ..snip.. Call Trace: ? ibft_create_kobject+0x1d2/0x1d2 [iscsi_ibft] do_one_initcall+0x44/0x1d0 ? kmem_cache_alloc_trace+0x119/0x220 do_init_module+0x5c/0x270 __do_sys_init_module+0x12e/0x1b0 do_syscall_64+0x40/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fix this bug by saving the address of the physical location of the ibft; later the driver will use isa_bus_to_virt() to get the correct virtual address. N.B. On each reboot KASLR randomizes the virtual addresses so assuming phys_to_virt before KASLR does its deed is incorrect. Simplify the code by renaming find_ibft_region() to reserve_ibft_region() and remove all the wrappers. Signed-off-by: Maurizio Lombardi <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2021-07-08x86: convert to setup_initial_init_mm()Kefeng Wang1-4/+1
Use setup_initial_init_mm() helper to simplify code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kefeng Wang <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-0/+1
Merge more updates from Andrew Morton: "190 patches. Subsystems affected by this patch series: mm (hugetlb, userfaultfd, vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock, migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap, zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc, core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs, signals, exec, kcov, selftests, compress/decompress, and ipc" * emailed patches from Andrew Morton <[email protected]>: (190 commits) ipc/util.c: use binary search for max_idx ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock ipc: use kmalloc for msg_queue and shmid_kernel ipc sem: use kvmalloc for sem_undo allocation lib/decompressors: remove set but not used variabled 'level' selftests/vm/pkeys: exercise x86 XSAVE init state selftests/vm/pkeys: refill shadow register after implicit kernel write selftests/vm/pkeys: handle negative sys_pkey_alloc() return code selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random kcov: add __no_sanitize_coverage to fix noinstr for all architectures exec: remove checks in __register_bimfmt() x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned hfsplus: report create_date to kstat.btime hfsplus: remove unnecessary oom message nilfs2: remove redundant continue statement in a while-loop kprobes: remove duplicated strong free_insn_page in x86 and s390 init: print out unknown kernel parameters checkpatch: do not complain about positive return values starting with EPOLL checkpatch: improve the indented label test checkpatch: scripts/spdxcheck.py now requires python3 ...
2021-07-01kernel.h: split out panic and oops helpersAndy Shevchenko1-0/+1
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out panic and oops helpers. There are several purposes of doing this: - dropping dependency in bug.h - dropping a loop by moving out panic_notifier.h - unload kernel.h from something which has its own domain At the same time convert users tree-wide to use new headers, although for the time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [[email protected]: thread_info.h needs limits.h] [[email protected]: ia64 fix] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Co-developed-by: Andrew Morton <[email protected]> Acked-by: Mike Rapoport <[email protected]> Acked-by: Corey Minyard <[email protected]> Acked-by: Christian Brauner <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Acked-by: Kees Cook <[email protected]> Acked-by: Wei Liu <[email protected]> Acked-by: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Sebastian Reichel <[email protected]> Acked-by: Luis Chamberlain <[email protected]> Acked-by: Stephen Boyd <[email protected]> Acked-by: Thomas Bogendoerfer <[email protected]> Acked-by: Helge Deller <[email protected]> # parisc Signed-off-by: Linus Torvalds <[email protected]>
2021-06-08x86/setup: Document that Windows reserves the first MiBBorislav Petkov1-10/+11
It does so unconditionally too, on Intel and AMD machines, to work around BIOS bugs, as confirmed by Microsoft folks (see Link for full details). Reflow the paragraph, while at it. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/MWHPR21MB159330952629D36EEDE706B3D7379@MWHPR21MB1593.namprd21.prod.outlook.com
2021-06-07x86/setup: Remove CONFIG_X86_RESERVE_LOW and reservelow= optionsMike Rapoport1-24/+0
The CONFIG_X86_RESERVE_LOW build time and reservelow= command line option allowed to control the amount of memory under 1M that would be reserved at boot to avoid using memory that can be potentially clobbered by BIOS. Since the entire range under 1M is always reserved there is no need for these options anymore and they can be removed. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-03x86/setup: Always reserve the first 1M of RAMMike Rapoport1-14/+21
There are BIOSes that are known to corrupt the memory under 1M, or more precisely under 640K because the memory above 640K is anyway reserved for the EGA/VGA frame buffer and BIOS. To prevent usage of the memory that will be potentially clobbered by the kernel, the beginning of the memory is always reserved. The exact size of the reserved area is determined by CONFIG_X86_RESERVE_LOW build time and the "reservelow=" command line option. The reserved range may be from 4K to 640K with the default of 64K. There are also configurations that reserve the entire 1M range, like machines with SandyBridge graphic devices or systems that enable crash kernel. In addition to the potentially clobbered memory, EBDA of unknown size may be as low as 128K and the memory above that EBDA start is also reserved early. It would have been possible to reserve the entire range under 1M unless for the real mode trampoline that must reside in that area. To accommodate placement of the real mode trampoline and keep the memory safe from being clobbered by BIOS, reserve the first 64K of RAM before memory allocations are possible and then, after the real mode trampoline is allocated, reserve the entire range from 0 to 1M. Update trim_snb_memory() and reserve_real_mode() to avoid redundant reservations of the same memory range. Also make sure the memory under 1M is not getting freed by efi_free_boot_services(). [ bp: Massage commit message and comments. ] Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Hugh Dickins <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=213177 Link: https://lkml.kernel.org/r/[email protected]
2021-05-31x86/thermal: Fix LVT thermal setup for SMI delivery modeBorislav Petkov1-0/+9
There are machines out there with added value crap^WBIOS which provide an SMI handler for the local APIC thermal sensor interrupt. Out of reset, the BSP on those machines has something like 0x200 in that APIC register (timestamps left in because this whole issue is timing sensitive): [ 0.033858] read lvtthmr: 0x330, val: 0x200 which means: - bit 16 - the interrupt mask bit is clear and thus that interrupt is enabled - bits [10:8] have 010b which means SMI delivery mode. Now, later during boot, when the kernel programs the local APIC, it soft-disables it temporarily through the spurious vector register: setup_local_APIC: ... /* * If this comes from kexec/kcrash the APIC might be enabled in * SPIV. Soft disable it before doing further initialization. */ value = apic_read(APIC_SPIV); value &= ~APIC_SPIV_APIC_ENABLED; apic_write(APIC_SPIV, value); which means (from the SDM): "10.4.7.2 Local APIC State After It Has Been Software Disabled ... * The mask bits for all the LVT entries are set. Attempts to reset these bits will be ignored." And this happens too: [ 0.124111] APIC: Switch to symmetric I/O mode setup [ 0.124117] lvtthmr 0x200 before write 0xf to APIC 0xf0 [ 0.124118] lvtthmr 0x10200 after write 0xf to APIC 0xf0 This results in CPU 0 soft lockups depending on the placement in time when the APIC soft-disable happens. Those soft lockups are not 100% reproducible and the reason for that can only be speculated as no one tells you what SMM does. Likely, it confuses the SMM code that the APIC is disabled and the thermal interrupt doesn't doesn't fire at all, leading to CPU 0 stuck in SMM forever... Now, before 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()") due to how the APIC_LVTTHMR was read before APIC initialization in mcheck_intel_therm_init(), it would read the value with the mask bit 16 clear and then intel_init_thermal() would replicate it onto the APs and all would be peachy - the thermal interrupt would remain enabled. But that commit moved that reading to a later moment in intel_init_thermal(), resulting in reading APIC_LVTTHMR on the BSP too late and with its interrupt mask bit set. Thus, revert back to the old behavior of reading the thermal LVT register before the APIC gets initialized. Fixes: 4f432e8bb15b ("x86/mce: Get rid of mcheck_intel_therm_init()") Reported-by: James Feeney <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: <[email protected]> Cc: Zhang Rui <[email protected]> Cc: Srinivas Pandruvada <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-27Merge tag 'x86_core_for_v5.13' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 updates from Borislav Petkov: - Turn the stack canary into a normal __percpu variable on 32-bit which gets rid of the LAZY_GS stuff and a lot of code. - Add an insn_decode() API which all users of the instruction decoder should preferrably use. Its goal is to keep the details of the instruction decoder away from its users and simplify and streamline how one decodes insns in the kernel. Convert its users to it. - kprobes improvements and fixes - Set the maximum DIE per package variable on Hygon - Rip out the dynamic NOP selection and simplify all the machinery around selecting NOPs. Use the simplified NOPs in objtool now too. - Add Xeon Sapphire Rapids to list of CPUs that support PPIN - Simplify the retpolines by folding the entire thing into an alternative now that objtool can handle alternatives with stack ops. Then, have objtool rewrite the call to the retpoline with the alternative which then will get patched at boot time. - Document Intel uarch per models in intel-family.h - Make Sub-NUMA Clustering topology the default and Cluster-on-Die the exception on Intel. * tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits) x86, sched: Treat Intel SNC topology as default, COD as exception x86/cpu: Comment Skylake server stepping too x86/cpu: Resort and comment Intel models objtool/x86: Rewrite retpoline thunk calls objtool: Skip magical retpoline .altinstr_replacement objtool: Cache instruction relocs objtool: Keep track of retpoline call sites objtool: Add elf_create_undef_symbol() objtool: Extract elf_symbol_add() objtool: Extract elf_strtab_concat() objtool: Create reloc sections implicitly objtool: Add elf_create_reloc() helper objtool: Rework the elf_rebuild_reloc_section() logic objtool: Fix static_call list generation objtool: Handle per arch retpoline naming objtool: Correctly handle retpoline thunk calls x86/retpoline: Simplify retpolines x86/alternatives: Optimize optimize_nops() x86: Add insn_decode_kernel() x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration ...
2021-04-26Merge tag 'x86_cleanups_for_v5.13' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 cleanups from Borislav Petkov: "Trivial cleanups and fixes all over the place" * tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Remove me from IDE/ATAPI section x86/pat: Do not compile stubbed functions when X86_PAT is off x86/asm: Ensure asm/proto.h can be included stand-alone x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files x86/msr: Make locally used functions static x86/cacheinfo: Remove unneeded dead-store initialization x86/process/64: Move cpu_current_top_of_stack out of TSS tools/turbostat: Unmark non-kernel-doc comment x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL() x86/fpu/math-emu: Fix function cast warning x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes x86: Fix various typos in comments, take #2 x86: Remove unusual Unicode characters from comments x86/kaslr: Return boolean values from a function returning bool x86: Fix various typos in comments x86/setup: Remove unused RESERVE_BRK_ARRAY() stacktrace: Move documentation for arch_stack_walk_reliable() to header x86: Remove duplicate TSC DEADLINE MSR definitions
2021-04-26Merge tag 'x86_boot_for_v5.13' of ↵Linus Torvalds1-51/+58
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 boot updates from Borislav Petkov: "Consolidation and cleanup of the early memory reservations, along with a couple of gcc11 warning fixes" * tag 'x86_boot_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/setup: Move trim_snb_memory() later in setup_arch() to fix boot hangs x86/setup: Merge several reservations of start of memory x86/setup: Consolidate early memory reservations x86/boot/compressed: Avoid gcc-11 -Wstringop-overread warning x86/boot/tboot: Avoid Wstringop-overread-warning
2021-04-14x86/setup: Move trim_snb_memory() later in setup_arch() to fix boot hangsMike Rapoport1-5/+15
Commit a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") moved reservation of the memory inaccessible by Sandy Bride integrated graphics very early, and, as a result, on systems with such devices the first 1M was reserved by trim_snb_memory() which prevented the allocation of the real mode trampoline and made the boot hang very early. Since the purpose of trim_snb_memory() is to prevent problematic pages ever reaching the graphics device, it is safe to reserve these pages after memblock allocations are possible. Move trim_snb_memory() later in boot so that it will be called after reserve_real_mode() and make comments describing trim_snb_memory() operation more elaborate. [ bp: Massage a bit. ] Fixes: a799c2bd29d1 ("x86/setup: Consolidate early memory reservations") Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Randy Dunlap <[email protected]> Tested-by: Hugh Dickins <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-13ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade()Rafael J. Wysocki1-3/+2
Commit 1a1c130ab757 ("ACPI: tables: x86: Reserve memory occupied by ACPI tables") attempted to address an issue with reserving the memory occupied by ACPI tables, but it broke the initrd-based table override mechanism relied on by multiple users. To restore the initrd-based ACPI table override functionality, move the acpi_boot_table_init() invocation in setup_arch() on x86 after the acpi_table_upgrade() one. Fixes: 1a1c130ab757 ("ACPI: tables: x86: Reserve memory occupied by ACPI tables") Reported-by: Hans de Goede <[email protected]> Tested-by: Hans de Goede <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2021-03-29ACPI: tables: x86: Reserve memory occupied by ACPI tablesRafael J. Wysocki1-5/+3
The following problem has been reported by George Kennedy: Since commit 7fef431be9c9 ("mm/page_alloc: place pages to tail in __free_pages_core()") the following use after free occurs intermittently when ACPI tables are accessed. BUG: KASAN: use-after-free in ibft_init+0x134/0xc49 Read of size 4 at addr ffff8880be453004 by task swapper/0/1 CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1-7a7fd0d #1 Call Trace: dump_stack+0xf6/0x158 print_address_description.constprop.9+0x41/0x60 kasan_report.cold.14+0x7b/0xd4 __asan_report_load_n_noabort+0xf/0x20 ibft_init+0x134/0xc49 do_one_initcall+0xc4/0x3e0 kernel_init_freeable+0x5af/0x66b kernel_init+0x16/0x1d0 ret_from_fork+0x22/0x30 ACPI tables mapped via kmap() do not have their mapped pages reserved and the pages can be "stolen" by the buddy allocator. Apparently, on the affected system, the ACPI table in question is not located in "reserved" memory, like ACPI NVS or ACPI Data, that will not be used by the buddy allocator, so the memory occupied by that table has to be explicitly reserved to prevent the buddy allocator from using it. In order to address this problem, rearrange the initialization of the ACPI tables on x86 to locate the initial tables earlier and reserve the memory occupied by them. The other architectures using ACPI should not be affected by this change. Link: https://lore.kernel.org/linux-acpi/[email protected]/ Reported-by: George Kennedy <[email protected]> Tested-by: George Kennedy <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Cc: 5.10+ <[email protected]> # 5.10+
2021-03-23x86/setup: Merge several reservations of start of memoryMike Rapoport1-9/+10
Currently, the first several pages are reserved both to avoid leaking their contents on systems with L1TF and to avoid corrupting BIOS memory. Merge the two memory reservations. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-03-23x86/setup: Consolidate early memory reservationsMike Rapoport1-48/+44
The early reservations of memory areas used by the firmware, bootloader, kernel text and data are spread over setup_arch(). Moreover, some of them happen *after* memblock allocations, e.g trim_platform_memory_ranges() and trim_low_memory_range() are called after reserve_real_mode() that allocates memory. There was no corruption of these memory regions because memblock always allocates memory either from the end of memory (in top-down mode) or above the kernel image (in bottom-up mode). However, the bottom up mode is going to be updated to span the entire memory [1] to avoid limitations caused by KASLR. Consolidate early memory reservations in a dedicated function to improve robustness against future changes. Having the early reservations in one place also makes it clearer what memory must be reserved before memblock allocations are allowed. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Baoquan He <[email protected]> Acked-by: Borislav Petkov <[email protected]> Acked-by: David Hildenbrand <[email protected]> Link: [1] https://lore.kernel.org/lkml/[email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-03-15x86: Remove dynamic NOP selectionPeter Zijlstra1-1/+0
This ensures that a NOP is a NOP and not a random other instruction that is also a NOP. It allows simplification of dynamic code patching that wants to verify existing code before writing new instructions (ftrace, jump_label, static_call, etc..). Differentiating on NOPs is not a feature. This pessimises 32bit (DONTCARE) and 32bit on 64bit CPUs (CARELESS). 32bit is not a performance target. Everything x86_64 since AMD K10 (2007) and Intel IvyBridge (2012) is fine with using NOPL (as opposed to prefix NOP). And per FEATURE_NOPL being required for x86_64, all x86_64 CPUs can use NOPL. So stop caring about NOPs, simplify things and get on with life. [ The problem seems to be that some uarchs can only decode NOPL on a single front-end port while others have severe decode penalties for excessive prefixes. All modern uarchs can handle both, except Atom, which has prefix penalties. ] [ Also, much doubt you can actually measure any of this on normal workloads. ] After this, FEATURE_NOPL is unused except for required-features for x86_64. FEATURE_K8 is only used for PTI. [ bp: Kernel build measurements showed ~0.3s slowdown on Sandybridge which is hardly a slowdown. Get rid of X86_FEATURE_K7, while at it. ] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> # bpf Acked-by: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-03-11x86/setup: Remove unused RESERVE_BRK_ARRAY()Cao jin1-3/+3
Since a13f2ef168cb ("x86/xen: remove 32-bit Xen PV guest support"), RESERVE_BRK_ARRAY() has no user anymore so drop it. Update related comments too. Signed-off-by: Cao jin <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-02-15sfi: Remove framework for deprecated firmwareAndy Shevchenko1-2/+0
SFI-based platforms are gone. So does this framework. This removes mention of SFI through the drivers and other code as well. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Acked-by: Linus Walleij <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2021-02-04Revert "x86/setup: don't remove E820_TYPE_RAM for pfn 0"Mike Rapoport1-9/+11
This reverts commit bde9cfa3afe4324ec251e4af80ebf9b7afaf7afe. Changing the first memory page type from E820_TYPE_RESERVED to E820_TYPE_RAM makes it a part of "System RAM" resource rather than a reserved resource and this in turn causes devmem_is_allowed() to treat is as area that can be accessed but it is filled with zeroes instead of the actual data as previously. The change in /dev/mem output causes lilo to fail as was reported at slakware users forum, and probably other legacy applications will experience similar problems. Link: https://www.linuxquestions.org/questions/slackware-14/slackware-current-lilo-vesa-warnings-after-recent-updates-4175689617/#post6214439 Signed-off-by: Mike Rapoport <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2021-01-24x86/setup: don't remove E820_TYPE_RAM for pfn 0Mike Rapoport1-11/+9
Patch series "mm: fix initialization of struct page for holes in memory layout", v3. Commit 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") exposed several issues with the memory map initialization and these patches fix those issues. Initially there were crashes during compaction that Qian Cai reported back in April [1]. It seemed back then that the problem was fixed, but a few weeks ago Andrea Arcangeli hit the same bug [2] and there was an additional discussion at [3]. [1] https://lore.kernel.org/lkml/[email protected] [2] https://lore.kernel.org/lkml/[email protected] [3] https://lore.kernel.org/mm-commits/20201206005401.qKuAVgOXr%[email protected] This patch (of 2): The first 4Kb of memory is a BIOS owned area and to avoid its allocation for the kernel it was not listed in e820 tables as memory. As the result, pfn 0 was never recognised by the generic memory management and it is not a part of neither node 0 nor ZONE_DMA. If set_pfnblock_flags_mask() would be ever called for the pageblock corresponding to the first 2Mbytes of memory, having pfn 0 outside of ZONE_DMA would trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page); Along with reserving the first 4Kb in e820 tables, several first pages are reserved with memblock in several places during setup_arch(). These reservations are enough to ensure the kernel does not touch the BIOS area and it is not necessary to remove E820_TYPE_RAM for pfn 0. Remove the update of e820 table that changes the type of pfn 0 and move the comment describing why it was done to trim_low_memory_range() that reserves the beginning of the memory. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Qian Cai <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-16Merge branch 'stable/for-linus-5.11' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb update from Konrad Rzeszutek Wilk: "A generic (but for right now engaged only with AMD SEV) mechanism to adjust a larger size SWIOTLB based on the total memory of the SEV guests which right now require the bounce buffer for interacting with the outside world. Normal knobs (swiotlb=XYZ) still work" * 'stable/for-linus-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: x86,swiotlb: Adjust SWIOTLB bounce buffer size for SEV guests
2020-12-11x86,swiotlb: Adjust SWIOTLB bounce buffer size for SEV guestsAshish Kalra1-0/+6
For SEV, all DMA to and from guest has to use shared (un-encrypted) pages. SEV uses SWIOTLB to make this happen without requiring changes to device drivers. However, depending on the workload being run, the default 64MB of it might not be enough and it may run out of buffers to use for DMA, resulting in I/O errors and/or performance degradation for high I/O workloads. Adjust the default size of SWIOTLB for SEV guests using a percentage of the total memory available to guest for the SWIOTLB buffers. Adds a new sev_setup_arch() function which is invoked from setup_arch() and it calls into a new swiotlb generic code function swiotlb_adjust_size() to do the SWIOTLB buffer adjustment. v5 fixed build errors and warnings as Reported-by: kbuild test robot <[email protected]> Signed-off-by: Ashish Kalra <[email protected]> Co-developed-by: Borislav Petkov <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2020-10-28x86/setup: Remove unused MCA variablesBorislav Petkov1-5/+0
Commit bb8187d35f82 ("MCA: delete all remaining traces of microchannel bus support.") removed the remaining traces of Micro Channel Architecture support but one trace remained - three variables in setup.c which have been unused since 2012 at least. Drop them finally. Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-15Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds1-0/+2
Pull dma-mapping updates from Christoph Hellwig: - rework the non-coherent DMA allocator - move private definitions out of <linux/dma-mapping.h> - lower CMA_ALIGNMENT (Paul Cercueil) - remove the omap1 dma address translation in favor of the common code - make dma-direct aware of multiple dma offset ranges (Jim Quinlan) - support per-node DMA CMA areas (Barry Song) - increase the default seg boundary limit (Nicolin Chen) - misc fixes (Robin Murphy, Thomas Tai, Xu Wang) - various cleanups * tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits) ARM/ixp4xx: add a missing include of dma-map-ops.h dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling dma-direct: factor out a dma_direct_alloc_from_pool helper dma-direct check for highmem pages in dma_direct_alloc_pages dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h> dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma dma-mapping: move dma-debug.h to kernel/dma/ dma-mapping: remove <asm/dma-contiguous.h> dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h> dma-contiguous: remove dma_contiguous_set_default dma-contiguous: remove dev_set_cma_area dma-contiguous: remove dma_declare_contiguous dma-mapping: split <linux/dma-mapping.h> cma: decrease CMA_ALIGNMENT lower limit to 2 firewire-ohci: use dma_alloc_pages dma-iommu: implement ->alloc_noncoherent dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods dma-mapping: add a new dma_alloc_pages API dma-mapping: remove dma_cache_sync 53c700: convert to dma_alloc_noncoherent ...
2020-10-14Merge tag 'acpi-5.10-rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code. Specifics: - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron) - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo) - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada) - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki) - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: + Add predefined names from the SMBus sepcification (Bob Moore) + Update acpi_help UUID list (Bob Moore) + Return exceptions for string-to-integer conversions in iASL (Bob Moore) + Add a new "ALL <NameSeg>" debugger command (Bob Moore) + Add support for 64 bit risc-v compilation (Colin Ian King) + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap) - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung) - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko) - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo) - Add missing config_item_put() to fix refcount leak (Hanjun Guo) - Drop lefrover field from struct acpi_memory_device (Hanjun Guo) - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings) - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov) - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao) - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing) - Serialize tools/power/acpi Makefile (Thomas Renninger)" * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) ACPICA: Update version to 20200925 Version 20200925 ACPICA: Remove unnecessary semicolon ACPICA: Debugger: Add a new command: "ALL <NameSeg>" ACPICA: iASL: Return exceptions for string-to-integer conversions ACPICA: acpi_help: Update UUID list ACPICA: Add predefined names found in the SMBus sepcification ACPICA: Tree-wide: fix various typos and spelling mistakes ACPICA: Drop the repeated word "an" in a comment ACPICA: Add support for 64 bit risc-v compilation ACPI: button: fix handling lid state changes when input device closed tools/power/acpi: Serialize Makefile ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug() ACPI: memhotplug: Remove 'state' from struct acpi_memory_device ACPI / extlog: Check for RDMSR failure ACPI: Make acpi_evaluate_dsm() prototype consistent docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ...
2020-10-13x86/setup: simplify reserve_crashkernel()Mike Rapoport1-26/+14
* Replace magic numbers with defines * Replace memblock_find_in_range() + memblock_reserve() with memblock_phys_alloc_range() * Stop checking for low memory size in reserve_crashkernel_low(). The allocation from limited range will anyway fail if there is no enough memory, so there is no need for extra traversal of memblock.memory Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Axtens <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Emil Renner Berthing <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13x86/setup: simplify initrd relocation and reservationMike Rapoport1-13/+3
Currently, initrd image is reserved very early during setup and then it might be relocated and re-reserved after the initial physical memory mapping is created. The "late" reservation of memblock verifies that mapped memory size exceeds the size of initrd, then checks whether the relocation required and, if yes, relocates inirtd to a new memory allocated from memblock and frees the old location. The check for memory size is excessive as memblock allocation will anyway fail if there is not enough memory. Besides, there is no point to allocate memory from memblock using memblock_find_in_range() + memblock_reserve() when there exists memblock_phys_alloc_range() with required functionality. Remove the redundant check and simplify memblock allocation. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Axtens <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Emil Renner Berthing <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-12Merge tag 'core-static_call-2020-10-12' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull static call support from Ingo Molnar: "This introduces static_call(), which is the idea of static_branch() applied to indirect function calls. Remove a data load (indirection) by modifying the text. They give the flexibility of function pointers, but with better performance. (This is especially important for cases where retpolines would otherwise be used, as retpolines can be pretty slow.) API overview: DECLARE_STATIC_CALL(name, func); DEFINE_STATIC_CALL(name, func); DEFINE_STATIC_CALL_NULL(name, typename); static_call(name)(args...); static_call_cond(name)(args...); static_call_update(name, func); x86 is supported via text patching, otherwise basic indirect calls are used, with function pointers. There's a second variant using inline code patching, inspired by jump-labels, implemented on x86 as well. The new APIs are utilized in the x86 perf code, a heavy user of function pointers, where static calls speed up the PMU handler by 4.2% (!). The generic implementation is not really excercised on other architectures, outside of the trivial test_static_call_init() self-test" * tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) static_call: Fix return type of static_call_init tracepoint: Fix out of sync data passing by static caller tracepoint: Fix overly long tracepoint names x86/perf, static_call: Optimize x86_pmu methods tracepoint: Optimize using static_call() static_call: Allow early init static_call: Add some validation static_call: Handle tail-calls static_call: Add static_call_cond() x86/alternatives: Teach text_poke_bp() to emulate RET static_call: Add simple self-test for static calls x86/static_call: Add inline static call implementation for x86-64 x86/static_call: Add out-of-line static call implementation static_call: Avoid kprobes on inline static_call()s static_call: Add inline static call infrastructure static_call: Add basic static call infrastructure compiler.h: Make __ADDRESSABLE() symbol truly unique jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved() module: Properly propagate MODULE_STATE_COMING failure module: Fix up module_notifier return values ...
2020-10-06dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>Christoph Hellwig1-1/+1
Merge dma-contiguous.h into dma-map-ops.h, after removing the comment describing the contiguous allocator into kernel/dma/contigous.c. Signed-off-by: Christoph Hellwig <[email protected]>
2020-10-06dma-mapping: split <linux/dma-mapping.h>Christoph Hellwig1-0/+2
Split out all the bits that are purely for dma_map_ops implementations and related code into a new <linux/dma-map-ops.h> header so that they don't get pulled into all the drivers. That also means the architecture specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h> any more, which leads to a missing includes that were pulled in by the x86 or arm versions in a few not overly portable drivers. Signed-off-by: Christoph Hellwig <[email protected]>