path: root/arch/x86/kernel
Age | Commit message | Author | Files | Lines
2020-11-08 | Merge tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 2 | -23/+51
Pull x86 fixes from Thomas Gleixner:
 "A set of x86 fixes:
   - Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a combination of .weak and SYM_FUNC_START_LOCAL, which upsets LLVM's integrated assembler
   - Correct the mitigation selection logic, which prevented the related prctl from working correctly
   - Make the UV5 hubless system work correctly by fixing up the malformed table entries and adding the missing ones"
* tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/platform/uv: Recognize UV5 hubless system identifier
  x86/platform/uv: Remove spaces from OEM IDs
  x86/platform/uv: Fix missing OEM_TABLE_ID
  x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP
  x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S
2020-11-07 | x86/platform/uv: Recognize UV5 hubless system identifier | Mike Travis | 1 | -3/+10
Testing shows a problem in that UV5 hubless systems were not being recognized. Add them to the list of OEM IDs checked. Fixes: 6c7794423a998 ("Add UV5 direct references") Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-11-07 | x86/platform/uv: Remove spaces from OEM IDs | Mike Travis | 1 | -0/+3
Testing shows that trailing spaces caused problems with the OEM_ID and the OEM_TABLE_ID: the OEM_ID would not string-compare correctly, and the OEM_ID and OEM_TABLE_ID would be concatenated in the printout. Remove any trailing spaces.
Fixes: 1e61f5a95f191 ("Add and decode Arch Type in UVsystab")
Signed-off-by: Mike Travis <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
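The trailing-space problem can be illustrated with a minimal userspace sketch. It is not the kernel's code: the helper name copy_trimmed() is hypothetical, and the field widths assume the standard ACPI table header layout (6-byte OEM ID, 8-byte OEM table ID, space-padded and not NUL-terminated).

```c
#include <stddef.h>
#include <string.h>

#define OEM_ID_LEN        6  /* ACPI header: OEM ID field width */
#define OEM_TABLE_ID_LEN  8  /* ACPI header: OEM table ID field width */

/*
 * Hypothetical helper: copy a fixed-width, space-padded field into a
 * NUL-terminated buffer and strip trailing spaces.
 * dst must have room for len + 1 bytes. Returns the trimmed length.
 */
size_t copy_trimmed(char *dst, const char *src, size_t len)
{
    size_t n;

    memcpy(dst, src, len);
    dst[len] = '\0';

    /* The field may end early with a NUL, so measure what is really there. */
    n = strlen(dst);
    while (n > 0 && dst[n - 1] == ' ')
        dst[--n] = '\0';

    return n;
}
```

Without the trimming loop, comparing the copied field against a literal such as "HPE" with strcmp() would fail because of the padding, which is the first symptom the message describes.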
2020-11-07 | x86/platform/uv: Fix missing OEM_TABLE_ID | Mike Travis | 1 | -2/+5
Testing shows a problem in that the OEM_TABLE_ID was missing for hubless systems. This is used to determine the APIC type (legacy or extended). Add the OEM_TABLE_ID to the early hubless processing. Fixes: 1e61f5a95f191 ("Add and decode Arch Type in UVsystab") Signed-off-by: Mike Travis <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-11-05 | x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP | Anand K Mistry | 1 | -18/+33
On AMD CPUs which have the feature X86_FEATURE_AMD_STIBP_ALWAYS_ON, STIBP is set to on and spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED. At the same time, IBPB can be set to conditional.
However, this leads to the case where it's impossible to turn on IBPB for a process because in the PR_SPEC_DISABLE case in ib_prctl_set() the spectre_v2_user_stibp == SPECTRE_V2_USER_STRICT_PREFERRED condition leads to a return before the task flag is set. Similarly, ib_prctl_get() will return PR_SPEC_DISABLE even though IBPB is set to conditional.
More generally, the following cases are possible:
 1. STIBP = conditional && IBPB = on for spectre_v2_user=seccomp,ibpb
 2. STIBP = on && IBPB = conditional for AMD CPUs with X86_FEATURE_AMD_STIBP_ALWAYS_ON
The first case functions correctly today, but only because spectre_v2_user_ibpb isn't updated to reflect the IBPB mode.
At a high level, this change does one thing. If either STIBP or IBPB is set to conditional, allow the prctl to change the task flag. Also, reflect that capability when querying the state. This isn't perfect since it doesn't take into account if only STIBP or IBPB is unconditionally on. But it allows the conditional feature to work as expected, without affecting the unconditional one.
[ bp: Massage commit message and comment; space out statements for better readability. ]
Fixes: 21998a351512 ("x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.")
Signed-off-by: Anand K Mistry <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Acked-by: Tom Lendacky <[email protected]>
Link: https://lkml.kernel.org/r/20201105163246.v2.1.Ifd7243cd3e2c2206a893ad0a5b9a4f19549e22c6@changeid
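The rule stated above ("if either STIBP or IBPB is set to conditional, allow the prctl to change the task flag") can be modelled in a few lines. This is only a sketch of the decision, not the code in ib_prctl_set(): the enum and function names are hypothetical stand-ins for the kernel's spectre_v2_user_* state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-feature mitigation modes. */
enum mitigation_mode {
    MODE_OFF,
    MODE_CONDITIONAL,  /* controlled per task via prctl/seccomp */
    MODE_ALWAYS_ON,    /* unconditionally enabled */
};

/*
 * The prctl may toggle the per-task flag when at least one of the two
 * mitigations is conditional, even if the other is unconditionally on.
 */
bool prctl_may_change_task_flag(enum mitigation_mode stibp,
                                enum mitigation_mode ibpb)
{
    return stibp == MODE_CONDITIONAL || ibpb == MODE_CONDITIONAL;
}
```

Both cases listed in the message pass this check: case 1 (STIBP conditional, IBPB on) and case 2 (STIBP on, IBPB conditional). Only when neither feature is conditional does the per-task flag have nothing to control.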
2020-11-03 | Merge tag 'x86_seves_for_v5.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 4 | -7/+144
Pull x86 SEV-ES fixes from Borislav Petkov:
 "A couple of changes to the SEV-ES code to perform more stringent hypervisor checks before enabling encryption (Joerg Roedel)"
* tag 'x86_seves_for_v5.10_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sev-es: Do not support MMIO to/from encrypted memory
  x86/head/64: Check SEV encryption before switching to kernel page-table
  x86/boot/compressed/64: Check SEV encryption in 64-bit boot-path
  x86/boot/compressed/64: Sanity-check CPUID results in the early #VC handler
  x86/boot/compressed/64: Introduce sev_status
2020-10-29 | x86/sev-es: Do not support MMIO to/from encrypted memory | Joerg Roedel | 1 | -7/+13
MMIO memory is usually not mapped encrypted, so there is no reason to support emulated MMIO when it is mapped encrypted. Prevent a possible hypervisor attack where a RAM page is mapped as an MMIO page in the nested page-table, so that any guest access to it will trigger a #VC exception and leak the data on that page to the hypervisor via the GHCB (like with valid MMIO). On the read side this attack would allow the HV to inject data into the guest. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-29 | x86/head/64: Check SEV encryption before switching to kernel page-table | Joerg Roedel | 1 | -0/+16
When SEV is enabled, the kernel requests the C-bit position again from the hypervisor to build its own page-table. Since the hypervisor is an untrusted source, the C-bit position needs to be verified before the kernel page-table is used. Call sev_verify_cbit() before writing the CR3. [ bp: Massage. ] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-29 | x86/boot/compressed/64: Check SEV encryption in 64-bit boot-path | Joerg Roedel | 1 | -0/+89
Check whether the hypervisor reported the correct C-bit when running as an SEV guest. Using a wrong C-bit position could be used to leak sensitive data from the guest to the hypervisor. The check function is in a separate file: arch/x86/kernel/sev_verify_cbit.S so that it can be re-used in the running kernel image. [ bp: Massage. ] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-29 | x86/boot/compressed/64: Sanity-check CPUID results in the early #VC handler | Joerg Roedel | 1 | -0/+26
The early #VC handler which doesn't have a GHCB can only handle CPUID exit codes. It is needed by the early boot code to handle #VC exceptions raised in verify_cpu() and to get the position of the C-bit. But the CPUID information comes from the hypervisor which is untrusted and might return results which trick the guest into the no-SEV boot path with no C-bit set in the page-tables. All data written to memory would then be unencrypted and could leak sensitive data to the hypervisor. Add sanity checks to the early #VC handler to make sure the hypervisor can not pretend that SEV is disabled. [ bp: Massage a bit. ] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tom Lendacky <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
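The kind of sanity check described above can be sketched as a pure function. This is an illustration, not the handler's actual code: the function and macro names are hypothetical, and the checks assume the AMD CPUID layout in which leaf 0x80000000 returns the highest extended leaf in EAX and leaf 0x8000001f reports SEV in EAX bit 1. The reasoning is that a guest which received a #VC exception is already running under SEV-ES, so a CPUID answer claiming SEV is absent cannot be honest.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_MAX_EXT_LEAF  0x80000000u  /* EAX: highest extended leaf */
#define CPUID_AMD_SEV_LEAF  0x8000001fu  /* AMD memory-encryption leaf */
#define SEV_FEATURE_BIT     (1u << 1)    /* EAX bit 1 of leaf 0x8000001f */

/*
 * Hypothetical check: return true when the hypervisor-supplied EAX value
 * for CPUID leaf `fn` is plausible for a guest that must be an SEV guest.
 */
bool sev_cpuid_result_plausible(uint32_t fn, uint32_t eax)
{
    /* The SEV leaf must not be hidden behind a too-small maximum leaf. */
    if (fn == CPUID_MAX_EXT_LEAF && eax < CPUID_AMD_SEV_LEAF)
        return false;

    /* The SEV leaf itself must report SEV as available. */
    if (fn == CPUID_AMD_SEV_LEAF && !(eax & SEV_FEATURE_BIT))
        return false;

    return true;
}
```

A caller in this model would refuse to continue booting when the function returns false, rather than falling through to the unencrypted path.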
2020-10-27 | x86/debug: Fix DR_STEP vs ptrace_get_debugreg(6) | Peter Zijlstra | 1 | -3/+6
Commit d53d9bc0cf78 ("x86/debug: Change thread.debugreg6 to thread.virtual_dr6") changed the semantics of the variable from random collection of bits, to exactly only those bits that ptrace() needs. Unfortunately this lost DR_STEP for PTRACE_{BLOCK,SINGLE}STEP. Furthermore, it turns out that userspace expects DR_STEP to be unconditionally available, even for manual TF usage outside of PTRACE_{BLOCK,SINGLE}_STEP. Fixes: d53d9bc0cf78 ("x86/debug: Change thread.debugreg6 to thread.virtual_dr6") Reported-by: Kyle Huey <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kyle Huey <[email protected]> Link: https://lore.kernel.org/r/[email protected]
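For readers unfamiliar with the bit in question, here is a small sketch of how a debugger might decode a DR6 value it obtained through ptrace. The helper names are hypothetical; the bit values assume the x86 DR6 layout (bits 0-3 for breakpoint conditions, bit 14 for the single-step trap), which is what the kernel's DR_TRAP_BITS and DR_STEP constants encode.

```c
#include <stdbool.h>

#define DR_TRAP_BITS  0xfUL     /* DR6 bits 0-3: breakpoints 0-3 triggered */
#define DR_STEP       0x4000UL  /* DR6 bit 14: single-step trap */

/* Hypothetical helper: did this #DB come from single-stepping? */
bool dr6_reports_single_step(unsigned long dr6)
{
    return (dr6 & DR_STEP) != 0;
}

/* Hypothetical helper: mask of hardware breakpoints that fired. */
unsigned long dr6_triggered_breakpoints(unsigned long dr6)
{
    return dr6 & DR_TRAP_BITS;
}
```

A tracer relying on dr6_reports_single_step() is exactly the kind of userspace that breaks when DR_STEP is dropped from the value ptrace returns.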
2020-10-27 | x86/debug: Only clear/set ->virtual_dr6 for userspace #DB | Peter Zijlstra | 1 | -6/+6
The ->virtual_dr6 is the value used by ptrace_{get,set}_debugreg(6). A kernel #DB clearing it could mean spurious malfunction of ptrace() expectations. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kyle Huey <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-27 | x86/debug: Fix BTF handling | Peter Zijlstra | 1 | -7/+21
The SDM states that #DB clears DEBUGCTLMSR_BTF, this means that when the bit is set for userspace (TIF_BLOCKSTEP) and a kernel #DB happens first, the BTF bit meant for userspace execution is lost. Have the kernel #DB handler restore the BTF bit when it was requested for userspace. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Kyle Huey <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-27 | Merge tag 'x86-urgent-2020-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 3 | -10/+11
Pull x86 fixes from Thomas Gleixner:
 "A couple of x86 fixes which missed rc1 due to my stupidity:
   - Drop lazy TLB mode before switching to the temporary address space for text patching. text_poke() switches to the temporary mm which clears the lazy mode and restores the original mm afterwards. Due to clearing lazy mode this might restore an already dead mm if exit_mmap() runs in parallel on another CPU.
   - Document the x32 syscall design fail vs. syscall numbers 512-547 properly.
   - Fix the ORC unwinder to handle the inactive task frame correctly. This was unearthed due to the slightly different code generation of gcc-10.
   - Use an up to date screen_info for the boot params of kexec instead of the possibly stale and invalid version which happened to be valid when the kexec kernel was loaded"
* tag 'x86-urgent-2020-10-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternative: Don't call text_poke() in lazy TLB mode
  x86/syscalls: Document the fact that syscalls 512-547 are a legacy mistake
  x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 compiled kernels
  hyperv_fb: Update screen_info after removing old framebuffer
  x86/kexec: Use up-to-dated screen_info copy to fill boot params
2020-10-25 | treewide: Convert macro and uses of __section(foo) to __section("foo") | Joe Perches | 2 | -2/+2
Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences. Remove the quote operator # from compiler_attributes.h __section macro. Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms. Conversion done using the script at: https://lore.kernel.org/lkml/[email protected]/2-convert_section.pl Signed-off-by: Joe Perches <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Reviewed-by: Miguel Ojeda <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
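Why the macro and all of its callers had to change together can be seen from how the preprocessor's # operator treats an argument that is already quoted. The sketch below uses hypothetical macro names (OLD_STYLE_NAME, NEW_STYLE_NAME, demo_section) rather than the kernel's __section, and assumes a GCC- or Clang-compatible compiler for the attribute.

```c
#include <string.h>

/* Hypothetical helpers: expand to the string the attribute would receive. */
#define OLD_STYLE_NAME(S)        #S       /* old form: macro stringifies */
#define NEW_STYLE_NAME(section)  section  /* new form: caller passes a string */

/* Hypothetical local equivalent of the quoted __section("foo") form. */
#define demo_section(section) __attribute__((__section__(section)))

#ifdef __ELF__
/* On ELF targets this places the variable in a section named ".data.demo". */
static int demo_counter demo_section(".data.demo") __attribute__((used)) = 42;
#endif

/* Returns 1: old-style foo and new-style "foo" name the same section. */
int styles_agree(void)
{
    return strcmp(OLD_STYLE_NAME(foo), NEW_STYLE_NAME("foo")) == 0;
}

/* Returns 1: a quoted argument given to the old macro keeps its quotes. */
int old_style_keeps_quotes(void)
{
    return strcmp(OLD_STYLE_NAME("foo"), "\"foo\"") == 0;
}
```

The second function shows the hazard: mixing a quoted call site with the stringifying macro would yield a section literally named "foo" with the quote characters included, so the conversion could not be done piecemeal.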
2020-10-24 | Merge tag 'x86_seves_fixes_for_v5.10_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -0/+2
Pull x86 SEV-ES fixes from Borislav Petkov:
 "Three fixes to SEV-ES to correct setting up the new early pagetable on 5-level paging machines, to always map boot_params and the kernel cmdline, and disable stack protector for ../compressed/head{32,64}.c. (Arvind Sankar)"
* tag 'x86_seves_fixes_for_v5.10_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot/64: Explicitly map boot_params and command line
  x86/head/64: Disable stack protection for head$(BITS).o
  x86/boot/64: Initialize 5-level paging variables earlier
2020-10-23 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds | 1 | -1/+1
Pull KVM updates from Paolo Bonzini:
 "For x86, there is a new alternative and (in the future) more scalable implementation of extended page tables that does not need a reverse map from guest physical addresses to host physical addresses.
  For now it is disabled by default because it is still lacking a few of the existing MMU's bells and whistles. However it is a very solid piece of work and it is already available for people to hammer on it.
  Other updates:
  ARM:
   - New page table code for both hypervisor and guest stage-2
   - Introduction of a new EL2-private host context
   - Allow EL2 to have its own private per-CPU variables
   - Support of PMU event filtering
   - Complete rework of the Spectre mitigation
  PPC:
   - Fix for running nested guests with in-kernel IRQ chip
   - Fix race condition causing occasional host hard lockup
   - Minor cleanups and bugfixes
  x86:
   - allow trapping unknown MSRs to userspace
   - allow userspace to force #GP on specific MSRs
   - INVPCID support on AMD
   - nested AMD cleanup, on demand allocation of nested SVM state
   - hide PV MSRs and hypercalls for features not enabled in CPUID
   - new test for MSR_IA32_TSC writes from host and guest
   - cleanups: MMU, CPUID, shared MSRs
   - LAPIC latency optimizations and bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
  kvm: x86/mmu: NX largepage recovery for TDP MMU
  kvm: x86/mmu: Don't clear write flooding count for direct roots
  kvm: x86/mmu: Support MMIO in the TDP MMU
  kvm: x86/mmu: Support write protection for nesting in tdp MMU
  kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
  kvm: x86/mmu: Support dirty logging for the TDP MMU
  kvm: x86/mmu: Support changed pte notifier in tdp MMU
  kvm: x86/mmu: Add access tracking for tdp_mmu
  kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
  kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
  kvm: x86/mmu: Add TDP MMU PF handler
  kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
  kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
  KVM: Cache as_id in kvm_memory_slot
  kvm: x86/mmu: Add functions to handle changed TDP SPTEs
  kvm: x86/mmu: Allocate and free TDP MMU roots
  kvm: x86/mmu: Init / Uninit the TDP MMU
  kvm: x86/mmu: Introduce tdp_iter
  KVM: mmu: extract spte.h and spte.c
  KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
  ...
2020-10-23 | Merge tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block | Linus Torvalds | 2 | -2/+2
Pull arch task_work cleanups from Jens Axboe:
 "Two cleanups that don't fit other categories:
   - Finally get the task_work_add() cleanup done properly, so we don't have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates all callers, and also fixes up the documentation for task_work_add().
   - While working on some TIF related changes for 5.11, this TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch duplication for how that is handled"
* tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block:
  task_work: cleanup notification modes
  tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
2020-10-22 | Merge branch 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 1 | -3/+0
Pull initial set_fs() removal from Al Viro:
 "Christoph's set_fs base series + fixups"
* 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: Allow a NULL pos pointer to __kernel_read
  fs: Allow a NULL pos pointer to __kernel_write
  powerpc: remove address space overrides using set_fs()
  powerpc: use non-set_fs based maccess routines
  x86: remove address space overrides using set_fs()
  x86: make TASK_SIZE_MAX usable from assembly code
  x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h
  lkdtm: remove set_fs-based tests
  test_bitmap: remove user bitmap tests
  uaccess: add infrastructure for kernel builds with set_fs()
  fs: don't allow splice read/write without explicit ops
  fs: don't allow kernel reads and writes without iter ops
  sysctl: Convert to iter interfaces
  proc: add a read_iter method to proc proc_ops
  proc: cleanup the compat vs no compat file ops
  proc: remove a level of indentation in proc_get_inode
2020-10-22 | x86/alternative: Don't call text_poke() in lazy TLB mode | Juergen Gross | 1 | -0/+9
When running in lazy TLB mode the currently active page tables might be the ones of a previous process, e.g. when running a kernel thread. This can be problematic in case kernel code is being modified via text_poke() in a kernel thread, and on another processor exit_mmap() is active for the process which was running on the first cpu before the kernel thread. As text_poke() is using a temporary address space and the former address space (obtained via cpu_tlbstate.loaded_mm) is restored afterwards, there is a race possible in case the cpu on which exit_mmap() is running wants to make sure there are no stale references to that address space on any cpu active (this e.g. is required when running as a Xen PV guest, where this problem has been observed and analyzed). In order to avoid that, drop off TLB lazy mode before switching to the temporary address space. Fixes: cefa929c034eb5d ("x86/mm: Introduce temporary mm structs") Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-19 | x86/head/64: Disable stack protection for head$(BITS).o | Arvind Sankar | 1 | -0/+2
On 64-bit, the startup_64_setup_env() function added in 866b556efa12 ("x86/head/64: Install startup GDT") has stack protection enabled because of set_bringup_idt_handler(). This happens when CONFIG_STACKPROTECTOR_STRONG is enabled. It also currently needs CONFIG_AMD_MEM_ENCRYPT enabled because then set_bringup_idt_handler() is not an empty stub but that might change in the future, when the other vendor adds their similar technology. At this point, %gs is not yet initialized, and this doesn't cause a crash only because the #PF handler from the decompressor stub is still installed and handles the page fault. Disable stack protection for the whole file, and do it on 32-bit as well to avoid surprises. [ bp: Extend commit message with the exact explanation how it happens. ] Signed-off-by: Arvind Sankar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Joerg Roedel <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-10-17 | task_work: cleanup notification modes | Jens Axboe | 2 | -2/+2
A previous commit changed the notification mode from true/false to an int, allowing notify-no, notify-yes, or signal-notify. This was backwards compatible in the sense that any existing true/false user would translate to either 0 (no notification sent) or 1, the latter of which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2.
Clean this up properly, and define a proper enum for the notification mode. Now we have:
 - TWA_NONE. This is 0, same as before the original change, meaning no notification requested.
 - TWA_RESUME. This is 1, same as before the original change, meaning that we use TIF_NOTIFY_RESUME.
 - TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the notification.
Clean up all the callers, switching their 0/1/false/true to using the appropriate TWA_* mode for notifications.
Fixes: e91b48162332 ("task_work: teach task_work_add() to do signal_wake_up()")
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
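The three modes and their backwards-compatible values can be written down as a small sketch. The TWA_* names and values come from the message above; the enum tag and the helper twa_from_legacy_bool() are assumptions made for illustration and are not taken from the kernel source.

```c
#include <stdbool.h>

/* Notification modes as described in the commit message. */
enum task_work_notify_mode {
    TWA_NONE,    /* 0: no notification requested */
    TWA_RESUME,  /* 1: notify via TIF_NOTIFY_RESUME */
    TWA_SIGNAL,  /* 2: notify via TIF_SIGPENDING/JOBCTL_TASK_WORK */
};

/*
 * Hypothetical helper: how a legacy true/false argument maps onto the
 * enum, which is why old callers kept working before the cleanup.
 */
enum task_work_notify_mode twa_from_legacy_bool(bool notify)
{
    return notify ? TWA_RESUME : TWA_NONE;
}
```

After the cleanup, a caller spells out TWA_NONE or TWA_RESUME directly instead of passing false or true, so the intent is visible at the call site.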
2020-10-15 | Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux | Linus Torvalds | 1 | -1/+6
Pull another Hyper-V update from Wei Liu:
 "One patch from Michael to get VMbus interrupt from ACPI DSDT"
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Add parsing of VMbus interrupt in ACPI DSDT
2020-10-15 | Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping | Linus Torvalds | 3 | -6/+10
Pull dma-mapping updates from Christoph Hellwig:
 - rework the non-coherent DMA allocator
 - move private definitions out of <linux/dma-mapping.h>
 - lower CMA_ALIGNMENT (Paul Cercueil)
 - remove the omap1 dma address translation in favor of the common code
 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
 - support per-node DMA CMA areas (Barry Song)
 - increase the default seg boundary limit (Nicolin Chen)
 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
 - various cleanups
* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
  dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove <asm/dma-contiguous.h>
  dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split <linux/dma-mapping.h>
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement ->alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
2020-10-14 | Merge tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux | Linus Torvalds | 1 | -1/+1
Pull kernel_clone() updates from Christian Brauner:
 "During the v5.9 merge window we reworked the process creation codepaths across multiple architectures. After this work we were only left with the _do_fork() helper based on the struct kernel_clone_args calling convention. As was pointed out _do_fork() isn't valid kernelese especially for a helper that isn't just static.
  This series removes the _do_fork() helper and introduces the new kernel_clone() helper. The process creation cleanup didn't change the name to something more reasonable mainly because _do_fork() was used in quite a few places. So sending this as a separate series seemed the better strategy.
  I originally intended to send this early in the v5.9 development cycle after the merge window had closed but given that this was touching quite a few places I decided to defer this until the v5.10 merge window"
* tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  sched: remove _do_fork()
  tracing: switch to kernel_clone()
  kgdbts: switch to kernel_clone()
  kprobes: switch to kernel_clone()
  x86: switch to kernel_clone()
  sparc: switch to kernel_clone()
  nios2: switch to kernel_clone()
  m68k: switch to kernel_clone()
  ia64: switch to kernel_clone()
  h8300: switch to kernel_clone()
  fork: introduce kernel_clone()
2020-10-14 | Drivers: hv: vmbus: Add parsing of VMbus interrupt in ACPI DSDT | Michael Kelley | 1 | -1/+6
On ARM64, Hyper-V now specifies the interrupt to be used by VMbus in the ACPI DSDT. This information is not used on x86 because the interrupt vector must be hardcoded. But update the generic VMbus driver to do the parsing and pass the information to the architecture specific code that sets up the Linux IRQ. Update consumers of the interrupt to get it from an architecture specific function. Signed-off-by: Michael Kelley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
2020-10-14 | Merge tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds | 1 | -0/+1
Pull ACPI updates from Rafael Wysocki:
 "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code.
  Specifics:
   - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron)
   - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo)
   - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada)
   - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki)
   - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows:
      + Add predefined names from the SMBus specification (Bob Moore)
      + Update acpi_help UUID list (Bob Moore)
      + Return exceptions for string-to-integer conversions in iASL (Bob Moore)
      + Add a new "ALL <NameSeg>" debugger command (Bob Moore)
      + Add support for 64 bit risc-v compilation (Colin Ian King)
      + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap)
   - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung)
   - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko)
   - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo)
   - Add missing config_item_put() to fix refcount leak (Hanjun Guo)
   - Drop leftover field from struct acpi_memory_device (Hanjun Guo)
   - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings)
   - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov)
   - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao)
   - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing)
   - Serialize tools/power/acpi Makefile (Thomas Renninger)"
* tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
  ACPICA: Update version to 20200925 Version 20200925
  ACPICA: Remove unnecessary semicolon
  ACPICA: Debugger: Add a new command: "ALL <NameSeg>"
  ACPICA: iASL: Return exceptions for string-to-integer conversions
  ACPICA: acpi_help: Update UUID list
  ACPICA: Add predefined names found in the SMBus sepcification
  ACPICA: Tree-wide: fix various typos and spelling mistakes
  ACPICA: Drop the repeated word "an" in a comment
  ACPICA: Add support for 64 bit risc-v compilation
  ACPI: button: fix handling lid state changes when input device closed
  tools/power/acpi: Serialize Makefile
  ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug()
  ACPI: memhotplug: Remove 'state' from struct acpi_memory_device
  ACPI / extlog: Check for RDMSR failure
  ACPI: Make acpi_evaluate_dsm() prototype consistent
  docs: mm: numaperf.rst Add brief description for access class 1.
  node: Add access1 class to represent CPU to memory characteristics
  ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures
  x86: Support Generic Initiator only proximity domains
  ...
2020-10-14 | Merge tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 17 | -150/+2373
Pull x86 SEV-ES support from Borislav Petkov:
 "SEV-ES enhances the current guest memory encryption support called SEV by also encrypting the guest register state, making the registers inaccessible to the hypervisor by en-/decrypting them on world switches. Thus, it adds additional protection to Linux guests against exfiltration, control flow and rollback attacks.
  With SEV-ES, the guest is in full control of what registers the hypervisor can access. This is provided by a guest-host exchange mechanism based on a new exception vector called VMM Communication Exception (#VC), a new instruction called VMGEXIT and a shared Guest-Host Communication Block which is a decrypted page shared between the guest and the hypervisor.
  Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest so in order for that exception mechanism to work, the early x86 init code needed to be made able to handle exceptions, which, in itself, brings a bunch of very nice cleanups and improvements to the early boot code like an early page fault handler, allowing for on-demand building of the identity mapping. With that, !KASLR configurations do not use the EFI page table anymore but switch to a kernel-controlled one.
  The main part of this series adds the support for that new exchange mechanism. The goal has been to keep this as separate as possible from the core x86 code by concentrating the machinery in two SEV-ES-specific files:
    arch/x86/kernel/sev-es-shared.c
    arch/x86/kernel/sev-es.c
  Other interaction with core x86 code has been kept to a minimum and behind static keys to minimize the performance impact on !SEV-ES setups.
  Work by Joerg Roedel and Thomas Lendacky and others"
* tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
  x86/sev-es: Use GHCB accessor for setting the MMIO scratch buffer
  x86/sev-es: Check required CPU features for SEV-ES
  x86/efi: Add GHCB mappings when SEV-ES is active
  x86/sev-es: Handle NMI State
  x86/sev-es: Support CPU offline/online
  x86/head/64: Don't call verify_cpu() on starting APs
  x86/smpboot: Load TSS and getcpu GDT entry before loading IDT
  x86/realmode: Setup AP jump table
  x86/realmode: Add SEV-ES specific trampoline entry point
  x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES
  x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES
  x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES
  x86/sev-es: Handle #DB Events
  x86/sev-es: Handle #AC Events
  x86/sev-es: Handle VMMCALL Events
  x86/sev-es: Handle MWAIT/MWAITX Events
  x86/sev-es: Handle MONITOR/MONITORX Events
  x86/sev-es: Handle INVD Events
  x86/sev-es: Handle RDPMC Events
  x86/sev-es: Handle RDTSC(P) Events
  ...
2020-10-14 | Merge tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 4 | -8/+9
Pull objtool updates from Ingo Molnar:
 "Most of the changes are cleanups and reorganization to make the objtool code more arch-agnostic. This is in preparation for non-x86 support.
  Other changes:
   - KASAN fixes
   - Handle unreachable trap after call to noreturn functions better
   - Ignore unreachable fake jumps
   - Misc smaller fixes & cleanups"
* tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf build: Allow nested externs to enable BUILD_BUG() usage
  objtool: Allow nested externs to enable BUILD_BUG()
  objtool: Permit __kasan_check_{read,write} under UACCESS
  objtool: Ignore unreachable trap after call to noreturn functions
  objtool: Handle calling non-function symbols in other sections
  objtool: Ignore unreachable fake jumps
  objtool: Remove useless tests before save_reg()
  objtool: Decode unwind hint register depending on architecture
  objtool: Make unwind hint definitions available to other architectures
  objtool: Only include valid definitions depending on source file type
  objtool: Rename frame.h -> objtool.h
  objtool: Refactor jump table code to support other architectures
  objtool: Make relocation in alternative handling arch dependent
  objtool: Abstract alternative special case handling
  objtool: Move macros describing structures to arch-dependent code
  objtool: Make sync-check consider the target architecture
  objtool: Group headers to check in a single list
  objtool: Define 'struct orc_entry' only when needed
  objtool: Skip ORC entry creation for non-text sections
  objtool: Move ORC logic out of check()
  ...
2020-10-14x86/unwind/orc: Fix inactive tasks with stack pointer in %sp on GCC 10 ↵Jiri Slaby1-8/+1
compiled kernels GCC 10 optimizes the scheduler code differently than its predecessors. When CONFIG_DEBUG_SECTION_MISMATCH=y, the Makefile forces GCC not to inline some functions (-fno-inline-functions-called-once). Before GCC 10, "no-inlined" __schedule() starts with the usual prologue: push %bp mov %sp, %bp So the ORC unwinder simply picks stack pointer from %bp and unwinds from __schedule() just perfectly: $ cat /proc/1/stack [<0>] ep_poll+0x3e9/0x450 [<0>] do_epoll_wait+0xaa/0xc0 [<0>] __x64_sys_epoll_wait+0x1a/0x20 [<0>] do_syscall_64+0x33/0x40 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 But now, with GCC 10, there is no %bp prologue in __schedule(): $ cat /proc/1/stack <nothing> The ORC entry of the point in __schedule() is: sp:sp+88 bp:last_sp-48 type:call end:0 In this case, nobody subtracts sizeof "struct inactive_task_frame" in __unwind_start(). The struct is put on the stack by __switch_to_asm() and only then __switch_to_asm() stores %sp to task->thread.sp. But we start unwinding from a point in __schedule() (stored in frame->ret_addr by 'call') and not in __switch_to_asm(). So for these example values in __unwind_start(): sp=ffff94b50001fdc8 bp=ffff8e1f41d29340 ip=__schedule+0x1f0 The stack is: ffff94b50001fdc8: ffff8e1f41578000 # struct inactive_task_frame ffff94b50001fdd0: 0000000000000000 ffff94b50001fdd8: ffff8e1f41d29340 ffff94b50001fde0: ffff8e1f41611d40 # ... ffff94b50001fde8: ffffffff93c41920 # bx ffff94b50001fdf0: ffff8e1f41d29340 # bp ffff94b50001fdf8: ffffffff9376cad0 # ret_addr (and end of the struct) 0xffffffff9376cad0 is __schedule+0x1f0 (after the call to __switch_to_asm). Now follow those 88 bytes from the ORC entry (sp+88). The entry is correct, __schedule() really pushes 48 bytes (8*7) + 32 bytes via subq to store some local values (like 4U below). 
So to unwind, look at the offset 88-sizeof(long) = 0x50 from here: ffff94b50001fe00: ffff8e1f41578618 ffff94b50001fe08: 00000cc000000255 ffff94b50001fe10: 0000000500000004 ffff94b50001fe18: 7793fab6956b2d00 # NOTE (see below) ffff94b50001fe20: ffff8e1f41578000 ffff94b50001fe28: ffff8e1f41578000 ffff94b50001fe30: ffff8e1f41578000 ffff94b50001fe38: ffff8e1f41578000 ffff94b50001fe40: ffff94b50001fed8 ffff94b50001fe48: ffff8e1f41577ff0 ffff94b50001fe50: ffffffff9376cf12 Here ^^^^^^^^^^^^^^^^ is the correct ret addr from __schedule(). It translates to schedule+0x42 (insn after a call to __schedule()). BUT, unwind_next_frame() tries to take the address starting from 0xffff94b50001fdc8. That is exactly from thread.sp+88-sizeof(long) = 0xffff94b50001fdc8+88-8 = 0xffff94b50001fe18, which is garbage marked as NOTE above. So this quits the unwinding as 7793fab6956b2d00 is obviously not a kernel address. There was a fix to skip 'struct inactive_task_frame' in unwind_get_return_address_ptr in the following commit: 187b96db5ca7 ("x86/unwind/orc: Fix unwind_get_return_address_ptr() for inactive tasks") But we need to skip the struct already in the unwinder proper. So subtract the size (increase the stack pointer) of the structure in __unwind_start() directly. This allows for removal of the code added by commit 187b96db5ca7 completely, as the address is now at '(unsigned long *)state->sp - 1', the same as in the generic case. [ mingo: Cleaned up the changelog a bit, for better readability. ] Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Bug: https://bugzilla.suse.com/show_bug.cgi?id=1176907 Signed-off-by: Jiri Slaby <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
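The address arithmetic in the changelog above can be re-checked with a small userspace sketch. It only reuses the example values quoted in the message; the helper names are hypothetical, and the frame size of 7 words is an assumption read off the stack dump (the slots from ...fdc8 through ...fdf8), not taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Example values quoted in the changelog. */
#define THREAD_SP     UINT64_C(0xffff94b50001fdc8) /* task->thread.sp */
#define ORC_SP_OFFSET UINT64_C(88)                 /* "sp:sp+88" in the ORC entry */
#define WORD_SIZE     UINT64_C(8)                  /* sizeof(long) on x86-64 */

/* Assumption: struct inactive_task_frame covers the 7 words shown in the dump. */
#define INACTIVE_FRAME_SIZE (7 * WORD_SIZE)

/* Slot read when unwinding starts from thread.sp unchanged (the reported bug). */
static uint64_t ret_addr_slot_before_fix(uint64_t thread_sp)
{
    return thread_sp + ORC_SP_OFFSET - WORD_SIZE;
}

/* Slot read when the inactive task frame is skipped first (the fix). */
static uint64_t ret_addr_slot_after_fix(uint64_t thread_sp)
{
    return thread_sp + INACTIVE_FRAME_SIZE + ORC_SP_OFFSET - WORD_SIZE;
}

/* Re-checks both addresses against the stack dump in the changelog. */
void check_changelog_example(void)
{
    /* Lands on the slot marked NOTE, which holds a non-kernel value. */
    assert(ret_addr_slot_before_fix(THREAD_SP) == UINT64_C(0xffff94b50001fe18));
    /* Lands on the slot holding the return address into schedule(). */
    assert(ret_addr_slot_after_fix(THREAD_SP) == UINT64_C(0xffff94b50001fe50));
}
```

With the frame skipped, the computed slot is the one the changelog marks as the correct return address from __schedule().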
2020-10-14x86/kexec: Use up-to-dated screen_info copy to fill boot paramsKairui Song1-2/+1
kexec_file_load() currently reuses the old boot_params.screen_info, but if drivers have changed the hardware state, boot_params.screen_info could contain invalid info. For example, the video type might no longer be VGA, or the frame buffer address might have changed. If the kexec kernel keeps using the old screen_info, the kexec'ed kernel may attempt to write to an invalid framebuffer memory region. There are two screen_info instances globally available, boot_params.screen_info and screen_info. The latter is a copy and is updated by drivers. So let kexec_file_load() use the updated copy. [ mingo: Tidied up the changelog. ] Signed-off-by: Kairui Song <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-13x86/setup: simplify reserve_crashkernel()Mike Rapoport1-26/+14
* Replace magic numbers with defines * Replace memblock_find_in_range() + memblock_reserve() with memblock_phys_alloc_range() * Stop checking for low memory size in reserve_crashkernel_low(). The allocation from a limited range will fail anyway if there is not enough memory, so there is no need for an extra traversal of memblock.memory Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Axtens <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Emil Renner Berthing <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13x86/setup: simplify initrd relocation and reservationMike Rapoport1-13/+3
Currently, the initrd image is reserved very early during setup and then it might be relocated and re-reserved after the initial physical memory mapping is created. The "late" reservation of memblock verifies that mapped memory size exceeds the size of initrd, then checks whether relocation is required and, if yes, relocates initrd to a new memory allocated from memblock and frees the old location. The check for memory size is excessive as memblock allocation will fail anyway if there is not enough memory. Besides, there is no point in allocating memory from memblock using memblock_find_in_range() + memblock_reserve() when memblock_phys_alloc_range() exists with the required functionality. Remove the redundant check and simplify memblock allocation. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Axtens <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Emil Renner Berthing <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13efi/fake_mem: arrange for a resource entry per efi_fake_mem instanceDan Williams1-1/+15
In preparation for attaching a platform device per iomem resource teach the efi_fake_mem code to create an e820 entry per instance. Similar to E820_TYPE_PRAM, bypass merging resource when the e820 map is sanitized. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. 
Wysocki <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643096068.4062302.11590041070221681669.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13Merge tag 'x86_asm_for_v5.10' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm updates from Borislav Petkov: "Two asm wrapper fixes: - Use XORL instead of XORQ to avoid a REX prefix and save some bytes in the .fixup section, by Uros Bizjak. - Replace __force_order dummy variable with a memory clobber to fix LLVM requiring a definition for former and to prevent memory accesses from still being cached/reordered, by Arvind Sankar" * tag 'x86_asm_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm: Replace __force_order with a memory clobber x86/uaccess: Use XORL %0,%0 in __get_user_asm()
2020-10-13Merge tag 'x86_urgent_for_v5.10-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix the #DE oops message string format which confused tools parsing crash information (Thomas Gleixner) - Remove an unused variable in the UV5 code which was triggering a build warning with clang (Mike Travis) * tag 'x86_urgent_for_v5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Remove unused variable in UV5 NMI handler x86/traps: Fix #DE Oops message regression
2020-10-13x86/traps: Fix #DE Oops message regressionThomas Gleixner1-1/+1
The conversion of #DE to the idtentry mechanism introduced a change in the Oops message which confuses tools which parse crash information in dmesg. Remove the underscore from 'divide_error' to restore previous behaviour. Fixes: 9d06c4027f21 ("x86/entry: Convert Divide Error to IDTENTRY") Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/CACT4Y+bTZFkuZd7+bPArowOv-7Die+WZpfOWnEO_Wgs3U59+oA@mail.gmail.com
2020-10-13Merge branch 'acpi-numa'Rafael J. Wysocki1-0/+1
* acpi-numa: docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ACPI: Support Generic Initiator only domains ACPI / NUMA: Add stub function for pxm_to_node() irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory ACPI: Remove side effect of partly creating a node in acpi_get_node() ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node() ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node() ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT ACPI: Add out of bounds and numa_off protections to pxm_to_node()
2020-10-12Merge tag 'x86-hyperv-2020-10-12' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 Hyper-V update from Ingo Molnar: "A single commit harmonizing the x86 and ARM64 Hyper-V constants namespace" * tag 'x86-hyperv-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/hyperv: Remove aliases with X64 in their name
2020-10-12Merge tag 'x86-paravirt-2020-10-12' of ↵Linus Torvalds5-46/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt cleanup from Ingo Molnar: "Clean up the paravirt code after the removal of 32-bit Xen PV support" * tag 'x86-paravirt-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/paravirt: Avoid needless paravirt step clearing page table entries x86/paravirt: Remove set_pte_at() pv-op x86/entry/32: Simplify CONFIG_XEN_PV build dependency x86/paravirt: Use CONFIG_PARAVIRT_XXL instead of CONFIG_PARAVIRT x86/paravirt: Clean up paravirt macros x86/paravirt: Remove 32-bit support from CONFIG_PARAVIRT_XXL
2020-10-12Merge tag 'perf-kprobes-2020-10-12' of ↵Linus Torvalds1-105/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf/kprobes updates from Ingo Molnar: "This prepares to unify the kretprobe trampoline handler and make kretprobe lockless (those patches are still work in progress)" * tag 'perf-kprobes-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: kprobes: Fix to check probe enabled before disarm_kprobe_ftrace() kprobes: Make local functions static kprobes: Free kretprobe_instance with RCU callback kprobes: Remove NMI context check sparc: kprobes: Use generic kretprobe trampoline handler sh: kprobes: Use generic kretprobe trampoline handler s390: kprobes: Use generic kretprobe trampoline handler powerpc: kprobes: Use generic kretprobe trampoline handler parisc: kprobes: Use generic kretprobe trampoline handler mips: kprobes: Use generic kretprobe trampoline handler ia64: kprobes: Use generic kretprobe trampoline handler csky: kprobes: Use generic kretprobe trampoline handler arc: kprobes: Use generic kretprobe trampoline handler arm64: kprobes: Use generic kretprobe trampoline handler arm: kprobes: Use generic kretprobe trampoline handler x86/kprobes: Use generic kretprobe trampoline handler kprobes: Add generic kretprobe trampoline handler
2020-10-12Merge tag 'core-static_call-2020-10-12' of ↵Linus Torvalds6-1/+110
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull static call support from Ingo Molnar: "This introduces static_call(), which is the idea of static_branch() applied to indirect function calls. Remove a data load (indirection) by modifying the text. They give the flexibility of function pointers, but with better performance. (This is especially important for cases where retpolines would otherwise be used, as retpolines can be pretty slow.) API overview: DECLARE_STATIC_CALL(name, func); DEFINE_STATIC_CALL(name, func); DEFINE_STATIC_CALL_NULL(name, typename); static_call(name)(args...); static_call_cond(name)(args...); static_call_update(name, func); x86 is supported via text patching, otherwise basic indirect calls are used, with function pointers. There's a second variant using inline code patching, inspired by jump-labels, implemented on x86 as well. The new APIs are utilized in the x86 perf code, a heavy user of function pointers, where static calls speed up the PMU handler by 4.2% (!). 
The generic implementation is not really exercised on other architectures, outside of the trivial test_static_call_init() self-test" * tag 'core-static_call-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits) static_call: Fix return type of static_call_init tracepoint: Fix out of sync data passing by static caller tracepoint: Fix overly long tracepoint names x86/perf, static_call: Optimize x86_pmu methods tracepoint: Optimize using static_call() static_call: Allow early init static_call: Add some validation static_call: Handle tail-calls static_call: Add static_call_cond() x86/alternatives: Teach text_poke_bp() to emulate RET static_call: Add simple self-test for static calls x86/static_call: Add inline static call implementation for x86-64 x86/static_call: Add out-of-line static call implementation static_call: Avoid kprobes on inline static_call()s static_call: Add inline static call infrastructure static_call: Add basic static call infrastructure compiler.h: Make __ADDRESSABLE() symbol truly unique jump_label,module: Fix module lifetime for __jump_label_mod_text_reserved() module: Properly propagate MODULE_STATE_COMING failure module: Fix up module_notifier return values ...
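The fallback mentioned in the pull message above ("otherwise basic indirect calls are used, with function pointers") can be pictured with a minimal userspace sketch. Every demo_* name is hypothetical and exists only for illustration. The sketch does not use the kernel macros and patches no text; it only shows the call-through-a-named-slot and retarget-the-slot pattern that the API overview describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a static call "key": one function pointer slot. */
typedef int (*demo_fn_t)(int);

struct demo_static_call {
    demo_fn_t func;
};

static int demo_add_one(int x) { return x + 1; }
static int demo_double(int x)  { return x * 2; }

/* Rough analogue of defining a call with an initial target. */
static struct demo_static_call demo_op = { demo_add_one };

/* Rough analogue of invoking the call: a plain indirect call through the slot. */
static int demo_call(int arg)
{
    assert(demo_op.func != NULL);
    return demo_op.func(arg);
}

/* Rough analogue of retargeting the call: here just a pointer store. */
static void demo_update(demo_fn_t func)
{
    demo_op.func = func;
}
```

Calling demo_call(20) goes to demo_add_one(); after demo_update(demo_double) the same call site reaches demo_double(). Per the pull message, the x86 implementation gets the same effect by modifying the text, which removes the data load that this sketch still performs.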
2020-10-12Merge tag 'core-build-2020-10-12' of ↵Linus Torvalds1-1/+38
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull orphan section checking from Ingo Molnar: "Orphan link sections were a long-standing source of obscure bugs, because the heuristics that various linkers & compilers use to handle them (include these bits into the output image vs discarding them silently) are both highly idiosyncratic and also version dependent. Instead of this historically problematic mess, this tree by Kees Cook (et al) adds build time asserts and build time warnings if there's any orphan section in the kernel or if a section is not sized as expected. And because we relied on so many silent assumptions in this area, fix a metric ton of dependencies and some outright bugs related to this, before we can finally enable the checks on the x86, ARM and ARM64 platforms" * tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) x86/boot/compressed: Warn on orphan section placement x86/build: Warn on orphan section placement arm/boot: Warn on orphan section placement arm/build: Warn on orphan section placement arm64/build: Warn on orphan section placement x86/boot/compressed: Add missing debugging sections to output x86/boot/compressed: Remove, discard, or assert for unwanted sections x86/boot/compressed: Reorganize zero-size section asserts x86/build: Add asserts for unwanted sections x86/build: Enforce an empty .got.plt section x86/asm: Avoid generating unused kprobe sections arm/boot: Handle all sections explicitly arm/build: Assert for unwanted sections arm/build: Add missing sections arm/build: Explicitly keep .ARM.attributes sections arm/build: Refactor linker script headers arm64/build: Assert for unwanted sections arm64/build: Add missing DWARF sections arm64/build: Use common DISCARDS in linker script arm64/build: Remove .eh_frame* sections due to unwind tables ...
2020-10-12Merge tag 'efi-core-2020-10-12' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI changes from Ingo Molnar: - Preliminary RISC-V enablement - the bulk of it will arrive via the RISCV tree. - Relax decompressed image placement rules for 32-bit ARM - Add support for passing MOK certificate table contents via a config table rather than a EFI variable. - Add support for 18 bit DIMM row IDs in the CPER records. - Work around broken Dell firmware that passes the entire Boot#### variable contents as the command line - Add definition of the EFI_MEMORY_CPU_CRYPTO memory attribute so we can identify it in the memory map listings. - Don't abort the boot on arm64 if the EFI RNG protocol is available but returns with an error - Replace slashes with exclamation marks in efivarfs file names - Split efi-pstore from the deprecated efivars sysfs code, so we can disable the latter on !x86. - Misc fixes, cleanups and updates. * tag 'efi-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) efi: mokvar: add missing include of asm/early_ioremap.h efi: efivars: limit availability to X86 builds efi: remove some false dependencies on CONFIG_EFI_VARS efi: gsmi: fix false dependency on CONFIG_EFI_VARS efi: efivars: un-export efivars_sysfs_init() efi: pstore: move workqueue handling out of efivars efi: pstore: disentangle from deprecated efivars module efi: mokvar-table: fix some issues in new code efi/arm64: libstub: Deal gracefully with EFI_RNG_PROTOCOL failure efivarfs: Replace invalid slashes with exclamation marks in dentries. 
efi: Delete deprecated parameter comments efi/libstub: Fix missing-prototypes in string.c efi: Add definition of EFI_MEMORY_CPU_CRYPTO and ability to report it cper,edac,efi: Memory Error Record: bank group/address and chip id edac,ghes,cper: Add Row Extension to Memory Error Record efi/x86: Add a quirk to support command line arguments on Dell EFI firmware efi/libstub: Add efi_warn and *_once logging helpers integrity: Load certs from the EFI MOK config table integrity: Move import of MokListRT certs to a separate routine efi: Support for MOK variable config table ...
2020-10-12Merge tag 'locking-core-2020-10-12' of ↵Linus Torvalds1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "These are the locking updates for v5.10: - Add deadlock detection for recursive read-locks. The rationale is outlined in commit 224ec489d3cd ("lockdep/ Documention: Recursive read lock detection reasoning") The main deadlock pattern we want to detect is: TASK A: TASK B: read_lock(X); write_lock(X); read_lock_2(X); - Add "latch sequence counters" (seqcount_latch_t): A sequence counter variant where the counter even/odd value is used to switch between two copies of protected data. This allows the read path, typically NMIs, to safely interrupt the write side critical section. We utilize this new variant for sched-clock, and to make x86 TSC handling safer. - Other seqlock cleanups, fixes and enhancements - KCSAN updates - LKMM updates - Misc updates, cleanups and fixes" * tag 'locking-core-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits) lockdep: Revert "lockdep: Use raw_cpu_*() for per-cpu variables" lockdep: Fix lockdep recursion lockdep: Fix usage_traceoverflow locking/atomics: Check atomic-arch-fallback.h too locking/seqlock: Tweak DEFINE_SEQLOCK() kernel doc lockdep: Optimize the memory usage of circular queue seqlock: Unbreak lockdep seqlock: PREEMPT_RT: Do not starve seqlock_t writers seqlock: seqcount_LOCKNAME_t: Introduce PREEMPT_RT support seqlock: seqcount_t: Implement all read APIs as statement expressions seqlock: Use unique prefix for seqcount_t property accessors seqlock: seqcount_LOCKNAME_t: Standardize naming convention seqlock: seqcount latch APIs: Only allow seqcount_latch_t rbtree_latch: Use seqcount_latch_t x86/tsc: Use seqcount_latch_t timekeeping: Use seqcount_latch_t time/sched_clock: Use seqcount_latch_t seqlock: Introduce seqcount_latch_t mm/swap: Do not abuse the seqcount_t latching API time/sched_clock: Use raw_read_seqcount_latch() during suspend ...
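The latch idea described in the pull message above (the even/odd counter value selects one of two copies of the protected data) can be sketched in userspace C11. This is an illustration under stated assumptions, not the kernel's seqcount_latch_t code: the demo_latch type and its helpers are hypothetical, a single int stands in for the protected data, and default sequentially consistent atomics replace the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical latch: a counter plus two copies of the protected value. */
struct demo_latch {
    atomic_uint seq;
    atomic_int  copy[2];
};

static void demo_latch_init(struct demo_latch *l, int val)
{
    atomic_init(&l->seq, 0);
    atomic_init(&l->copy[0], val);
    atomic_init(&l->copy[1], val);
}

/* Writer: update one copy at a time, steering readers to the other one. */
static void demo_latch_write(struct demo_latch *l, int val)
{
    atomic_fetch_add(&l->seq, 1);   /* seq is odd: readers pick copy[1] */
    atomic_store(&l->copy[0], val);
    atomic_fetch_add(&l->seq, 1);   /* seq is even: readers pick copy[0] */
    atomic_store(&l->copy[1], val);
}

/* Reader: pick the copy named by the low bit, retry if seq moved meanwhile. */
static int demo_latch_read(struct demo_latch *l)
{
    unsigned int seq;
    int val;

    do {
        seq = atomic_load(&l->seq);
        val = atomic_load(&l->copy[seq & 1]);
    } while (atomic_load(&l->seq) != seq);

    return val;
}
```

Because the reader never waits for the writer to finish, a reader that interrupts the writer between its two stores still finds one copy the writer is not currently modifying, which is the property the pull message highlights for NMI context.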
2020-10-12Merge tag 'x86-entry-2020-10-12' of ↵Linus Torvalds4-126/+89
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry code updates from Thomas Gleixner: "More consolidation and correctness fixes for the debug exception: - Ensure BTF synchronization under all circumstances - Disentangle kernel and user mode #DB further - Get ordering vs. the debug notifier correct to make KGDB work more reliably. - Cleanup historical gunk and make the code simpler to understand" * tag 'x86-entry-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/debug: Change thread.debugreg6 to thread.virtual_dr6 x86/debug: Support negative polarity DR6 bits x86/debug: Simplify hw_breakpoint_handler() x86/debug: Remove aout_dump_debugregs() x86/debug: Remove the historical junk x86/debug: Move cond_local_irq_enable() block into exc_debug_user() x86/debug: Move historical SYSENTER junk into exc_debug_kernel() x86/debug: Simplify #DB signal code x86/debug: Remove handle_debug(.user) argument x86/debug: Move kprobe_debug_handler() into exc_debug_kernel() x86/debug: Sync BTF earlier
2020-10-12Merge tag 'x86-irq-2020-10-12' of ↵Linus Torvalds9-174/+86
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 irq updates from Thomas Gleixner: "Surgery of the MSI interrupt handling to prepare the support of upcoming devices which require non-PCI based MSI handling: - Cleanup historical leftovers all over the place - Rework the code to utilize more core functionality - Wrap XEN PCI/MSI interrupts into an irqdomain to make irqdomain assignment to PCI devices possible. - Assign irqdomains to PCI devices at initialization time which allows to utilize the full functionality of hierarchical irqdomains. - Remove arch_.*_msi_irq() functions from X86 and utilize the irqdomain which is assigned to the device for interrupt management. - Make the arch_.*_msi_irq() support conditional on a config switch and let the last few users select it" * tag 'x86-irq-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) PCI: MSI: Fix Kconfig dependencies for PCI_MSI_ARCH_FALLBACKS x86/apic/msi: Unbreak DMAR and HPET MSI iommu/amd: Remove domain search for PCI/MSI iommu/vt-d: Remove domain search for PCI/MSI[X] x86/irq: Make most MSI ops XEN private x86/irq: Cleanup the arch_*_msi_irqs() leftovers PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable x86/pci: Set default irq domain in pcibios_add_device() iommm/amd: Store irq domain in struct device iommm/vt-d: Store irq domain in struct device x86/xen: Wrap XEN MSI management into irqdomain irqdomain/msi: Allow to override msi_domain_alloc/free_irqs() x86/xen: Consolidate XEN-MSI init x86/xen: Rework MSI teardown x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init() PCI/MSI: Provide pci_dev_has_special_msi_domain() helper PCI_vmd_Mark_VMD_irqdomain_with_DOMAIN_BUS_VMD_MSI irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI x86/irq: Initialize PCI/MSI domain at PCI init time x86/pci: Reducde #ifdeffery in PCI init code ...
2020-10-12Merge tag 'x86_core_for_v5.10' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: "A single fix making the error message when the opcode bytes at rIP cannot be accessed during an oops, more precise" * tag 'x86_core_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/dumpstack: Fix misleading instruction pointer error message
2020-10-12Merge tag 'x86_cache_for_v5.10' of ↵Linus Torvalds7-155/+145
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache resource control updates from Borislav Petkov: - Misc cleanups to the resctrl code in preparation for the ARM side (James Morse) - Add support for controlling per-thread memory bandwidth throttling delay values on hw which supports it (Fenghua Yu) * tag 'x86_cache_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Enable user to view thread or core throttling mode x86/resctrl: Enumerate per-thread MBA controls cacheinfo: Move resctrl's get_cache_id() to the cacheinfo header file x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps x86/resctrl: Merge AMD/Intel parse_bw() calls x86/resctrl: Add struct rdt_membw::arch_needs_linear to explain AMD/Intel MBA difference x86/resctrl: Use is_closid_match() in more places x86/resctrl: Include pid.h x86/resctrl: Use container_of() in delayed_work handlers x86/resctrl: Fix stale comment x86/resctrl: Remove struct rdt_membw::max_delay x86/resctrl: Remove unused struct mbm_state::chunks_bw
2020-10-12Merge tag 'x86_cleanups_for_v5.10' of ↵Linus Torvalds2-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: "Misc minor cleanups" * tag 'x86_cleanups_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Fix typo in comments for syscall_enter_from_user_mode() x86/resctrl: Fix spelling in user-visible warning messages x86/entry/64: Do not include inst.h in calling.h x86/mpparse: Remove duplicate io_apic.h include