blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2020-07-17	x86/platform/uv: Remove vestigial mention of UV1 platform from bios header	[email protected]	1	-1/+1
	Remove UV1 reference as UV1 is not longer supported by HPE. Signed-off-by: Steve Wahl <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-17	x86/platform/uv: Remove support for UV1 platform from uv	[email protected]	1	-1/+1
	UV1 is not longer supported by HPE Signed-off-by: Steve Wahl <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-17	x86/platform/uv: Remove support for uv1 platform from uv_hub	[email protected]	1	-31/+3
	UV1 is not longer supported by HPE. Signed-off-by: Steve Wahl <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-17	x86/platform/uv: Remove support for UV1 platform from uv_bau	[email protected]	1	-111/+7
	UV1 is not longer supported. Signed-off-by: Steve Wahl <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-17	x86/platform/uv: Remove support for UV1 platform from uv_mmrs	[email protected]	1	-712/+0
	UV1 is not longer supported. Signed-off-by: Steve Wahl <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-16	x86/entry: Fix vectors to IDTENTRY_SYSVEC for CONFIG_HYPERV	Sedat Dilek	1	-2/+2
	When assembling with Clang via `make LLVM_IAS=1` and CONFIG_HYPERV enabled, we observe the following error: <instantiation>:9:6: error: expected absolute expression .if HYPERVISOR_REENLIGHTENMENT_VECTOR == 3 ^ <instantiation>:1:1: note: while in macro instantiation idtentry HYPERVISOR_REENLIGHTENMENT_VECTOR asm_sysvec_hyperv_reenlightenment sysvec_hyperv_reenlightenment has_error_code=0 ^ ./arch/x86/include/asm/idtentry.h:627:1: note: while in macro instantiation idtentry_sysvec HYPERVISOR_REENLIGHTENMENT_VECTOR sysvec_hyperv_reenlightenment; ^ <instantiation>:9:6: error: expected absolute expression .if HYPERVISOR_STIMER0_VECTOR == 3 ^ <instantiation>:1:1: note: while in macro instantiation idtentry HYPERVISOR_STIMER0_VECTOR asm_sysvec_hyperv_stimer0 sysvec_hyperv_stimer0 has_error_code=0 ^ ./arch/x86/include/asm/idtentry.h:628:1: note: while in macro instantiation idtentry_sysvec HYPERVISOR_STIMER0_VECTOR sysvec_hyperv_stimer0; This is caused by typos in arch/x86/include/asm/idtentry.h: HYPERVISOR_REENLIGHTENMENT_VECTOR -> HYPERV_REENLIGHTENMENT_VECTOR HYPERVISOR_STIMER0_VECTOR -> HYPERV_STIMER0_VECTOR For more details see ClangBuiltLinux issue #1088. Fixes: a16be368dd3f ("x86/entry: Convert various hypervisor vectors to IDTENTRY_SYSVEC") Suggested-by: Nick Desaulniers <[email protected]> Signed-off-by: Sedat Dilek <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Reviewed-by: Wei Liu <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Link: https://github.com/ClangBuiltLinux/linux/issues/1088 Link: https://github.com/ClangBuiltLinux/linux/issues/1043 Link: https://lore.kernel.org/patchwork/patch/1272115/ Link: https://lkml.kernel.org/r/[email protected]
2020-07-16	x86/entry: Add compatibility with IAS	Jian Cai	1	-8/+6
	Clang's integrated assembler does not allow symbols with non-absolute values to be reassigned. Modify the interrupt entry loop macro to be compatible with IAS by using a label and an offset. Reported-by: Nick Desaulniers <[email protected]> Reported-by: Sedat Dilek <[email protected]> Suggested-by: Nick Desaulniers <[email protected]> Suggested-by: Brian Gerst <[email protected]> Suggested-by: Arvind Sankar <[email protected]> Signed-off-by: Jian Cai <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Sedat Dilek <[email protected]> # Link: https://github.com/ClangBuiltLinux/linux/issues/1043 Link: https://lkml.kernel.org/r/[email protected]
2020-07-16	crypto: x86 - Remove include/asm/inst.h	Uros Bizjak	1	-311/+0
	Current minimum required version of binutils is 2.23, which supports PSHUFB, PCLMULQDQ, PEXTRD, AESKEYGENASSIST, AESIMC, AESENC, AESENCLAST, AESDEC, AESDECLAST and MOVQ instruction mnemonics. Substitute macros from include/asm/inst.h with a proper instruction mnemonics in various assmbly files from x86/crypto directory, and remove now unneeded file. The patch was tested by calculating and comparing sha256sum hashes of stripped object files before and after the patch, to be sure that executable code didn't change. Signed-off-by: Uros Bizjak <[email protected]> CC: Herbert Xu <[email protected]> CC: "David S. Miller" <[email protected]> CC: Thomas Gleixner <[email protected]> CC: Ingo Molnar <[email protected]> CC: Borislav Petkov <[email protected]> CC: "H. Peter Anvin" <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2020-07-16	x86/idtentry: Remove stale comment	Thomas Gleixner	1	-4/+2
	Stack switching for interrupt handlers happens in C now for both 64 and 32bit. Remove the stale comment which claims the contrary. Signed-off-by: Thomas Gleixner <[email protected]>
2020-07-10	KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support	Mohammed Gamal	1	-1/+1
	This patch adds a new capability KVM_CAP_SMALLER_MAXPHYADDR which allows userspace to query if the underlying architecture would support GUEST_MAXPHYADDR < HOST_MAXPHYADDR and hence act accordingly (e.g. qemu can decide if it should warn for -cpu ..,phys-bits=X) The complications in this patch are due to unexpected (but documented) behaviour we see with NPF vmexit handling in AMD processor. If SVM is modified to add guest physical address checks in the NPF and guest #PF paths, we see the followning error multiple times in the 'access' test in kvm-unit-tests: test pte.p pte.36 pde.p: FAIL: pte 2000021 expected 2000001 Dump mapping: address: 0x123400000000 ------L4: 24c3027 ------L3: 24c4027 ------L2: 24c5021 ------L1: 1002000021 This is because the PTE's accessed bit is set by the CPU hardware before the NPF vmexit. This is handled completely by hardware and cannot be fixed in software. Therefore, availability of the new capability depends on a boolean variable allow_smaller_maxphyaddr which is set individually by VMX and SVM init routines. On VMX it's always set to true, on SVM it's only set to true when NPT is not enabled. CC: Tom Lendacky <[email protected]> CC: Babu Moger <[email protected]> Signed-off-by: Mohammed Gamal <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-10	KVM: x86: rename update_bp_intercept to update_exception_bitmap	Paolo Bonzini	1	-1/+1
	We would like to introduce a callback to update the #PF intercept when CPUID changes. Just reuse update_bp_intercept since VMX is already using update_exception_bitmap instead of a bespoke function. While at it, remove an unnecessary assignment in the SVM version, which is already done in the caller (kvm_arch_vcpu_ioctl_set_guest_debug) and has nothing to do with the exception bitmap. Reviewed-by: Jim Mattson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-10	KVM: x86: mmu: Move translate_gpa() to mmu.c	Mohammed Gamal	1	-6/+0
	Also no point of it being inline since it's always called through function pointers. So remove that. Signed-off-by: Mohammed Gamal <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-10	x86/entry: Fix NMI vs IRQ state tracking	Peter Zijlstra	1	-0/+3
	While the nmi_enter() users did trace_hardirqs_{off_prepare,on_finish}() there was no matching lockdep_hardirqs_*() calls to complete the picture. Introduce idtentry_{enter,exit}_nmi() to enable proper IRQ state tracking across the NMIs. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-10	Merge branch 'x86/urgent' into x86/entry to pick up upstream fixes.	Thomas Gleixner	1	-2/+2

2020-07-09	KVM: Move x86's version of struct kvm_mmu_memory_cache to common code	Sean Christopherson	2	-13/+7
	Move x86's 'struct kvm_mmu_memory_cache' to common code in anticipation of moving the entire x86 implementation code to common KVM and reusing it for arm64 and MIPS. Add a new architecture specific asm/kvm_types.h to control the existence and parameters of the struct. The new header is needed to avoid a chicken-and-egg problem with asm/kvm_host.h as all architectures define instances of the struct in their vCPU structs. Add an asm-generic version of kvm_types.h to avoid having empty files on PPC and s390 in the long term, and for arm64 and mips in the short term. Suggested-by: Christoffer Dall <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-09	KVM: x86/mmu: Make __GFP_ZERO a property of the memory cache	Sean Christopherson	1	-0/+1
	Add a gfp_zero flag to 'struct kvm_mmu_memory_cache' and use it to control __GFP_ZERO instead of hardcoding a call to kmem_cache_zalloc(). A future patch needs such a flag for the __get_free_page() path, as gfn arrays do not need/want the allocator to zero the memory. Convert the kmem_cache paths to __GFP_ZERO now so as to avoid a weird and inconsistent API in the future. No functional change intended. Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-09	KVM: x86/mmu: Separate the memory caches for shadow pages and gfn arrays	Sean Christopherson	1	-1/+2
	Use separate caches for allocating shadow pages versus gfn arrays. This sets the stage for specifying __GFP_ZERO when allocating shadow pages without incurring extra cost for gfn arrays. No functional change intended. Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-09	KVM: x86/mmu: Track the associated kmem_cache in the MMU caches	Sean Christopherson	1	-0/+1
	Track the kmem_cache used for non-page KVM MMU memory caches instead of passing in the associated kmem_cache when filling the cache. This will allow consolidating code and other cleanups. No functional change intended. Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-09	x86/kvm/vmx: Move guest enter/exit into .noinstr.text	Thomas Gleixner	2	-2/+10
	Move the functions which are inside the RCU off region into the non-instrumentable text section. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Alexandre Chartre <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-09	KVM: x86: Rename cpuid_update() callback to vcpu_after_set_cpuid()	Xiaoyao Li	1	-1/+1
	The name of callback cpuid_update() is misleading that it's not about updating CPUID settings of vcpu but updating the configurations of vcpu based on the CPUIDs. So rename it to vcpu_after_set_cpuid(). Signed-off-by: Xiaoyao Li <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid()	Krish Sadhukhan	1	-0/+1
	Instead of creating the mask for guest CR4 reserved bits in kvm_valid_cr4(), do it in kvm_update_cpuid() so that it can be reused instead of creating it each time kvm_valid_cr4() is called. Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Krish Sadhukhan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	x86/kvm: Add "nopvspin" parameter to disable PV spinlocks	Zhenzhong Duan	1	-0/+1
	There are cases where a guest tries to switch spinlocks to bare metal behavior (e.g. by setting "xen_nopvspin" on XEN platform and "hv_nopvspin" on HYPER_V). That feature is missed on KVM, add a new parameter "nopvspin" to disable PV spinlocks for KVM guest. The new 'nopvspin' parameter will also replace Xen and Hyper-V specific parameters in future patches. Define variable nopvsin as global because it will be used in future patches as above. Signed-off-by: Zhenzhong Duan <[email protected]> Reviewed-by: Vitaly Kuznetsov <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krcmar <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	KVM: x86/mmu: Make kvm_mmu_page definition and accessor internal-only	Sean Christopherson	1	-44/+2
	Make 'struct kvm_mmu_page' MMU-only, nothing outside of the MMU should be poking into the gory details of shadow pages. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	KVM: x86: Use VMCALL and VMMCALL mnemonics in kvm_para.h	Uros Bizjak	1	-1/+1
	Current minimum required version of binutils is 2.23, which supports VMCALL and VMMCALL instruction mnemonics. Replace the byte-wise specification of VMCALL and VMMCALL with these proper mnemonics. Signed-off-by: Uros Bizjak <[email protected]> CC: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	kvm: x86: Move last_cpu into kvm_vcpu_arch as last_vmentry_cpu	Jim Mattson	1	-0/+3
	Both the vcpu_vmx structure and the vcpu_svm structure have a 'last_cpu' field. Move the common field into the kvm_vcpu_arch structure. For clarity, rename it to 'last_vmentry_cpu.' Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Jim Mattson <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Peter Shier <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	KVM: x86/mmu: Make .write_log_dirty a nested operation	Sean Christopherson	1	-1/+1
	Move .write_log_dirty() into kvm_x86_nested_ops to help differentiate it from the non-nested dirty log hooks. And because it's a nested-only operation. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-07-08	Merge branch 'kvm-async-pf-int' into HEAD	Paolo Bonzini	2	-0/+5

2020-07-08	Merge branch 'sched/urgent'	Peter Zijlstra	11	-40/+61

2020-07-08	perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch	Kan Liang	3	-4/+23
	In the LBR call stack mode, LBR information is used to reconstruct a call stack. To get the complete call stack, perf has to save/restore all LBR registers during a context switch. Due to a large number of the LBR registers, this process causes a high CPU overhead. To reduce the CPU overhead during a context switch, use the XSAVES/XRSTORS instructions. Every XSAVE area must follow a canonical format: the legacy region, an XSAVE header and the extended region. Although the LBR information is only kept in the extended region, a space for the legacy region and XSAVE header is still required. Add a new dedicated structure for LBR XSAVES support. Before enabling XSAVES support, the size of the LBR state has to be sanity checked, because: - the size of the software structure is calculated from the max number of the LBR depth, which is enumerated by the CPUID leaf for Arch LBR. The size of the LBR state is enumerated by the CPUID leaf for XSAVE support of Arch LBR. If the values from the two CPUID leaves are not consistent, it may trigger a buffer overflow. For example, a hypervisor may unconsciously set inconsistent values for the two emulated CPUID. - unlike other state components, the size of an LBR state depends on the max number of LBRs, which may vary from generation to generation. Expose the function xfeature_size() for the sanity check. The LBR XSAVES support will be disabled if the size of the LBR state enumerated by CPUID doesn't match with the size of the software structure. The XSAVE instruction requires 64-byte alignment for state buffers. A new macro is added to reflect the alignment requirement. A 64-byte aligned kmem_cache is created for architecture LBR. Currently, the structure for each state component is maintained in fpu/types.h. The structure for the new LBR state component should be maintained in the same place. Move structure lbr_entry to fpu/types.h as well for broader sharing. Add dedicated lbr_save/lbr_restore functions for LBR XSAVES support, which invokes the corresponding xstate helpers to XSAVES/XRSTORS LBR information at the context switch when the call stack mode is enabled. Since the XSAVES/XRSTORS instructions will be eventually invoked, the dedicated functions is named with '_xsaves'/'_xrstors' postfix. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature	Kan Liang	1	-0/+3
	The perf subsystem will only need to save/restore the LBR state. However, the existing helpers save all supported supervisor states to a kernel buffer, which will be unnecessary. Two helpers are introduced to only save/restore requested dynamic supervisor states. The supervisor features in XFEATURE_MASK_SUPERVISOR_SUPPORTED and XFEATURE_MASK_SUPERVISOR_UNSUPPORTED mask cannot be saved/restored using these helpers. The helpers will be used in the following patch. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	x86/fpu/xstate: Support dynamic supervisor feature for LBR	Kan Liang	2	-0/+37
	Last Branch Records (LBR) registers are used to log taken branches and other control flows. In perf with call stack mode, LBR information is used to reconstruct a call stack. To get the complete call stack, perf has to save/restore all LBR registers during a context switch. Due to the large number of the LBR registers, e.g., the current platform has 96 LBR registers, this process causes a high CPU overhead. To reduce the CPU overhead during a context switch, an LBR state component that contains all the LBR related registers is introduced in hardware. All LBR registers can be saved/restored together using one XSAVES/XRSTORS instruction. However, the kernel should not save/restore the LBR state component at each context switch, like other state components, because of the following unique features of LBR: - The LBR state component only contains valuable information when LBR is enabled in the perf subsystem, but for most of the time, LBR is disabled. - The size of the LBR state component is huge. For the current platform, it's 808 bytes. If the kernel saves/restores the LBR state at each context switch, for most of the time, it is just a waste of space and cycles. To efficiently support the LBR state component, it is desired to have: - only context-switch the LBR when the LBR feature is enabled in perf. - only allocate an LBR-specific XSAVE buffer on demand. (Besides the LBR state, a legacy region and an XSAVE header have to be included in the buffer as well. There is a total of (808+576) byte overhead for the LBR-specific XSAVE buffer. The overhead only happens when the perf is actively using LBRs. There is still a space-saving, on average, when it replaces the constant 808 bytes of overhead for every task, all the time on the systems that support architectural LBR.) - be able to use XSAVES/XRSTORS for accessing LBR at run time. However, the IA32_XSS should not be adjusted at run time. (The XCR0 \| IA32_XSS are used to determine the requested-feature bitmap (RFBM) of XSAVES.) A solution, called dynamic supervisor feature, is introduced to address this issue, which - does not allocate a buffer in each task->fpu; - does not save/restore a state component at each context switch; - sets the bit corresponding to the dynamic supervisor feature in IA32_XSS at boot time, and avoids setting it at run time. - dynamically allocates a specific buffer for a state component on demand, e.g. only allocates LBR-specific XSAVE buffer when LBR is enabled in perf. (Note: The buffer has to include the LBR state component, a legacy region and a XSAVE header space.) (Implemented in a later patch) - saves/restores a state component on demand, e.g. manually invokes the XSAVES/XRSTORS instruction to save/restore the LBR state to/from the buffer when perf is active and a call stack is required. (Implemented in a later patch) A new mask XFEATURE_MASK_DYNAMIC and a helper xfeatures_mask_dynamic() are introduced to indicate the dynamic supervisor feature. For the systems which support the Architecture LBR, LBR is the only dynamic supervisor feature for now. For the previous systems, there is no dynamic supervisor feature available. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	x86/fpu: Use proper mask to replace full instruction mask	Kan Liang	1	-40/+7
	When saving xstate to a kernel/user XSAVE area with the XSAVE family of instructions, the current code applies the 'full' instruction mask (-1), which tries to XSAVE all possible features. This method relies on hardware to trim 'all possible' down to what is enabled in the hardware. The code works well for now. However, there will be a problem, if some features are enabled in hardware, but are not suitable to be saved into all kernel XSAVE buffers, like task->fpu, due to performance consideration. One such example is the Last Branch Records (LBR) state. The LBR state only contains valuable information when LBR is explicitly enabled by the perf subsystem, and the size of an LBR state is large (808 bytes for now). To avoid both CPU overhead and space overhead at each context switch, the LBR state should not be saved into task->fpu like other state components. It should be saved/restored on demand when LBR is enabled in the perf subsystem. Current copy_xregs_to_* will trigger a buffer overflow for such cases. Three sites use the '-1' instruction mask which must be updated. Two are saving/restoring the xstate to/from a kernel-allocated XSAVE buffer and can use 'xfeatures_mask_all', which will save/restore all of the features present in a normal task FPU buffer. The last one saves the register state directly to a user buffer. It could also use 'xfeatures_mask_all'. Just as it was with the '-1' argument, any supervisor states in the mask will be filtered out by the hardware and not saved to the buffer. But, to be more explicit about what is expected to be saved, use xfeatures_mask_user() for the instruction mask. KVM includes the header file fpu/internal.h. To avoid 'undefined xfeatures_mask_all' compiling issue, move copy_fpregs_to_fpstate() to fpu/core.c and export it, because: - The xfeatures_mask_all is indirectly used via copy_fpregs_to_fpstate() by KVM. The function which is directly used by other modules should be exported. - The copy_fpregs_to_fpstate() is a function, while xfeatures_mask_all is a variable for the "internal" FPU state. It's safer to export a function than a variable, which may be implicitly changed by others. - The copy_fpregs_to_fpstate() is a big function with many checks. The removal of the inline keyword should not impact the performance. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	perf/x86/intel/lbr: Unify the stored format of LBR information	Kan Liang	1	-5/+1
	Current LBR information in the structure x86_perf_task_context is stored in a different format from the PEBS LBR record and Architecture LBR, which prevents the sharing of the common codes. Use the format of the PEBS LBR record as a unified format. Use a generic name lbr_entry to replace pebs_lbr_entry. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	perf/x86: Expose CPUID enumeration bits for arch LBR	Kan Liang	1	-0/+40
	The LBR capabilities of Architecture LBR are retrieved from the CPUID enumeration once at boot time. The capabilities have to be saved for future usage. Several new fields are added into structure x86_pmu to indicate the capabilities. The fields will be used in the following patches. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	x86/msr-index: Add bunch of MSRs for Arch LBR	Kan Liang	1	-0/+16
	Add Arch LBR related MSRs and the new LBR INFO bits in MSR-index. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-08	x86/cpufeatures: Add Architectural LBRs feature bit	Kan Liang	1	-0/+1
	CPUID.(EAX=07H, ECX=0):EDX[19] indicates whether an Intel CPU supports Architectural LBRs. The "X86_FEATURE_..., word 18" is already mirrored from CPUID "0x00000007:0 (EDX)". Add X86_FEATURE_ARCH_LBR under the "word 18" section. The feature will appear as "arch_lbr" in /proc/cpuinfo. The Architectural Last Branch Records (LBR) feature enables recording of software path history by logging taken branches and other control flows. The feature will be supported in the perf_events subsystem. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-06	x86/entry: Rename idtentry_enter/exit_cond_rcu() to idtentry_enter/exit()	Andy Lutomirski	1	-12/+16
	They were originally called _cond_rcu because they were special versions with conditional RCU handling. Now they're the standard entry and exit path, so the _cond_rcu part is just confusing. Drop it. Also change the signature to make them more extensible and more foolproof. No functional change -- it's pure refactoring. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/247fc67685263e0b673e1d7f808182d28ff80359.1593795633.git.luto@kernel.org
2020-07-05	x86/entry/32: Fix XEN_PV build dependency	Ingo Molnar	1	-2/+2
	xenpv_exc_nmi() and xenpv_exc_debug() are only defined on 64-bit kernels, but they snuck into the 32-bit build via <asm/identry.h>, causing the link to fail: ld: arch/x86/entry/entry_32.o: in function `asm_xenpv_exc_nmi': (.entry.text+0x817): undefined reference to `xenpv_exc_nmi' ld: arch/x86/entry/entry_32.o: in function `asm_xenpv_exc_debug': (.entry.text+0x827): undefined reference to `xenpv_exc_debug' Only use them on 64-bit kernels. Fixes: f41f0824224e: ("x86/entry/xen: Route #DB correctly on Xen PV") Cc: Andy Lutomirski <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2020-07-05	Merge tag 'x86-urgent-2020-07-05' of ↵	Linus Torvalds	2	-28/+24
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A series of fixes for x86: - Reset MXCSR in kernel_fpu_begin() to prevent using a stale user space value. - Prevent writing MSR_TEST_CTRL on CPUs which are not explicitly whitelisted for split lock detection. Some CPUs which do not support it crash even when the MSR is written to 0 which is the default value. - Fix the XEN PV fallout of the entry code rework - Fix the 32bit fallout of the entry code rework - Add more selftests to ensure that these entry problems don't come back. - Disable 16 bit segments on XEN PV. It's not supported because XEN PV does not implement ESPFIX64" * tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/ldt: Disable 16-bit segments on Xen PV x86/entry/32: Fix #MC and #DB wiring on x86_32 x86/entry/xen: Route #DB correctly on Xen PV x86/entry, selftests: Further improve user entry sanity checks x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER selftests/x86: Consolidate and fix get/set_eflags() helpers selftests/x86/syscall_nt: Clear weird flags after each test selftests/x86/syscall_nt: Add more flag combinations x86/entry/64/compat: Fix Xen PV SYSENTER frame setup x86/entry: Move SYSENTER's regs->sp and regs->flags fixups into C x86/entry: Assert that syscalls are on the right stack x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted x86/fpu: Reset MXCSR to default in kernel_fpu_begin()
2020-07-04	x86/entry/32: Fix #MC and #DB wiring on x86_32	Andy Lutomirski	1	-10/+13
	DEFINE_IDTENTRY_MCE and DEFINE_IDTENTRY_DEBUG were wired up as non-RAW on x86_32, but the code expected them to be RAW. Get rid of all the macro indirection for them on 32-bit and just use DECLARE_IDTENTRY_RAW and DEFINE_IDTENTRY_RAW directly. Also add a warning to make sure that we only hit the _kernel paths in kernel mode. Reported-by: Naresh Kamboju <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/9e90a7ee8e72fd757db6d92e1e5ff16339c1ecf9.1593795633.git.luto@kernel.org
2020-07-04	x86/entry/xen: Route #DB correctly on Xen PV	Andy Lutomirski	1	-18/+6
	On Xen PV, #DB doesn't use IST. It still needs to be correctly routed depending on whether it came from user or kernel mode. Get rid of DECLARE/DEFINE_IDTENTRY_XEN -- it was too hard to follow the logic. Instead, route #DB and NMI through DECLARE/DEFINE_IDTENTRY_RAW on Xen, and do the right thing for #DB. Also add more warnings to the exc_debug* handlers to make this type of failure more obvious. This fixes various forms of corruption that happen when usermode triggers #DB on Xen PV. Fixes: 4c0dcd8350a0 ("x86/entry: Implement user mode C entry points for #DB and #MCE") Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/4163e733cce0b41658e252c6c6b3464f33fdff17.1593795633.git.luto@kernel.org
2020-07-02	Merge branch 'perf/vlbr'	Peter Zijlstra	10	-13/+70

2020-07-02	perf/x86: Add constraint to create guest LBR event without hw counter	Like Xu	1	-1/+21
	The hypervisor may request the perf subsystem to schedule a time window to directly access the LBR records msrs for its own use. Normally, it would create a guest LBR event with callstack mode enabled, which is scheduled along with other ordinary LBR events on the host but in an exclusive way. To avoid wasting a counter for the guest LBR event, the perf tracks its hw->idx via INTEL_PMC_IDX_FIXED_VLBR and assigns it with a fake VLBR counter with the help of new vlbr_constraint. As with the BTS event, there is actually no hardware counter assigned for the guest LBR event. Signed-off-by: Like Xu <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-02	perf/x86/lbr: Add interface to get LBR information	Like Xu	1	-0/+12
	The LBR records msrs are model specific. The perf subsystem has already obtained the base addresses of LBR records based on the cpu model. Therefore, an interface is added to allow callers outside the perf subsystem to obtain these LBR information. It's useful for hypervisors to emulate the LBR feature for guests with less code. Signed-off-by: Like Xu <[email protected]> Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-02	cpufreq: intel_pstate: Allow enable/disable energy efficiency	Srinivas Pandruvada	1	-2/+4
	By default intel_pstate the driver disables energy efficiency by setting MSR_IA32_POWER_CTL bit 19 for Kaby Lake desktop CPU model in HWP mode. This CPU model is also shared by Coffee Lake desktop CPUs. This allows these systems to reach maximum possible frequency. But this adds power penalty, which some customers don't want. They want some way to enable/ disable dynamically. So, add an additional attribute "energy_efficiency" under /sys/devices/system/cpu/intel_pstate/ for these CPU models. This allows to read and write bit 19 ("Disable Energy Efficiency Optimization") in the MSR IA32_POWER_CTL. This attribute is present in both HWP and non-HWP mode as this has an effect in both modes. Refer to Intel Software Developer's manual for details. The scope of this bit is package wide. Also these systems are single package systems. So read/write MSR on the current CPU is enough. The energy efficiency (EE) bit setting needs to be preserved during suspend/resume and CPU offline/online operation. To do this: - Restoring the EE setting from the cpufreq resume() callback, if there is change from the system default. - By default, don't disable EE from cpufreq init() callback for matching CPU models. Since the scope is package wide and is a single package system, move the disable EE calls from init() callback to intel_pstate_init() function, which is called only once. Suggested-by: Len Brown <[email protected]> Signed-off-by: Srinivas Pandruvada <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2020-07-01	x86/ptrace: Fix 32-bit PTRACE_SETREGS vs fsbase and gsbase	Andy Lutomirski	1	-0/+2
	Debuggers expect that doing PTRACE_GETREGS, then poking at a tracee and maybe letting it run for a while, then doing PTRACE_SETREGS will put the tracee back where it was. In the specific case of a 32-bit tracer and tracee, the PTRACE_GETREGS/SETREGS data structure doesn't have fs_base or gs_base fields, so FSBASE and GSBASE fields are never stored anywhere. Everything used to still work because nonzero FS or GS would result full reloads of the segment registers when the tracee resumes, and the bases associated with FS==0 or GS==0 are irrelevant to 32-bit code. Adding FSGSBASE support broke this: when FSGSBASE is enabled, FSBASE and GSBASE are now restored independently of FS and GS for all tasks when context-switched in. This means that, if a 32-bit tracer restores a previous state using PTRACE_SETREGS but the tracee's pre-restore and post-restore bases don't match, then the tracee is resumed with the wrong base. Fix it by explicitly loading the base when a 32-bit tracer pokes FS or GS on a 64-bit kernel. Also add a test case. Fixes: 673903495c85 ("x86/process/64: Use FSBSBASE in switch_to() if available") Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/229cc6a50ecbb701abd50fe4ddaf0eda888898cd.1593192140.git.luto@kernel.org
2020-06-30	x86: Remove dev->archdata.iommu pointer	Joerg Roedel	1	-3/+0
	There are no users left, all drivers have been converted to use the per-device private pointer offered by IOMMU core. Signed-off-by: Joerg Roedel <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Acked-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-06-29	x86/fpu: Reset MXCSR to default in kernel_fpu_begin()	Petteri Aimonen	1	-0/+5
	Previously, kernel floating point code would run with the MXCSR control register value last set by userland code by the thread that was active on the CPU core just before kernel call. This could affect calculation results if rounding mode was changed, or a crash if a FPU/SIMD exception was unmasked. Restore MXCSR to the kernel's default value. [ bp: Carve out from a bigger patch by Petteri, add feature check, add FNINIT call too (amluto). ] Signed-off-by: Petteri Aimonen <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=207979 Link: https://lkml.kernel.org/r/[email protected]
2020-06-28	Merge tag 'x86_urgent_for_5.8_rc3' of ↵	Linus Torvalds	1	-0/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - AMD Memory bandwidth counter width fix, by Babu Moger. - Use the proper length type in the 32-bit truncate() syscall variant, by Jiri Slaby. - Reinit IA32_FEAT_CTL during wakeup to fix the case where after resume, VMXON would #GP due to VMX not being properly enabled, by Sean Christopherson. - Fix a static checker warning in the resctrl code, by Dan Carpenter. - Add a CR4 pinning mask for bits which cannot change after boot, by Kees Cook. - Align the start of the loop of __clear_user() to 16 bytes, to improve performance on AMD zen1 and zen2 microarchitectures, by Matt Fleming. * tag 'x86_urgent_for_5.8_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/asm/64: Align start of __clear_user() loop to 16-bytes x86/cpu: Use pinning mask for CR4 bits needing to be 0 x86/resctrl: Fix a NULL vs IS_ERR() static checker warning in rdt_cdp_peer_get() x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup syscalls: Fix offset type of ksys_ftruncate() x86/resctrl: Fix memory bandwidth counter width for AMD
2020-06-28	Merge tag 'x86_entry_for_5.8' of ↵	Linus Torvalds	4	-6/+26
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 entry fixes from Borislav Petkov: "This is the x86/entry urgent pile which has accumulated since the merge window. It is not the smallest but considering the almost complete entry core rewrite, the amount of fixes to follow is somewhat higher than usual, which is to be expected. Peter Zijlstra says: 'These patches address a number of instrumentation issues that were found after the x86/entry overhaul. When combined with rcu/urgent and objtool/urgent, these patches make UBSAN/KASAN/KCSAN happy again. Part of making this all work is bumping the minimum GCC version for KASAN builds to gcc-8.3, the reason for this is that the __no_sanitize_address function attribute is broken in GCC releases before that. No known GCC version has a working __no_sanitize_undefined, however because the only noinstr violation that results from this happens when an UB is found, we treat it like WARN. That is, we allow it to violate the noinstr rules in order to get the warning out'" * tag 'x86_entry_for_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/entry: Fix #UD vs WARN more x86/entry: Increase entry_stack size to a full page x86/entry: Fixup bad_iret vs noinstr objtool: Don't consider vmlinux a C-file kasan: Fix required compiler version compiler_attributes.h: Support no_sanitize_undefined check with GCC 4 x86/entry, bug: Comment the instrumentation_begin() usage for WARN() x86/entry, ubsan, objtool: Whitelist __ubsan_handle_*() x86/entry, cpumask: Provide non-instrumented variant of cpu_is_offline() compiler_types.h: Add __no_sanitize_{address,undefined} to noinstr kasan: Bump required compiler version x86, kcsan: Add __no_kcsan to noinstr kcsan: Remove __no_kcsan_or_inline x86, kcsan: Remove __no_kcsan_or_inline usage