blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2021-09-22	KVM: KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs	Sean Christopherson	1	-3/+15
	Check for a NULL cpumask_var_t when kicking multiple vCPUs via cpumask_available(), which performs a !NULL check if and only if cpumasks are configured to be allocated off-stack. This is a meaningless optimization, e.g. avoids a TEST+Jcc and TEST+CMOV on x86, but more importantly helps document that the NULL check is necessary even though all callers pass in a local variable. No functional change intended. Cc: Lai Jiangshan <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: Clean up benign vcpu->cpu data races when kicking vCPUs	Sean Christopherson	1	-8/+28
	Fix a benign data race reported by syzbot+KCSAN[] by ensuring vcpu->cpu is read exactly once, and by ensuring the vCPU is booted from guest mode if kvm_arch_vcpu_should_kick() returns true. Fix a similar race in kvm_make_vcpus_request_mask() by ensuring the vCPU is interrupted if kvm_request_needs_ipi() returns true. Reading vcpu->cpu before vcpu->mode (via kvm_arch_vcpu_should_kick() or kvm_request_needs_ipi()) means the target vCPU could get migrated (change vcpu->cpu) and enter !OUTSIDE_GUEST_MODE between reading vcpu->cpud and reading vcpu->mode. If that happens, the kick/IPI will be sent to the old pCPU, not the new pCPU that is now running the vCPU or reading SPTEs. Although failing to kick the vCPU is not exactly ideal, practically speaking it cannot cause a functional issue unless there is also a bug in the caller, and any such bug would exist regardless of kvm_vcpu_kick()'s behavior. The purpose of sending an IPI is purely to get a vCPU into the host (or out of reading SPTEs) so that the vCPU can recognize a change in state, e.g. a KVM_REQ_ request. If vCPU's handling of the state change is required for correctness, KVM must ensure either the vCPU sees the change before entering the guest, or that the sender sees the vCPU as running in guest mode. All architectures handle this by (a) sending the request before calling kvm_vcpu_kick() and (b) checking for requests _after_ setting vcpu->mode. x86's READING_SHADOW_PAGE_TABLES has similar requirements; KVM needs to ensure it kicks and waits for vCPUs that started reading SPTEs _before_ MMU changes were finalized, but any vCPU that starts reading after MMU changes were finalized will see the new state and can continue on uninterrupted. For uses of kvm_vcpu_kick() that are not paired with a KVM_REQ_, e.g. x86's kvm_arch_sync_dirty_log(), the order of the kick must not be relied upon for functional correctness, e.g. in the dirty log case, userspace cannot assume it has a 100% complete log if vCPUs are still running. All that said, eliminate the benign race since the cost of doing so is an "extra" atomic cmpxchg() in the case where the target vCPU is loaded by the current pCPU or is not loaded at all. I.e. the kick will be skipped due to kvm_vcpu_exiting_guest_mode() seeing a compatible vcpu->mode as opposed to the kick being skipped because of the cpu checks. Keep the "cpu != me" checks even though they appear useless/impossible at first glance. x86 processes guest IPI writes in a fast path that runs in IN_GUEST_MODE, i.e. can call kvm_vcpu_kick() from IN_GUEST_MODE. And calling kvm_vm_bugged()->kvm_make_vcpus_request_mask() from IN_GUEST or READING_SHADOW_PAGE_TABLES is perfectly reasonable. Note, a race with the cpu_online() check in kvm_vcpu_kick() likely persists, e.g. the vCPU could exit guest mode and get offlined between the cpu_online() check and the sending of smp_send_reschedule(). But, the online check appears to exist only to avoid a WARN in x86's native_smp_send_reschedule() that fires if the target CPU is not online. The reschedule WARN exists because CPU offlining takes the CPU out of the scheduling pool, i.e. the WARN is intended to detect the case where the kernel attempts to schedule a task on an offline CPU. The actual sending of the IPI is a non-issue as at worst it will simpy be dropped on the floor. In other words, KVM's usurping of the reschedule IPI could theoretically trigger a WARN if the stars align, but there will be no loss of functionality. [] https://syzkaller.appspot.com/bug?extid=cd4154e502f43f10808a Cc: Venkatesh Srinivas <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Fixes: 97222cc83163 ("KVM: Emulate local APIC in kernel") Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()	Vitaly Kuznetsov	1	-5/+5
	KASAN reports the following issue: BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm] Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798 CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G X --------- --- Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020 Call Trace: dump_stack+0xa5/0xe6 print_address_description.constprop.0+0x18/0x130 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] __kasan_report.cold+0x7f/0x114 ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kasan_report+0x38/0x50 kasan_check_range+0xf5/0x1d0 kvm_make_vcpus_request_mask+0x174/0x440 [kvm] kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm] ? kvm_arch_exit+0x110/0x110 [kvm] ? sched_clock+0x5/0x10 ioapic_write_indirect+0x59f/0x9e0 [kvm] ? static_obj+0xc0/0xc0 ? __lock_acquired+0x1d2/0x8c0 ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm] The problem appears to be that 'vcpu_bitmap' is allocated as a single long on stack and it should really be KVM_MAX_VCPUS long. We also seem to clear the lower 16 bits of it with bitmap_zero() for no particular reason (my guess would be that 'bitmap' and 'vcpu_bitmap' variables in kvm_bitmap_or_dest_vcpus() caused the confusion: while the later is indeed 16-bit long, the later should accommodate all possible vCPUs). Fixes: 7ee30bc132c6 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs") Fixes: 9a2ae9f6b6bb ("KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap") Reported-by: Dr. David Alan Gilbert <[email protected]> Signed-off-by: Vitaly Kuznetsov <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: selftests: Create a separate dirty bitmap per slot	David Matlack	1	-15/+39
	The calculation to get the per-slot dirty bitmap was incorrect leading to a buffer overrun. Fix it by splitting out the dirty bitmap into a separate bitmap per slot. Fixes: 609e6202ea5f ("KVM: selftests: Support multiple slots in dirty_log_perf_test") Signed-off-by: David Matlack <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: selftests: Refactor help message for -s backing_src	David Matlack	6	-22/+25
	All selftests that support the backing_src option were printing their own description of the flag and then calling backing_src_help() to dump the list of available backing sources. Consolidate the flag printing in backing_src_help() to align indentation, reduce duplicated strings, and improve consistency across tests. Note: Passing "-s" to backing_src_help is unnecessary since every test uses the same flag. However I decided to keep it for code readability at the call sites. While here this opportunistically fixes the incorrectly interleaved printing -x help message and list of backing source types in dirty_log_perf_test. Fixes: 609e6202ea5f ("KVM: selftests: Support multiple slots in dirty_log_perf_test") Reviewed-by: Ben Gardon <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: selftests: Change backing_src flag to -s in demand_paging_test	David Matlack	1	-5/+5
	Every other KVM selftest uses -s for the backing_src, so switch demand_paging_test to match. Reviewed-by: Ben Gardon <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: SEV: Allow some commands for mirror VM	Peter Gonda	1	-2/+17
	A mirrored SEV-ES VM will need to call KVM_SEV_LAUNCH_UPDATE_VMSA to setup its vCPUs and have them measured, and their VMSAs encrypted. Without this change, it is impossible to have mirror VMs as part of SEV-ES VMs. Also allow the guest status check and debugging commands since they do not change any guest state. Signed-off-by: Peter Gonda <[email protected]> Cc: Marc Orr <[email protected]> Cc: Nathan Tempelman <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steve Rutherford <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context", 2021-04-21) Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: SEV: Update svm_vm_copy_asid_from for SEV-ES	Peter Gonda	1	-4/+12
	For mirroring SEV-ES the mirror VM will need more then just the ASID. The FD and the handle are required to all the mirror to call psp commands. The mirror VM will need to call KVM_SEV_LAUNCH_UPDATE_VMSA to setup its vCPUs' VMSAs for SEV-ES. Signed-off-by: Peter Gonda <[email protected]> Cc: Marc Orr <[email protected]> Cc: Nathan Tempelman <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steve Rutherford <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context", 2021-04-21) Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: nVMX: Fix nested bus lock VM exit	Chenyi Qiang	1	-0/+6
	Nested bus lock VM exits are not supported yet. If L2 triggers bus lock VM exit, it will be directed to L1 VMM, which would cause unexpected behavior. Therefore, handle L2's bus lock VM exits in L0 directly. Fixes: fe6b6bc802b4 ("KVM: VMX: Enable bus lock VM exit") Signed-off-by: Chenyi Qiang <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Reviewed-by: Xiaoyao Li <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: x86: Identify vCPU0 by its vcpu_idx instead of its vCPUs array entry	Sean Christopherson	1	-1/+1
	Use vcpu_idx to identify vCPU0 when updating HyperV's TSC page, which is shared by all vCPUs and "owned" by vCPU0 (because vCPU0 is the only vCPU that's guaranteed to exist). Using kvm_get_vcpu() to find vCPU works, but it's a rather odd and suboptimal method to check the index of a given vCPU. No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Jim Mattson <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Reviewed-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: x86: Query vcpu->vcpu_idx directly and drop its accessor	Sean Christopherson	6	-14/+8
	Read vcpu->vcpu_idx directly instead of bouncing through the one-line wrapper, kvm_vcpu_get_idx(), and drop the wrapper. The wrapper is a remnant of the original implementation and serves no purpose; remove it before it gains more users. Back when kvm_vcpu_get_idx() was added by commit 497d72d80a78 ("KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus"), the implementation was more than just a simple wrapper as vcpu->vcpu_idx did not exist and retrieving the index meant walking over the vCPU array to find the given vCPU. When vcpu_idx was introduced by commit 8750e72a79dd ("KVM: remember position in kvm->vcpus array"), the helper was left behind, likely to avoid extra thrash (but even then there were only two users, the original arm usage having been removed at some point in the past). No functional change intended. Suggested-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Reviewed-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	kvm: fix wrong exception emulation in check_rdtsc	Hou Wenlong	1	-1/+1
	According to Intel's SDM Vol2 and AMD's APM Vol3, when CR4.TSD is set, use rdtsc/rdtscp instruction above privilege level 0 should trigger a #GP. Fixes: d7eb82030699e ("KVM: SVM: Add intercept checks for remaining group7 instructions") Signed-off-by: Hou Wenlong <[email protected]> Message-Id: <1297c0dd3f1bb47a6d089f850b629c7aa0247040.1629257115.git.houwenlong93@linux.alibaba.com> Reviewed-by: Sean Christopherson <[email protected]> Reviewed-by: Jim Mattson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: SEV: Pin guest memory for write for RECEIVE_UPDATE_DATA	Sean Christopherson	1	-1/+1
	Require the target guest page to be writable when pinning memory for RECEIVE_UPDATE_DATA. Per the SEV API, the PSP writes to guest memory: The result is then encrypted with GCTX.VEK and written to the memory pointed to by GUEST_PADDR field. Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Cc: [email protected] Cc: Peter Gonda <[email protected]> Cc: Marc Orr <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Brijesh Singh <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Reviewed-by: Brijesh Singh <[email protected]> Reviewed-by: Peter Gonda <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: SVM: fix missing sev_decommission in sev_receive_start	Mingwei Zhang	1	-1/+3
	DECOMMISSION the current SEV context if binding an ASID fails after RECEIVE_START. Per AMD's SEV API, RECEIVE_START generates a new guest context and thus needs to be paired with DECOMMISSION: The RECEIVE_START command is the only command other than the LAUNCH_START command that generates a new guest context and guest handle. The missing DECOMMISSION can result in subsequent SEV launch failures, as the firmware leaks memory and might not able to allocate more SEV guest contexts in the future. Note, LAUNCH_START suffered the same bug, but was previously fixed by commit 934002cd660b ("KVM: SVM: Call SEV Guest Decommission if ASID binding fails"). Cc: Alper Gun <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: David Rienjes <[email protected]> Cc: Marc Orr <[email protected]> Cc: John Allen <[email protected]> Cc: Peter Gonda <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vipin Sharma <[email protected]> Cc: [email protected] Reviewed-by: Marc Orr <[email protected]> Acked-by: Brijesh Singh <[email protected]> Fixes: af43cbbf954b ("KVM: SVM: Add support for KVM_SEV_RECEIVE_START command") Signed-off-by: Mingwei Zhang <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: SEV: Acquire vcpu mutex when updating VMSA	Peter Gonda	1	-22/+29
	The update-VMSA ioctl touches data stored in struct kvm_vcpu, and therefore should not be performed concurrently with any VCPU ioctl that might cause KVM or the processor to use the same data. Adds vcpu mutex guard to the VMSA updating code. Refactors out __sev_launch_update_vmsa() function to deal with per vCPU parts of sev_launch_update_vmsa(). Fixes: ad73109ae7ec ("KVM: SVM: Provide support to launch and run an SEV-ES guest") Signed-off-by: Peter Gonda <[email protected]> Cc: Marc Orr <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Brijesh Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: do not shrink halt_poll_ns below grow_start	Sergey Senozhatsky	1	-1/+5
	grow_halt_poll_ns() ignores values between 0 and halt_poll_ns_grow_start (10000 by default). However, when we shrink halt_poll_ns we may fall way below halt_poll_ns_grow_start and endup with halt_poll_ns values that don't make a lot of sense: like 1 or 9, or 19. VCPU1 trace (halt_poll_ns_shrink equals 2): VCPU1 grow 10000 VCPU1 shrink 5000 VCPU1 shrink 2500 VCPU1 shrink 1250 VCPU1 shrink 625 VCPU1 shrink 312 VCPU1 shrink 156 VCPU1 shrink 78 VCPU1 shrink 39 VCPU1 shrink 19 VCPU1 shrink 9 VCPU1 shrink 4 Mirror what grow_halt_poll_ns() does and set halt_poll_ns to 0 as soon as new shrink-ed halt_poll_ns value falls below halt_poll_ns_grow_start. Signed-off-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: nVMX: fix comments of handle_vmon()	Yu Zhang	1	-8/+1
	"VMXON pointer" is saved in vmx->nested.vmxon_ptr since commit 3573e22cfeca ("KVM: nVMX: additional checks on vmxon region"). Also, handle_vmptrld() & handle_vmclear() now have logic to check the VMCS pointer against the VMXON pointer. So just remove the obsolete comments of handle_vmon(). Signed-off-by: Yu Zhang <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: x86: Handle SRCU initialization failure during page track init	Haimin Zhang	3	-4/+9
	Check the return of init_srcu_struct(), which can fail due to OOM, when initializing the page track mechanism. Lack of checking leads to a NULL pointer deref found by a modified syzkaller. Reported-by: TCS Robot <[email protected]> Signed-off-by: Haimin Zhang <[email protected]> Message-Id: <[email protected]> [Move the call towards the beginning of kvm_arch_init_vm. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: VMX: Remove defunct "nr_active_uret_msrs" field	Sean Christopherson	1	-4/+0
	Remove vcpu_vmx.nr_active_uret_msrs and its associated comment, which are both defunct now that KVM keeps the list constant and instead explicitly tracks which entries need to be loaded into hardware. No functional change intended. Fixes: ee9d22e08d13 ("KVM: VMX: Use flag to indicate "active" uret MSRs instead of sorting list") Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	selftests: KVM: Align SMCCC call with the spec in steal_time	Oliver Upton	1	-2/+2
	The SMC64 calling convention passes a function identifier in w0 and its parameters in x1-x17. Given this, there are two deviations in the SMC64 call performed by the steal_time test: the function identifier is assigned to a 64 bit register and the parameter is only 32 bits wide. Align the call with the SMCCC by using a 32 bit register to handle the function identifier and increasing the parameter width to 64 bits. Suggested-by: Andrew Jones <[email protected]> Signed-off-by: Oliver Upton <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	selftests: KVM: Fix check for !POLLIN in demand_paging_test	Oliver Upton	1	-1/+1
	The logical not operator applies only to the left hand side of a bitwise operator. As such, the check for POLLIN not being set in revents wrong. Fix it by adding parentheses around the bitwise expression. Fixes: 4f72180eb4da ("KVM: selftests: Add demand paging content to the demand paging test") Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Oliver Upton <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT	Sean Christopherson	1	-0/+3
	Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT. Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT. For RESET, this is a nop as vcpu is zero-allocated. For INIT, the bug has likely escaped notice because no firmware/kernel puts its page tables root at PA=0, let alone relies on INIT to get the desired CR3 for such page tables. Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: x86: Mark all registers as avail/dirty at vCPU creation	Sean Christopherson	1	-0/+2
	Mark all registers as available and dirty at vCPU creation, as the vCPU has obviously not been loaded into hardware, let alone been given the chance to be modified in hardware. On SVM, reading from "uninitialized" hardware is a non-issue as VMCBs are zero allocated (thus not truly uninitialized) and hardware does not allow for arbitrary field encoding schemes. On VMX, backing memory for VMCSes is also zero allocated, but true initialization of the VMCS _technically_ requires VMWRITEs, as the VMX architectural specification technically allows CPU implementations to encode fields with arbitrary schemes. E.g. a CPU could theoretically store the inverted value of every field, which would result in VMREAD to a zero-allocated field returns all ones. In practice, only the AR_BYTES fields are known to be manipulated by hardware during VMREAD/VMREAD; no known hardware or VMM (for nested VMX) does fancy encoding of cacheable field values (CR0, CR3, CR4, etc...). In other words, this is technically a bug fix, but practically speakings it's a glorified nop. Failure to mark registers as available has been a lurking bug for quite some time. The original register caching supported only GPRs (+RIP, which is kinda sorta a GPR), with the masks initialized at ->vcpu_reset(). That worked because the two cacheable registers, RIP and RSP, are generally speaking not read as side effects in other flows. Arguably, commit aff48baa34c0 ("KVM: Fetch guest cr3 from hardware on demand") was the first instance of failure to mark regs available. While _just_ marking CR3 available during vCPU creation wouldn't have fixed the VMREAD from an uninitialized VMCS bug because ept_update_paging_mode_cr0() unconditionally read vmcs.GUEST_CR3, marking CR3 _and_ intentionally not reading GUEST_CR3 when it's available would have avoided VMREAD to a technically-uninitialized VMCS. Fixes: aff48baa34c0 ("KVM: Fetch guest cr3 from hardware on demand") Fixes: 6de4f3ada40b ("KVM: Cache pdptrs") Fixes: 6de12732c42c ("KVM: VMX: Optimize vmx_get_rflags()") Fixes: 2fb92db1ec08 ("KVM: VMX: Cache vmcs segment fields") Fixes: bd31fe495d0d ("KVM: VMX: Add proper cache tracking for CR0") Fixes: f98c1e77127d ("KVM: VMX: Add proper cache tracking for CR4") Fixes: 5addc235199f ("KVM: VMX: Cache vmcs.EXIT_QUALIFICATION using arch avail_reg flags") Fixes: 8791585837f6 ("KVM: VMX: Cache vmcs.EXIT_INTR_INFO using arch avail_reg flags") Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: selftests: Remove __NR_userfaultfd syscall fallback	Sean Christopherson	1	-3/+0
	Revert the __NR_userfaultfd syscall fallback added for KVM selftests now that x86's unistd_{32,63}.h overrides are under uapi/ and thus not in KVM selftests' search path, i.e. now that KVM gets x86 syscall numbers from the installed kernel headers. No functional change intended. Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs	Sean Christopherson	3	-0/+240
	Add a test to verify an rseq's CPU ID is updated correctly if the task is migrated while the kernel is handling KVM_RUN. This is a regression test for a bug introduced by commit 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function"), where TIF_NOTIFY_RESUME would be cleared by KVM without updating rseq, leading to a stale CPU ID and other badness. Signed-off-by: Sean Christopherson <[email protected]> Acked-by: Mathieu Desnoyers <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	tools: Move x86 syscall number fallbacks to .../uapi/	Sean Christopherson	2	-0/+0
	Move unistd_{32,64}.h from x86/include/asm to x86/include/uapi/asm so that tools/selftests that install kernel headers, e.g. KVM selftests, can include non-uapi tools headers, e.g. to get 'struct list_head', without effectively overriding the installed non-tool uapi headers. Swapping KVM's search order, e.g. to search the kernel headers before tool headers, is not a viable option as doing results in linux/type.h and other core headers getting pulled from the kernel headers, which do not have the kernel-internal typedefs that are used through tools, including many files outside of selftests/kvm's control. Prior to commit cec07f53c398 ("perf tools: Move syscall number fallbacks from perf-sys.h to tools/arch/x86/include/asm/"), the handcoded numbers were actual fallbacks, i.e. overriding unistd_{32,64}.h from the kernel headers was unintentional. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	entry: rseq: Call rseq_handle_notify_resume() in tracehook_notify_resume()	Sean Christopherson	8	-19/+8
	Invoke rseq_handle_notify_resume() from tracehook_notify_resume() now that the two function are always called back-to-back by architectures that have rseq. The rseq helper is stubbed out for architectures that don't support rseq, i.e. this is a nop across the board. Note, tracehook_notify_resume() is horribly named and arguably does not belong in tracehook.h as literally every line of code in it has nothing to do with tracing. But, that's been true since commit a42c6ded827d ("move key_repace_session_keyring() into tracehook_notify_resume()") first usurped tracehook_notify_resume() back in 2012. Punt cleaning that mess up to future patches. No functional change intended. Acked-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-22	KVM: rseq: Update rseq when processing NOTIFY_RESUME on xfer to KVM guest	Sean Christopherson	2	-4/+14
	Invoke rseq's NOTIFY_RESUME handler when processing the flag prior to transferring to a KVM guest, which is roughly equivalent to an exit to userspace and processes many of the same pending actions. While the task cannot be in an rseq critical section as the KVM path is reachable only by via ioctl(KVM_RUN), the side effects that apply to rseq outside of a critical section still apply, e.g. the current CPU needs to be updated if the task is migrated. Clearing TIF_NOTIFY_RESUME without informing rseq can lead to segfaults and other badness in userspace VMMs that use rseq in combination with KVM, e.g. due to the CPU ID being stale after task migration. Fixes: 72c3c0fe54a3 ("x86/kvm: Use generic xfer to guest work function") Reported-by: Peter Foley <[email protected]> Bisected-by: Doug Evans <[email protected]> Acked-by: Mathieu Desnoyers <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-09-20	KVM: arm64: Fix PMU probe ordering	Marc Zyngier	5	-7/+16
	Russell reported that since 5.13, KVM's probing of the PMU has started to fail on his HW. As it turns out, there is an implicit ordering dependency between the architectural PMU probing code and and KVM's own probing. If, due to probe ordering reasons, KVM probes before the PMU driver, it will fail to detect the PMU and prevent it from being advertised to guests as well as the VMM. Obviously, this is one probing too many, and we should be able to deal with any ordering. Add a callback from the PMU code into KVM to advertise the registration of a host CPU PMU, allowing for any probing order. Fixes: 5421db1be3b1 ("KVM: arm64: Divorce the perf code from oprofile helpers") Reported-by: "Russell King (Oracle)" <[email protected]> Tested-by: Russell King (Oracle) <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: [email protected]
2021-09-20	KVM: arm64: nvhe: Fix missing FORCE for hyp-reloc.S build rule	Zenghui Yu	1	-1/+1
	Add FORCE so that if_changed can detect the command line change. We'll otherwise see a compilation warning since commit e1f86d7b4b2a ("kbuild: warn if FORCE is missing for if_changed(_dep,_rule) and filechk"). arch/arm64/kvm/hyp/nvhe/Makefile:58: FORCE prerequisite is missing Cc: David Brazdil <[email protected]> Cc: Masahiro Yamada <[email protected]> Signed-off-by: Zenghui Yu <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-09-19	Linux 5.15-rc2	Linus Torvalds	1	-1/+1

2021-09-19	pci_iounmap'2: Electric Boogaloo: try to make sense of it all	Linus Torvalds	2	-23/+46
	Nathan Chancellor reports that the recent change to pci_iounmap in commit 9caea0007601 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") causes build errors on arm64. It took me about two hours to convince myself that I think I know what the logic of that mess of #ifdef's in the <asm-generic/io.h> header file really aim to do, and rewrite it to be easier to follow. Famous last words. Anyway, the code has now been lifted from that grotty header file into lib/pci_iomap.c, and has fairly extensive comments about what the logic is. It also avoids indirecting through another confusing (and badly named) helper function that has other preprocessor config conditionals. Let's see what odd architecture did something else strange in this area to break things. But my arm64 cross build is clean. Fixes: 9caea0007601 ("parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") Reported-by: Nathan Chancellor <[email protected]> Cc: Helge Deller <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Ulrich Teichert <[email protected]> Cc: James Bottomley <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19	Merge tag 'x86_urgent_for_v5.15_rc2' of ↵	Linus Torvalds	5	-15/+47
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Prevent a infinite loop in the MCE recovery on return to user space, which was caused by a second MCE queueing work for the same page and thereby creating a circular work list. - Make kern_addr_valid() handle existing PMD entries, which are marked not present in the higher level page table, correctly instead of blindly dereferencing them. - Pass a valid address to sanitize_phys(). This was caused by the mixture of inclusive and exclusive ranges. memtype_reserve() expect 'end' being exclusive, but sanitize_phys() wants it inclusive. This worked so far, but with end being the end of the physical address space the fail is exposed. - Increase the maximum supported GPIO numbers for 64bit. Newer SoCs exceed the previous maximum. * tag 'x86_urgent_for_v5.15_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Avoid infinite loop for copy from user recovery x86/mm: Fix kern_addr_valid() to cope with existing but not present entries x86/platform: Increase maximum GPIO number for X86_64 x86/pat: Pass valid address to sanitize_phys()
2021-09-19	Merge tag 'perf-urgent-2021-09-19' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Thomas Gleixner: "A single fix for the perf core where a value read with READ_ONCE() was checked and then reread which makes all the checks invalid. Reuse the already read value instead" * tag 'perf-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: events: Reuse value read using READ_ONCE instead of re-reading it
2021-09-19	Merge tag 'locking-urgent-2021-09-19' of ↵	Linus Torvalds	1	-20/+45
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "A set of updates for the RT specific reader/writer locking base code: - Make the fast path reader ordering guarantees correct. - Code reshuffling to make the fix simpler" [ This plays ugly games with atomic_add_return_release() because we don't have a plain atomic_add_release(), and should really be cleaned up, I think - Linus ] * tag 'locking-urgent-2021-09-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwbase: Take care of ordering guarantee for fastpath reader locking/rwbase: Extract __rwbase_write_trylock() locking/rwbase: Properly match set_and_save_state() to restore_state()
2021-09-19	Merge tag 'powerpc-5.15-2' of ↵	Linus Torvalds	7	-55/+159
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix crashes when scv (System Call Vectored) is used to make a syscall when a transaction is active, on Power9 or later. - Fix bad interactions between rfscv (Return-from scv) and Power9 fake-suspend mode. - Fix crashes when handling machine checks in LPARs using the Hash MMU. - Partly revert a recent change to our XICS interrupt controller code, which broke the recently added Microwatt support. Thanks to Cédric Le Goater, Eirik Fuller, Ganesh Goudar, Gustavo Romero, Joel Stanley, Nicholas Piggin. * tag 'powerpc-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/xics: Set the IRQ chip data for the ICS native backend powerpc/mce: Fix access error in mce handler KVM: PPC: Book3S HV: Tolerate treclaim. in fake-suspend mode changing registers powerpc/64s: system call rfscv workaround for TM bugs selftests/powerpc: Add scv versions of the basic TM syscall tests powerpc/64s: system call scv tabort fix for corrupt irq soft-mask state
2021-09-19	Merge tag 'kbuild-fixes-v5.15' of ↵	Linus Torvalds	6	-20/+27
	git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix bugs in checkkconfigsymbols.py - Fix missing sys import in gen_compile_commands.py - Fix missing FORCE warning for ARCH=sh builds - Fix -Wignored-optimization-argument warnings for Clang builds - Turn -Wignored-optimization-argument into an error in order to stop building instead of sprinkling warnings * tag 'kbuild-fixes-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: Add -Werror=ignored-optimization-argument to CLANG_FLAGS x86/build: Do not add -falign flags unconditionally for clang kbuild: Fix comment typo in scripts/Makefile.modpost sh: Add missing FORCE prerequisites in Makefile gen_compile_commands: fix missing 'sys' package checkkconfigsymbols.py: Remove skipping of help lines in parse_kconfig_file checkkconfigsymbols.py: Forbid passing 'HEAD' to --commit
2021-09-19	Merge tag 'perf-tools-fixes-for-v5.15-2021-09-18' of ↵	Linus Torvalds	7	-51/+100
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix ip display in 'perf script' when output type != attr->type. - Ignore deprecation warning when using libbpf'sg btf__get_from_id(), fixing the build with libbpf v0.6+. - Make use of FD() robust in libperf, fixing a segfault with 'perf stat --iostat list'. - Initialize addr_location:srcline pointer to NULL when resolving callchain addresses. - Fix fused instruction logic for assembly functions in 'perf annotate'. * tag 'perf-tools-fixes-for-v5.15-2021-09-18' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf: Ignore deprecation warning when using libbpf's btf__get_from_id() libperf evsel: Make use of FD robust. perf machine: Initialize srcline string member in add_location struct perf script: Fix ip display when type != attr->type perf annotate: Fix fused instr logic for assembly functions
2021-09-19	dmascc: use proper 'virt_to_bus()' rather than casting to 'int'	Linus Torvalds	1	-3/+3
	The old dmascc driver depends on the legacy ISA_DMA_API, and blindly just casts the kernel virtual address to 'int' for set_dma_addr(). That works only incidentally, and because the high bits of the address will be ignored anyway. And on 64-bit architectures it causes warnings. Admittedly, 64-bit architectures with ISA are basically dead - I think the only example of this is alpha, and nobody would ever use the dmascc driver there. But hey, the fix is easy enough, the end result is cleaner, and it's yet another configuration that now builds without warnings. If somebody actually uses this driver on an alpha and this fixes it for you, please email me. Because that is just incredibly bizarre. Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19	alpha: enable GENERIC_PCI_IOMAP unconditionally	Linus Torvalds	1	-1/+1
	With the previous commit (9caea0007601: "parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled") we can now enable GENERIC_PCI_IOMAP unconditionally on alpha, and if PCI is not enabled we will just get the nice empty helper functions that allow mixed-bus drivers to build. Example driver: the old 3com/3c59x.c driver works with either the PCI or the EISA version of the 3x59x card, but wouldn't build in an EISA-only configuration because of missing pci_iomap() and pci_iounmap() dummy wrappers. Most of the other PCI infrastructure just becomes empty wrappers even without GENERIC_PCI_IOMAP, and it's not obvious that the pci_iomap functionality shouldn't do the same, but this works. Cc: Ulrich Teichert <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19	parisc: Declare pci_iounmap() parisc version only when CONFIG_PCI enabled	Helge Deller	3	-11/+6
	Linus noticed odd declaration rules for pci_iounmap() in iomap.h and pci_iomap.h, where it dependend on either NO_GENERIC_PCI_IOPORT_MAP or GENERIC_IOMAP when CONFIG_PCI was disabled. Testing on parisc seems to indicate that we need pci_iounmap() only when CONFIG_PCI is enabled, so the declaration of pci_iounmap() can be moved cleanly into pci_iomap.h in sync with the declarations of pci_iomap(). Link: https://lore.kernel.org/all/CAHk-=wjRrh98pZoQ+AzfWmsTZacWxTJKXZ9eKU2X_0+jM=O8nw@mail.gmail.com/ Signed-off-by: Helge Deller <[email protected]> Suggested-by: Linus Torvalds <[email protected]> Fixes: 97a29d59fc22 ("[PARISC] fix compile break caused by iomap: make IOPORT/PCI mapping functions conditional") Cc: Arnd Bergmann <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Ulrich Teichert <[email protected]> Cc: James Bottomley <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19	Revert "drm/vc4: hdmi: Remove drm_encoder->crtc usage"	Linus Torvalds	1	-27/+13
	This reverts commit 27da370e0fb343a0baf308f503bb3e5dcdfe3362. Sudip Mukherjee reports that this broke pulseaudio with a NULL pointer dereference in vc4_hdmi_audio_prepare(), bisected it to this commit, and confirmed that a revert fixed the problem. Revert the problematic commit until fixed. Link: https://lore.kernel.org/all/CADVatmPB9-oKd=ypvj25UYysVo6EZhQ6bCM7EvztQBMyiZfAyw@mail.gmail.com/ Link: https://lore.kernel.org/all/CADVatmN5EpRshGEPS_JozbFQRXg5w_8LFB3OMP1Ai-ghxd3w4g@mail.gmail.com/ Reported-and-tested-by: Sudip Mukherjee <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Emma Anholt <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19	Revert drm/vc4 hdmi runtime PM changes	Linus Torvalds	1	-34/+10
	This reverts commits 9984d6664ce9 ("drm/vc4: hdmi: Make sure the controller is powered in detect") 411efa18e4b0 ("drm/vc4: hdmi: Move the HSM clock enable to runtime_pm") as Michael Stapelberg reports that the new runtime PM changes cause his Raspberry Pi 3 to hang on boot, probably due to interactions with other changes in the DRM tree (because a bisect points to the merge in commit e058a84bfddc: "Merge tag 'drm-next-2021-07-01' of git://.../drm"). Revert these two commits until it's been resolved. Link: https://lore.kernel.org/all/871r5mp7h2.fsf@midna.i-did-not-set--mail-host-address--so-tickle-me/ Reported-and-tested-by: Michael Stapelberg <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Dave Stevenson <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-19	kbuild: Add -Werror=ignored-optimization-argument to CLANG_FLAGS	Nathan Chancellor	1	-0/+5
	Similar to commit 589834b3a009 ("kbuild: Add -Werror=unknown-warning-option to CLANG_FLAGS"). Clang ignores certain GCC flags that it has not implemented, only emitting a warning: $ echo \| clang -fsyntax-only -falign-jumps -x c - clang-14: warning: optimization flag '-falign-jumps' is not supported [-Wignored-optimization-argument] When one of these flags gets added to KBUILD_CFLAGS unconditionally, all subsequent cc-{disable-warning,option} calls fail because -Werror was added to these invocations to turn the above warning and the equivalent -W flag warning into errors. To catch the presence of these flags earlier, turn -Wignored-optimization-argument into an error so that the flags can either be implemented or ignored via cc-option and there are no more weird errors. Reviewed-by: Nick Desaulniers <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-19	x86/build: Do not add -falign flags unconditionally for clang	Nathan Chancellor	1	-3/+9
	clang does not support -falign-jumps and only recently gained support for -falign-loops. When one of the configuration options that adds these flags is enabled, clang warns and all cc-{disable-warning,option} that follow fail because -Werror gets added to test for the presence of this warning: clang-14: warning: optimization flag '-falign-jumps=0' is not supported [-Wignored-optimization-argument] To resolve this, add a couple of cc-option calls when building with clang; gcc has supported these options since 3.2 so there is no point in testing for their support. -falign-functions was implemented in clang-7, -falign-loops was implemented in clang-14, and -falign-jumps has not been implemented yet. Link: https://lore.kernel.org/r/[email protected]/ Link: https://lore.kernel.org/r/[email protected]/ Reported-by: kernel test robot <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Acked-by: Borislav Petkov <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-19	kbuild: Fix comment typo in scripts/Makefile.modpost	Ramji Jiyani	1	-1/+1
	Change comment "create one <module>.mod.c file pr. module" to "create one <module>.mod.c file per module" Signed-off-by: Ramji Jiyani <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-19	sh: Add missing FORCE prerequisites in Makefile	Geert Uytterhoeven	1	-8/+8
	make: arch/sh/boot/Makefile:87: FORCE prerequisite is missing Add the missing FORCE prerequisites for all build targets identified by "make help". Fixes: e1f86d7b4b2a5213 ("kbuild: warn if FORCE is missing for if_changed(_dep,_rule) and filechk") Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-19	gen_compile_commands: fix missing 'sys' package	Kortan	1	-0/+1
	We need to import the 'sys' package since the script has called sys.exit() method. Fixes: 6ad7cbc01527 ("Makefile: Add clang-tidy and static analyzer support to makefile") Signed-off-by: Kortan <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-19	checkkconfigsymbols.py: Remove skipping of help lines in parse_kconfig_file	Ariel Marcovitch	1	-8/+0
	When parsing Kconfig files to find symbol definitions and references, lines after a 'help' line are skipped until a new config definition starts. However, Kconfig statements can actually be after a help section, as long as these have shallower indentation. These are skipped by the parser. This means that symbols referenced in this kind of statements are ignored by this function and thus are not considered undefined references in case the symbol is not defined. Remove the 'skip' logic entirely, as it is not needed if we just use the STMT regex to find the end of help lines. However, this means that keywords that appear as part of the help message (i.e. with the same indentation as the help lines) it will be considered as a reference/definition. This can happen now as well, but only with REGEX_KCONFIG_DEF lines. Also, the keyword must have a SYMBOL after it, which probably means that someone referenced a config in the help so it seems like a bonus :) The real solution is to keep track of the indentation when a the first help line in encountered and then handle DEF and STMT lines only if the indentation is shallower. Signed-off-by: Ariel Marcovitch <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2021-09-19	checkkconfigsymbols.py: Forbid passing 'HEAD' to --commit	Ariel Marcovitch	1	-0/+3
	As opposed to the --diff option, --commit can get ref names instead of commit hashes. When using the --commit option, the script resets the working directory to the commit before the given ref, by adding '~' to the end of the ref. However, the 'HEAD' ref is relative, and so when the working directory is reset to 'HEAD~', 'HEAD' points to what was 'HEAD~'. Then when the script resets to 'HEAD' it actually stays in the same commit. In this case, the script won't report any cases because there is no diff between the cases of the two refs. Prevent the user from using HEAD refs. A better solution might be to resolve the refs before doing the reset, but for now just disallow such refs. Signed-off-by: Ariel Marcovitch <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>