|
The current Intel platform has an output interval of ~976ms when probed on a
1 Pulse-per-Second (PPS) hardware pin.
The correct PTP clock frequency for PCH GbE should be 204.8MHz instead of
200MHz. The PSE GbE PTP clock rate remains at 200MHz.
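As a sanity check on the numbers above, a small back-of-the-envelope
calculation (assuming the 1 Hz PPS period is programmed in ticks of the
configured clock rate) reproduces the observed ~976ms interval:
```
/* Sketch of the arithmetic behind the observed ~976ms PPS interval,
 * assuming the 1 Hz period is programmed as ticks of the configured rate.
 */
#include <stdio.h>

int main(void)
{
	const double configured_hz = 200e6;   /* rate the driver assumed   */
	const double actual_hz     = 204.8e6; /* real PCH GbE PTP clock    */

	/* 200e6 ticks elapse in 200e6 / 204.8e6 seconds on the real clock. */
	printf("PPS interval: %.1f ms\n", 1000.0 * configured_hz / actual_hz);
	return 0; /* prints ~976.6 ms */
}
```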
Fixes: 58da0cfa6cf1 ("net: stmmac: create dwmac-intel.c to contain all Intel platform")
Signed-off-by: Ling Pei Lee <[email protected]>
Signed-off-by: Tan, Tee Min <[email protected]>
Signed-off-by: Voon Weifeng <[email protected]>
Signed-off-by: Gan Yi Fang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When binding qsets fails in cxgb_up() while opening the device, napi isn't
disabled. The next time the cxgb3 device is opened, this triggers a BUG_ON()
in napi_enable(). Compile tested only.
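A minimal sketch of the missing cleanup described above, with simplified,
illustrative structure and helper names (not the actual cxgb3 code):
```
/* Illustrative sketch: if binding qsets fails after napi was enabled,
 * napi must be disabled again before returning, otherwise the next open
 * calls napi_enable() on an already-enabled instance and hits its BUG_ON().
 */
static int cxgb_up_sketch(struct adapter *adap, int nqsets)
{
	int i, err;

	for (i = 0; i < nqsets; i++)
		napi_enable(&adap->qs[i].napi);	/* qs[] layout is assumed */

	err = bind_qsets(adap);			/* helper named in the text */
	if (err) {
		for (i = 0; i < nqsets; i++)
			napi_disable(&adap->qs[i].napi);	/* the missing step */
		return err;
	}
	return 0;
}
```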
Fixes: 48c4b6dbb7e2 ("cxgb3 - fix port up/down error path")
Signed-off-by: Zhengchao Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When creating xdp rxqs or filling rx channels fails in cpsw_ndo_open() while
opening the device, napi isn't disabled. The next time the cpsw device is
opened, an invalid opcode issue is reported. Compile tested only.
Fixes: d354eb85d618 ("drivers: net: cpsw: dual_emac: simplify napi usage")
Signed-off-by: Zhengchao Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, we call fill_dc_dirty_rects() in amdgpu_dm_commit_planes() even
if PSR isn't supported by the relevant link. This is undesirable, especially
because when drm.debug is enabled we print messages in fill_dc_dirty_rects()
that are only useful for debugging PSR (and confusing otherwise). So, limit
the filling of dirty rectangles to only when PSR is enabled.
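A minimal sketch of the gating described, assuming the PSR capability is
visible through the stream's link settings (field names are illustrative):
```
/* Only build dirty rectangles when PSR is actually enabled on the link;
 * this also avoids the PSR-only drm.debug messages on non-PSR links.
 * The psr_settings path shown here is an assumption for illustration.
 */
if (acrtc_state->stream->link->psr_settings.psr_feature_enabled)
	fill_dc_dirty_rects(plane, old_plane_state, new_plane_state,
			    new_crtc_state, &bundle->flip_addrs[planes_count]);
```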
Reviewed-by: Leo Li <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Still avoid intermittent failure.
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Acked-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Re-take the eviction lock immediately after the allocation is completed, to
fix a circular locking warning with the drm_buddy allocator.
Move amdgpu_vm_eviction_lock/unlock/trylock to amdgpu_vm.h as they are called
from multiple files.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The kernel WARNING backtrace below is triggered when pressing ctrl-C to kill
the kfdtest application.
If amdgpu_cs_parser_bos returns an error after taking bo_list_mutex, the
caller amdgpu_cs_ioctl will not unlock bo_list_mutex, which generates the
kernel WARNING.
Add an unlock of bo_list_mutex to the amdgpu_cs_parser_bos error handling
that cleans up the bo_list userptr bo.
WARNING: kfdtest/2930 still has locks held!
1 lock held by kfdtest/2930:
(&list->bo_list_mutex){+.+.}-{3:3}, at: amdgpu_cs_ioctl+0xce5/0x1f10 [amdgpu]
stack backtrace:
dump_stack_lvl+0x44/0x57
get_signal+0x79f/0xd00
arch_do_signal_or_restart+0x36/0x7b0
exit_to_user_mode_prepare+0xfd/0x1b0
syscall_exit_to_user_mode+0x19/0x40
do_syscall_64+0x40/0x80
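A minimal sketch of the added unlock in the error path (simplified;
surrounding code and field names are illustrative):
```
/* Sketch: amdgpu_cs_parser_bos() took bo_list_mutex earlier; since the
 * caller amdgpu_cs_ioctl() does not unlock it on failure, the error path
 * that cleans up the userptr BOs must drop it itself.
 */
	if (r) {
		kvfree(e->user_pages);
		e->user_pages = NULL;
		mutex_unlock(&p->bo_list->bo_list_mutex);	/* the added unlock */
		return r;
	}
```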
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 4545ae2ed3f2f7c3f615a53399c9c8460ee5bca7.
The original patch "drm/amdgpu: getting fan speed pwm for vega10 properly" works fine.
The test failure was caused by the test case itself.
Signed-off-by: Asher Song <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[WHY?]
Data return times when using lowest memclk can be <= 60us, which can cause
underflow on high bandwidth displays with a workload.
[HOW?]
Enforce a minimum prefetch time during validation for low memclk modes.
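A minimal sketch of what enforcing a minimum prefetch time during validation
could look like; the helper, the low-memclk condition and the threshold are
illustrative assumptions, not the actual DML change:
```
/* Illustrative only: refuse to validate a prefetch schedule shorter than a
 * floor when running at a low memclk, so slow data return cannot underflow
 * high-bandwidth displays.
 */
#define MIN_PREFETCH_TIME_US	60.0	/* assumed floor for illustration */

static double enforce_min_prefetch_us(double prefetch_us, bool low_memclk)
{
	if (low_memclk && prefetch_us < MIN_PREFETCH_TIME_US)
		return MIN_PREFETCH_TIME_US;
	return prefetch_us;
}
```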
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
1. The gpio port has a different mapping.
[How]
1. Add a dummy entry in the mapping table.
2. Fix incorrect mask bit field access.
Reviewed-by: Alvin Lee <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Steve Su <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
The link enablement sequence can end up resetting the encoder while
the PHY symclk isn't yet on.
This means that waiting for the symclk to be on will time out, along with
the reset bit never asserting high.
This causes unnecessary delay when enabling the link and produces a
warning affecting multiple IGT tests.
[How]
Don't wait for the symclk to be on here because firmware already does.
Don't wait for reset if we know the symclk isn't on.
Split the reset into a helper function that checks the bit and decides
whether or not a delay is sufficient.
Reviewed-by: Roman Li <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.0.x
|
|
[Why]
Recent backports from open source do not have a header inclusion pattern
that is consistent with the inclusion style in the rest of the file. This
also breaks the internal tool builds. A recent commit erroneously modified
the original DML formula for calculating ActiveClockChangeLatencyHidingY.
This resulted in an FCLK deviation from the golden values.
[How]
Change the way display_mode_vba.h is included so that it is consistent with
the inclusion style in the rest of the file, which also fixes the tool build.
Restore the DML formula to its original state to fix the FCLK deviation.
Reviewed-by: Aurabindo Pillai <[email protected]>
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Chaitanya Dhere <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why&How]
The bug was caused when moving a variable from the stack to the heap: the
allocation is reused and leftover garbage remained, so the memory needs to
be zeroed.
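A minimal sketch of the described fix, assuming an illustrative scratch
structure that used to live on the stack:
```
/* A stack variable initialized with "= {0}" was clean on every call; once
 * it moved to a reused heap allocation it has to be zeroed explicitly
 * (or allocated with kzalloc in the first place).
 */
struct vba_scratch *s = ctx->scratch;		/* illustrative names */

memset(s, 0, sizeof(*s));			/* the missing zeroing */

/* ...or, equivalently, at allocation time: */
ctx->scratch = kzalloc(sizeof(*ctx->scratch), GFP_KERNEL);
```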
Reviewed-by: Martin Leung <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Aurabindo Pillai <[email protected]>
Signed-off-by: Martin Leung <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why & How]
New values requested by hardware after fine-tuning.
Update for all memory types.
Reviewed-by: Jun Lei <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.0.x
|
|
It can happen that we query the sequence value before the callback has had a
chance to run.
Work around that by grabbing the fence lock and releasing it again.
Should be replaced by hw handling soon.
Signed-off-by: Christian König <[email protected]>
CC: [email protected] # 5.19+
Fixes: 5255e146c99a6 ("drm/amdgpu: rework TLB flushing")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2113
Acked-by: Alex Deucher <[email protected]>
Acked-by: Philip Yang <[email protected]>
Tested-by: Stefan Springer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Checkpoint BOs last. That way we don't need to close dmabuf FDs if
something else fails later. This avoids problematic access to user mode
memory in the error handling code path.
criu_checkpoint_bos has its own error handling and cleanup that does not
depend on access to user memory.
In the private data, keep BOs before the remaining objects. This is
necessary to restore things in the correct order as restoring events
depends on the events-page BO being restored first.
Fixes: be072b06c739 ("drm/amdkfd: CRIU export BOs as prime dmabuf objects")
Reported-by: Jann Horn <[email protected]>
CC: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-and-tested-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Call mutex_unlock before the exit label because none of the error code paths
that jump there take that lock. This fixes unbalanced locking errors in case
of restore errors.
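A minimal sketch of the lock-balancing pattern described (labels and names
are illustrative):
```
/* Paths that jump straight to the label never took the mutex, so the
 * unlock belongs before it, on the locked path only.
 */
	mutex_lock(&p->event_mutex);
	ret = restore_events(p, events);	/* illustrative helper */
	mutex_unlock(&p->event_mutex);		/* moved above the label */
exit:
	kfree(events);				/* cleanup shared by all paths */
	return ret;
```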
Fixes: 40e8a766a761 ("drm/amdkfd: CRIU checkpoint and restore events")
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Some of the unused messages that were used earlier in development have been
freed up as spare messages; no functional changes intended.
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Tim Huang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected] # 6.0.x
|
|
The VF driver mistakenly counts VLAN 0 filters, while the PF driver does not
count them.
Do not count VLAN 0 filters when VLAN_V2 is engaged.
Counting those filters in throws the filter count off by one when sending a
batched VLAN addition message.
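A minimal sketch of the counting change described (helper and field names
are illustrative, not the exact iavf code):
```
/* Sketch: when VLAN_V2 offload is negotiated, the PF does not count the
 * implicit VLAN 0 filter, so the VF must not count it either when sizing
 * the batched VLAN-add message.
 */
static int count_vlan_filters_sketch(struct list_head *vlan_list, bool vlan_v2)
{
	struct vlan_filter_entry *f;	/* illustrative type */
	int count = 0;

	list_for_each_entry(f, vlan_list, list) {
		if (vlan_v2 && f->vlan_id == 0)
			continue;	/* do not count VLAN 0 */
		count++;
	}
	return count;
}
```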
Fixes: 968996c070ef ("iavf: Fix VLAN_V2 addition/rejection")
Signed-off-by: Przemyslaw Patynowski <[email protected]>
Signed-off-by: Michal Jaron <[email protected]>
Signed-off-by: Kamil Maziarz <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Previously, during removal of a trusted VF while the VF was down, there was a
number of spurious interrupts equal to the number of queues on the VF.
Add a check whether the VF already has inactive queues. If the VF is disabled
and has inactive rx queues, then do not disable the rx queues.
Add a check in ice_vsi_stop_tx_ring whether it is the VF's vsi and whether
the VF is disabled.
Fixes: efe41860008e ("ice: Fix memory corruption in VF driver")
Signed-off-by: Norbert Zulinski <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Konrad Jankowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fixes from Vlastimil Babka:
"Most are small fixups as described below.
The !CONFIG_TRACING fix is a bit bigger and would normally be done in
the next merge window as part of upcoming hardening changes. But we
realized it can make the kmalloc waste tracking introduced in this
window inaccurate, so decided to go with it now.
Summary:
- Remove !CONFIG_TRACING kmalloc() wrappers intended to save a
function call, due to incompatibility with recently introduced
wasted space tracking and planned hardening changes.
- A tracing parameter regression fix, by Kees Cook.
- Two kernel-doc warning fixups, by Lukas Bulwahn and myself"
* tag 'slab-for-6.1-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm, slab: remove duplicate kernel-doc comment for ksize()
mm/slab_common: Restore passing "caller" for tracing
mm/slab: remove !CONFIG_TRACING variants of kmalloc_[node_]trace()
mm/slab_common: repair kernel-doc for __ksize()
|
|
The DSPCLK_DIV field in the WM8962_CLOCKING1 register is used to generate the
correct frequency of LRCLK and BCLK. Sometimes this read-only value isn't
updated in time after enabling SYSCLK, which results in wrong calculated
values. A delay is introduced here to wait for the newest value from the
register. According to tests, the delay should be at least 500~1000us.
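A minimal sketch of where such a delay could sit; the update/read calls are
standard ASoC component helpers, while the exact register sequence and bit
names are assumptions based on the description:
```
/* Wait for the read-only DSPCLK_DIV field to reflect the newly enabled
 * SYSCLK before using it to derive LRCLK/BCLK.
 */
unsigned int dspclk;

snd_soc_component_update_bits(component, WM8962_CLOCKING2,
			      WM8962_SYSCLK_ENA_MASK, WM8962_SYSCLK_ENA);

usleep_range(500, 1000);	/* per the 500~1000us figure above */

dspclk = snd_soc_component_read(component, WM8962_CLOCKING1);
```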
Signed-off-by: Chancel Liu <[email protected]>
Acked-by: Charles Keepax <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
esw_attr is only allocated if the namespace is fdb.
BUG: KASAN: slab-out-of-bounds in parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
Write of size 4 at addr ffff88815f185b04 by task tc/2135
CPU: 5 PID: 2135 Comm: tc Not tainted 6.1.0-rc2+ #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x57/0x7d
print_report+0x170/0x471
? parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
kasan_report+0xbc/0xf0
? parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
parse_tc_actions+0xdc6/0x10e0 [mlx5_core]
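A minimal sketch of the guard implied by the description above (the namespace
check is the key point; the surrounding names follow mlx5 conventions but are
shown for illustration):
```
/* Only dereference esw_attr when the attr was allocated for the FDB
 * namespace; for the NIC namespace the allocation is smaller and writing
 * esw_attr fields overruns it (the KASAN splat above).
 */
if (ns_type == MLX5_FLOW_NAMESPACE_FDB) {
	esw_attr = attr->esw_attr;
	esw_attr->split_count = esw_attr->out_count;
}
```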
Fixes: 94d651739e17 ("net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed")
Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The pkt_reformat pointer is saved under flow_act, not under the dest
attribute, in the termination table instance.
Fix the pointer comparison.
Also fix returning success when one pkt_reformat pointer is null and the
other is not.
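A minimal sketch of the corrected comparison (a simplified, illustrative
helper, not the exact termination-table code):
```
/* Compare the pkt_reformat saved under flow_act; treat "only one of them
 * set" as a mismatch instead of silently succeeding.
 */
static bool termtbl_pkt_reformat_match(const struct mlx5_flow_act *a,
				       const struct mlx5_flow_act *b)
{
	if (!a->pkt_reformat && !b->pkt_reformat)
		return true;			/* both absent: match      */
	if (!a->pkt_reformat || !b->pkt_reformat)
		return false;			/* only one set: different */
	return a->pkt_reformat == b->pkt_reformat;
}
```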
Fixes: 249ccc3c95bd ("net/mlx5e: Add support for offloading traffic from uplink to uplink")
Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Chris Mi <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
In the commit below, we added support for PPS policing without removing the
check which blocks offload of such cases.
Fix it by removing this check.
Fixes: a8d52b024d6d ("net/mlx5e: TC, Support offloading police action")
Signed-off-by: Jianbo Liu <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The tc acts array should not depend on the kernel-internal flow action id
enum. Fix the array initialization.
Fixes: fad547906980 ("net/mlx5e: Add tc action infrastructure")
Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
DMA sync functions should use the same direction that was used by DMA
mapping. Use DMA_BIDIRECTIONAL for XDP_TX from regular RQ, which reuses
the same mapping that was used for RX, and DMA_TO_DEVICE for XDP_TX from
XSK RQ and XDP_REDIRECT, which establish a new mapping in this
direction. On the RX side, use the same direction that was used when
setting up the mapping (DMA_BIDIRECTIONAL for XDP, DMA_FROM_DEVICE
otherwise).
Also don't skip sync for device when establishing a DMA_FROM_DEVICE
mapping for RX, as some architectures (ARM) may require invalidating
caches before the device can use the mapping. It doesn't break the
bugfix made in
commit 0b7cfa4082fb ("net/mlx5e: Fix page DMA map/unmap attributes"),
since the bug happened on unmap.
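A minimal sketch of matching the sync direction to the mapping direction,
using the generic DMA API (not the exact mlx5e code paths):
```
/* XDP-enabled RQ: pages are mapped bidirectionally so XDP_TX can reuse
 * the mapping; syncs must then also use DMA_BIDIRECTIONAL.
 */
dma_addr_t xdp_addr = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
dma_sync_single_for_device(dev, xdp_addr + off, len, DMA_BIDIRECTIONAL); /* XDP_TX */

/* XDP_REDIRECT / XSK XDP_TX: a fresh mapping in the TX direction. */
dma_addr_t tx_addr = dma_map_single(dev, data, len, DMA_TO_DEVICE);

/* Plain RX (no XDP): mapped and synced DMA_FROM_DEVICE; the device sync is
 * still needed, as some architectures (e.g. ARM) must invalidate caches
 * before the device writes into the buffer.
 */
dma_addr_t rx_addr = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
dma_sync_single_for_device(dev, rx_addr, frag_sz, DMA_FROM_DEVICE);
```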
Fixes: 0b7cfa4082fb ("net/mlx5e: Fix page DMA map/unmap attributes")
Fixes: b5503b994ed5 ("net/mlx5e: XDP TX forwarding support")
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Gal Pressman <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The commit cited below started using the firmware capability for the
maximum TX WQE size. This commit adds an important check to verify that
the driver doesn't attempt to exceed this capability, and also restores
another check mistakenly removed in the cited commit (a WQE must not
exceed the page size).
Fixes: c27bd1718c06 ("net/mlx5e: Read max WQEBBs on the SQ from firmware")
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
In case PCI reads fail after unload, there is no use in trying to
load the device.
Fixes: 5ec697446f46 ("net/mlx5: Add support for devlink reload action fw activate")
Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
No need to roll back to the other mode because it will probably fail again.
Just set legacy mode and clear the fdb table created flag, so that the fdb
table will not be cleared again.
Fixes: f019679ea5f2 ("net/mlx5: E-switch, Remove dependency between sriov and eswitch mode")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
On a single CPU system, the kernel thread executing mlx5_cmd_flush() never
releases the CPU but calls down_trylock(&cmd->sem) in a busy loop. This leads
to a deadlock, as the kernel thread which executes mlx5_cmd_invoke() never
gets scheduled. Fix this by adding a cond_resched() call to the loop,
allowing the command completion kernel thread to execute.
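A minimal sketch of the change, simplified from the flush loop described
above:
```
/* Re-acquire every command semaphore slot, yielding the CPU between
 * attempts so the completion thread can run on single-CPU systems.
 */
	int i;

	for (i = 0; i < cmd->max_reg_cmds; i++)
		while (down_trylock(&cmd->sem))
			cond_resched();		/* the added call */
```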
Fixes: 8e715cd613a1 ("net/mlx5: Set command entry semaphore up once got index free")
Signed-off-by: Alexander Schmidt <[email protected]>
Signed-off-by: Roy Novich <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Mlx5 LAG is initialized asynchronously on a workqueue, which means that for a
brief moment after setting mlx5 UL representors as lower devices of a bond
netdevice the LAG itself is not fully initialized in the driver. When adding
such a bond device to a bridge, the mlx5 bridge code will not consider it
offload-capable, will skip creating the necessary bookkeeping and will fail
any further bridge offload-related commands with it (setting VLANs,
offloading FDBs, etc.). In order to make the error explicit during the bridge
initialization stage, implement code that detects such a condition during the
NETDEV_PRECHANGEUPPER event and returns an error.
Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Ensure that resources allocated by iio_channel_get_all_cb()
are released on driver unbind.
Signed-off-by: Olivier Moysan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
Before version 15.0.0 llvm's integrated assembler may silently
generate corrupted code on s390. See e.g. commit e9953b729b78
("s390/boot: workaround llvm IAS bug") for further details.
While workarounds have been applied for all known existing locations, there
is nothing that prevents new code with problematic patterns from being added.
Therefore raise the minimum clang version to 15.0.0. Note that llvm
commit e547b04d5b2c ("[SystemZ] Bugfix for symbolic displacements."),
which is included in 15.0.0, fixes the broken code generation.
Signed-off-by: Heiko Carstens <[email protected]>
Acked-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexander Gordeev <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
A PCI allocation fix and a PV clock fix.
|
|
The AMD PerfMonV2 specification allows for a maximum of 16 GP counters,
but currently only 6 pairs of MSRs are accepted by KVM.
While AMD64_NUM_COUNTERS_CORE is already equal to 6, increasing it without
adjusting msrs_to_save_all[] could result in out-of-bounds accesses.
Therefore introduce a macro (named KVM_AMD_PMC_MAX_GENERIC) to
refer to the number of counters supported by KVM.
Signed-off-by: Like Xu <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The Intel Architectural IA32_PMCx MSRs addresses range allows for a
maximum of 8 GP counters, and KVM cannot address any more. Introduce a
local macro (named KVM_INTEL_PMC_MAX_GENERIC) and use it consistently to
refer to the number of counters supported by KVM, thus avoiding possible
out-of-bound accesses.
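A minimal sketch of the macro-based bound (the value follows from the
8-counter limit stated above; the struct is illustrative):
```
/* The IA32_PMCx MSR range only addresses 8 GP counters, so cap everything
 * KVM sizes or iterates at this bound instead of a magic number.
 */
#define KVM_INTEL_PMC_MAX_GENERIC	8

struct pmu_sketch {
	u64 gp_counter_values[KVM_INTEL_PMC_MAX_GENERIC];	/* illustrative */
};
```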
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The SDM lists an architectural MSR IA32_CORE_CAPABILITIES (0xCF)
that limits the theoretical maximum value of the Intel GP PMC MSRs
allocated at 0xC1 to 14; likewise the Intel April 2022 SDM adds
IA32_OVERCLOCKING_STATUS at 0x195 which limits the number of event
selection MSRs to 15 (0x186-0x194).
Limiting the maximum number of counters to 14 or 18 based on the currently
allocated MSRs is clearly fragile, and it seems likely that Intel will
even place PMCs 8-15 at a completely different range of MSR indices.
So stop at the maximum number of GP PMCs supported today on Intel
processors.
There are some machines, like the Intel P4 with a non-architectural PMU, that
may indeed have 18 counters, but those counters are in a completely different
MSR address range and are not supported by KVM.
Cc: Vitaly Kuznetsov <[email protected]>
Cc: [email protected]
Fixes: cf05a67b68b8 ("KVM: x86: omit "impossible" pmu MSRs from MSR list")
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Explicitly print the VMSA dump at the KERN_DEBUG log level. KERN_CONT uses
KERNEL_DEFAULT if the previous log line has a newline, i.e. if there's
nothing to continue, and as a result the VMSA gets dumped when it shouldn't
be.
The KERN_CONT documentation says it defaults back to KERNEL_DEFAULT if the
previous log line has a newline. So switch from KERN_CONT to
print_hex_dump_debug().
Jarkko pointed this out in reference to the original patch. See:
https://lore.kernel.org/all/[email protected]/
print_hex_dump(KERN_DEBUG, ...) was pointed out there, but
print_hex_dump_debug() should be similar.
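For reference, a minimal sketch of the replacement call (the buffer variable
is illustrative; print_hex_dump_debug() emits at debug level regardless of
the previous line):
```
/* Dump the VMSA explicitly at debug level instead of relying on KERN_CONT,
 * which falls back to the default loglevel after a newline.
 */
print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1,
		     save, sizeof(*save), false);
```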
Fixes: 6fac42f127b8 ("KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog")
Signed-off-by: Peter Gonda <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Cc: Jarkko Sakkinen <[email protected]>
Cc: Harald Hoyer <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: [email protected]
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Update EXIT_REASONS from source, including VMX_EXIT_REASONS,
SVM_EXIT_REASONS, AARCH64_EXIT_REASONS, USERSPACE_EXIT_REASONS.
Signed-off-by: Rong Tao <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The first field in /proc/mounts can be influenced by unprivileged users
through the widespread `fusermount` setuid-root program. Example:
```
user$ mkdir ~/mydebugfs
user$ export _FUSE_COMMFD=0
user$ fusermount ~/mydebugfs -ononempty,fsname=debugfs
user$ grep debugfs /proc/mounts
debugfs /home/user/mydebugfs fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=100 0 0
```
If there is no debugfs already mounted in the system then this can be
used by unprivileged users to trick kvm_stat into using a user
controlled file system location for obtaining KVM statistics.
Even though the root user is not allowed to access non-root FUSE mounts
for security reasons, the unprivileged user can unmount the FUSE mount
before kvm_stat uses the mounted path. If it wins the race, kvm_stat
will read from the location where the FUSE mount resided.
Note that the files in debugfs are only opened for reading, so the
attacker can cause very large data to be read in by kvm_stat, or fake
data to be processed, but there should be no viable way to turn this
into a privilege escalation.
The fix is simply to use the file system type field instead. Whitespace
in the mount path is escaped in /proc/mounts thus no further safety
measures in the parsing should be necessary to make this correct.
Message-Id: <[email protected]>
Signed-off-by: Matthias Gerstner <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.
With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.
To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
Cc: [email protected]
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Allow access to the percpu area via the GS segment base, which is
needed in order to access the saved host spec_ctrl value. In linux-next
FILL_RETURN_BUFFER also needs to access percpu data.
For simplicity, the physical address of the save area is added to struct
svm_cpu_data.
Cc: [email protected]
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reported-by: Nathan Chancellor <[email protected]>
Analyzed-by: Andrew Cooper <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
It is error-prone that code after vmexit cannot access percpu data
because GSBASE has not been restored yet. It forces MSR_IA32_SPEC_CTRL
save/restore to happen very late, after the predictor untraining
sequence, and it gets in the way of return stack depth tracking
(a retbleed mitigation that is in linux-next as of 2022-11-09).
As a first step towards fixing that, move the VMCB VMSAVE/VMLOAD to
assembly, essentially undoing commit fb0c4a4fee5a ("KVM: SVM: move
VMLOAD/VMSAVE to C code", 2021-03-15). The reason for that commit was
that it made it simpler to use a different VMCB for VMLOAD/VMSAVE versus
VMRUN; but that is not a big hassle anymore thanks to the kvm-asm-offsets
machinery and other related cleanups.
The idea on how to number the exception tables is stolen from
a prototype patch by Peter Zijlstra.
Cc: [email protected]
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Link: <https://lore.kernel.org/all/[email protected]/>
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The svm_data percpu variable is a pointer, but it is allocated via
svm_hardware_setup() when KVM is loaded. Unlike hardware_enable()
this means that it is never NULL for the whole lifetime of KVM, and
static allocation does not waste any memory compared to the status quo.
It is also more efficient and more easily handled from assembly code,
so do it and don't look back.
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The "cpu" field of struct svm_cpu_data has been write-only since commit
4b656b120249 ("KVM: SVM: force new asid on vcpu migration", 2009-08-05).
Remove it.
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The pointer to svm_cpu_data in struct vcpu_svm looks interesting from
the point of view of accessing it after vmexit, when the GSBASE is still
containing the guest value. However, despite existing since the very
first commit of drivers/kvm/svm.c (commit 6aa8b732ca01, "[PATCH] kvm:
userspace interface", 2006-12-10), it was never set to anything.
Ignore the opportunity to fix a 16 year old "bug" and delete it; doing
things the "harder" way makes it possible to remove more old cruft.
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Continue moving accesses to struct vcpu_svm to vmenter.S. Reducing the
number of arguments limits the chance of mistakes due to different
registers used for argument passing in 32- and 64-bit ABIs; pushing the
VMCB argument and almost immediately popping it into a different
register looks pretty weird.
32-bit ABI is not a concern for __svm_sev_es_vcpu_run() which is 64-bit
only; however, it will soon need @svm to save/restore SPEC_CTRL so stay
consistent with __svm_vcpu_run() and let them share the same prototype.
No functional change intended.
Cc: [email protected]
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
32-bit ABI uses RAX/RCX/RDX as its argument registers, so they are in
the way of instructions that hardcode their operands such as RDMSR/WRMSR
or VMLOAD/VMRUN/VMSAVE.
In preparation for moving vmload/vmsave to __svm_vcpu_run(), keep
the pointer to the struct vcpu_svm in %rdi. In particular, it is now
possible to load svm->vmcb01.pa in %rax without clobbering the struct
vcpu_svm pointer.
No functional change intended.
Cc: [email protected]
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|