aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-10Merge tag 'hardening-v6.9-rc4' of ↵Linus Torvalds3-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - gcc-plugins/stackleak: Avoid .head.text section (Ard Biesheuvel) - ubsan: fix unused variable warning in test module (Arnd Bergmann) - Improve entropy diffusion in randomize_kstack * tag 'hardening-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: randomize_kstack: Improve entropy diffusion ubsan: fix unused variable warning in test module gcc-plugins/stackleak: Avoid .head.text section
2024-04-10Merge tag 'turbostat-2024.04.10' of ↵Linus Torvalds4-369/+1727
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - Use of the CPU MSR driver is now optional - Perf is now preferred for many counters - Non-root users can now execute turbostat, though with limited functionality - Add counters for some new GFX hardware - Minor fixes * tag 'turbostat-2024.04.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (26 commits) tools/power turbostat: v2024.04.10 tools/power/turbostat: Add support for Xe sysfs knobs tools/power/turbostat: Add support for new i915 sysfs knobs tools/power/turbostat: Introduce BIC_SAM_mc6/BIC_SAMMHz/BIC_SAMACTMHz tools/power/turbostat: Fix uncore frequency file string tools/power/turbostat: Unify graphics sysfs snapshots tools/power/turbostat: Cache graphics sysfs path tools/power/turbostat: Enable MSR_CORE_C1_RES support for ICX tools/power turbostat: Add selftests tools/power turbostat: read RAPL counters via perf tools/power turbostat: Add proper re-initialization for perf file descriptors tools/power turbostat: Clear added counters when in no-msr mode tools/power turbostat: add early exits for permission checks tools/power turbostat: detect and disable unavailable BICs at runtime tools/power turbostat: Add reading aperf and mperf via perf API tools/power turbostat: Add --no-perf option tools/power turbostat: Add --no-msr option tools/power turbostat: enhance -D (debug counter dump) output tools/power turbostat: Fix warning upon failed /dev/cpu_dma_latency read tools/power turbostat: Read base_hz and bclk from CPUID.16H if available ...
2024-04-10Merge tag 'platform-drivers-x86-v6.9-2' of ↵Linus Torvalds5-10/+25
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes: - intel/hid: Solve spurious hibernation aborts (power button release) - toshiba_acpi: Ignore 2 keys to avoid log noise during suspend/resume - intel-vbtn: Fix probe by restoring VBDL and VGBS evalutation order - lg-laptop: Fix W=1 %s null argument warning New HW Support: - acer-wmi: PH18-71 mode button and fan speed sensor - intel/hid: Lunar Lake and Arrow Lake HID IDs" * tag 'platform-drivers-x86-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: lg-laptop: fix %s null argument warning platform/x86: intel-vbtn: Update tablet mode switch at end of probe platform/x86: intel-vbtn: Use acpi_has_method to check for switch platform/x86: toshiba_acpi: Silence logging for some events platform/x86/intel/hid: Add Lunar Lake and Arrow Lake support platform/x86/intel/hid: Don't wake on 5-button releases platform/x86: acer-wmi: Add support for Acer PH18-71
2024-04-10selftests: timers: Fix valid-adjtimex signed left-shift undefined behaviorJohn Stultz1-37/+36
The struct adjtimex freq field takes a signed value who's units are in shifted (<<16) parts-per-million. Unfortunately for negative adjustments, the straightforward use of: freq = ppm << 16 trips undefined behavior warnings with clang: valid-adjtimex.c:66:6: warning: shifting a negative signed value is undefined [-Wshift-negative-value] -499<<16, ~~~~^ valid-adjtimex.c:67:6: warning: shifting a negative signed value is undefined [-Wshift-negative-value] -450<<16, ~~~~^ .. Fix it by using a multiply by (1 << 16) instead of shifting negative values in the valid-adjtimex test case. Align the values for better readability. Reported-by: Lee Jones <[email protected]> Reported-by: Muhammad Usama Anjum <[email protected]> Signed-off-by: John Stultz <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/lkml/[email protected]/
2024-04-10bug: Fix no-return-statement warning with !CONFIG_BUGAdrian Hunter1-1/+4
BUG() does not return, and arch implementations of BUG() use unreachable() or other non-returning code. However with !CONFIG_BUG, the default implementation is often used instead, and that does not do that. x86 always uses its own implementation, but powerpc with !CONFIG_BUG gives a build error: kernel/time/timekeeping.c: In function ‘timekeeping_debug_get_ns’: kernel/time/timekeeping.c:286:1: error: no return statement in function returning non-void [-Werror=return-type] Add unreachable() to default !CONFIG_BUG BUG() implementation. Fixes: e8e9d21a5df6 ("timekeeping: Refactor timekeeping helpers") Reported-by: Naresh Kamboju <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Closes: https://lore.kernel.org/all/CA+G9fYvjdZCW=7ZGxS6A_3bysjQ56YF7S-+PNLQ_8a4DKh1Bhg@mail.gmail.com/
2024-04-10Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bitArchie Pusaka1-2/+1
The bit is set and tested inside mgmt_device_connected(), therefore we must not set it just outside the function. Fixes: eeda1bf97bb5 ("Bluetooth: hci_event: Fix not indicating new connection for BIG Sync") Signed-off-by: Archie Pusaka <[email protected]> Reviewed-by: Manish Mandlik <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: hci_sock: Fix not validating setsockopt user inputLuiz Augusto von Dentz1-13/+8
Check user input length before copying data. Fixes: 09572fca7223 ("Bluetooth: hci_sock: Add support for BT_{SND,RCV}BUF") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: ISO: Fix not validating setsockopt user inputLuiz Augusto von Dentz1-24/+12
Check user input length before copying data. Fixes: ccf74f2390d6 ("Bluetooth: Add BTPROTO_ISO socket type") Fixes: 0731c5ab4d51 ("Bluetooth: ISO: Add support for BT_PKT_STATUS") Fixes: f764a6c2c1e4 ("Bluetooth: ISO: Add broadcast support") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: L2CAP: Fix not validating setsockopt user inputLuiz Augusto von Dentz1-32/+20
Check user input length before copying data. Fixes: 33575df7be67 ("Bluetooth: move l2cap_sock_setsockopt() to l2cap_sock.c") Fixes: 3ee7b7cd8390 ("Bluetooth: Add BT_MODE socket option") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: RFCOMM: Fix not validating setsockopt user inputLuiz Augusto von Dentz1-9/+5
syzbot reported rfcomm_sock_setsockopt_old() is copying data without checking user input length. BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline] BUG: KASAN: slab-out-of-bounds in rfcomm_sock_setsockopt_old net/bluetooth/rfcomm/sock.c:632 [inline] BUG: KASAN: slab-out-of-bounds in rfcomm_sock_setsockopt+0x893/0xa70 net/bluetooth/rfcomm/sock.c:673 Read of size 4 at addr ffff8880209a8bc3 by task syz-executor632/5064 Fixes: 9f2c8a03fbb3 ("Bluetooth: Replace RFCOMM link mode with security level") Fixes: bb23c0ab8246 ("Bluetooth: Add support for deferring RFCOMM connection setup") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: SCO: Fix not validating setsockopt user inputLuiz Augusto von Dentz2-13/+19
syzbot reported sco_sock_setsockopt() is copying data without checking user input length. BUG: KASAN: slab-out-of-bounds in copy_from_sockptr_offset include/linux/sockptr.h:49 [inline] BUG: KASAN: slab-out-of-bounds in copy_from_sockptr include/linux/sockptr.h:55 [inline] BUG: KASAN: slab-out-of-bounds in sco_sock_setsockopt+0xc0b/0xf90 net/bluetooth/sco.c:893 Read of size 4 at addr ffff88805f7b15a3 by task syz-executor.5/12578 Fixes: ad10b1a48754 ("Bluetooth: Add Bluetooth socket voice option") Fixes: b96e9c671b05 ("Bluetooth: Add BT_DEFER_SETUP option to sco socket") Fixes: 00398e1d5183 ("Bluetooth: Add support for BT_PKT_STATUS CMSG data for SCO connections") Fixes: f6873401a608 ("Bluetooth: Allow setting of codec for HFP offload use case") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: Fix memory leak in hci_req_sync_complete()Dmitry Antipov1-1/+3
In 'hci_req_sync_complete()', always free the previous sync request state before assigning reference to a new one. Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=39ec16ff6cc18b1d066d Cc: [email protected] Fixes: f60cb30579d3 ("Bluetooth: Convert hci_req_sync family of function to new request API") Signed-off-by: Dmitry Antipov <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: hci_sync: Fix using the same interval and window for Coded PHYLuiz Augusto von Dentz1-3/+3
Coded PHY recommended intervals are 3 time bigger than the 1M PHY so this aligns with that by multiplying by 3 the values given to 1M PHY since the code already used recommended values for that. Fixes: 288c90224eec ("Bluetooth: Enable all supported LE PHY by default") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10Bluetooth: ISO: Don't reject BT_ISO_QOS if parameters are unsetLuiz Augusto von Dentz1-2/+8
Consider certain values (0x00) as unset and load proper default if an application has not set them properly. Fixes: 0fe8c8d07134 ("Bluetooth: Split bt_iso_qos into dedicated structures") Signed-off-by: Luiz Augusto von Dentz <[email protected]>
2024-04-10arm64: tlb: Fix TLBI RANGE operandGavin Shan1-9/+11
KVM/arm64 relies on TLBI RANGE feature to flush TLBs when the dirty pages are collected by VMM and the page table entries become write protected during live migration. Unfortunately, the operand passed to the TLBI RANGE instruction isn't correctly sorted out due to the commit 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()"). It leads to crash on the destination VM after live migration because TLBs aren't flushed completely and some of the dirty pages are missed. For example, I have a VM where 8GB memory is assigned, starting from 0x40000000 (1GB). Note that the host has 4KB as the base page size. In the middile of migration, kvm_tlb_flush_vmid_range() is executed to flush TLBs. It passes MAX_TLBI_RANGE_PAGES as the argument to __kvm_tlb_flush_vmid_range() and __flush_s2_tlb_range_op(). SCALE#3 and NUM#31, corresponding to MAX_TLBI_RANGE_PAGES, isn't supported by __TLBI_RANGE_NUM(). In this specific case, -1 has been returned from __TLBI_RANGE_NUM() for SCALE#3/2/1/0 and rejected by the loop in the __flush_tlb_range_op() until the variable @scale underflows and becomes -9, 0xffff708000040000 is set as the operand. The operand is wrong since it's sorted out by __TLBI_VADDR_RANGE() according to invalid @scale and @num. Fix it by extending __TLBI_RANGE_NUM() to support the combination of SCALE#3 and NUM#31. With the changes, [-1 31] instead of [-1 30] can be returned from the macro, meaning the TLBs for 0x200000 pages in the above example can be flushed in one shoot with SCALE#3 and NUM#31. The macro TLBI_RANGE_MASK is dropped since no one uses it any more. The comments are also adjusted accordingly. Fixes: 117940aa6e5f ("KVM: arm64: Define kvm_tlb_flush_vmid_range()") Cc: [email protected] # v6.6+ Reported-by: Yihuang Yu <[email protected]> Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Reviewed-by: Shaoqin Huang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2024-04-10kprobes: Fix possible use-after-free issue on kprobe registrationZheng Yejian1-6/+12
When unloading a module, its state is changing MODULE_STATE_LIVE -> MODULE_STATE_GOING -> MODULE_STATE_UNFORMED. Each change will take a time. `is_module_text_address()` and `__module_text_address()` works with MODULE_STATE_LIVE and MODULE_STATE_GOING. If we use `is_module_text_address()` and `__module_text_address()` separately, there is a chance that the first one is succeeded but the next one is failed because module->state becomes MODULE_STATE_UNFORMED between those operations. In `check_kprobe_address_safe()`, if the second `__module_text_address()` is failed, that is ignored because it expected a kernel_text address. But it may have failed simply because module->state has been changed to MODULE_STATE_UNFORMED. In this case, arm_kprobe() will try to modify non-exist module text address (use-after-free). To fix this problem, we should not use separated `is_module_text_address()` and `__module_text_address()`, but use only `__module_text_address()` once and do `try_module_get(module)` which is only available with MODULE_STATE_LIVE. Link: https://lore.kernel.org/all/[email protected]/ Fixes: 28f6c37a2910 ("kprobes: Forbid probing on trampoline and BPF code areas") Cc: [email protected] Signed-off-by: Zheng Yejian <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-04-10x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=nSean Christopherson1-1/+2
Initialize cpu_mitigations to CPU_MITIGATIONS_OFF if the kernel is built with CONFIG_SPECULATION_MITIGATIONS=n, as the help text quite clearly states that disabling SPECULATION_MITIGATIONS is supposed to turn off all mitigations by default. │ If you say N, all mitigations will be disabled. You really │ should know what you are doing to say so. As is, the kernel still defaults to CPU_MITIGATIONS_AUTO, which results in some mitigations being enabled in spite of SPECULATION_MITIGATIONS=n. Fixes: f43b9876e857 ("x86/retbleed: Add fine grained Kconfig knobs") Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Daniel Sneddon <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-10x86/topology: Don't update cpu_possible_map in topo_set_cpuids()Thomas Gleixner1-2/+5
topo_set_cpuids() updates cpu_present_map and cpu_possible map. It is invoked during enumeration and "physical hotplug" operations. In the latter case this results in a kernel crash because cpu_possible_map is marked read only after init completes. There is no reason to update cpu_possible_map in that function. During enumeration cpu_possible_map is not relevant and gets fully initialized after enumeration completed. On "physical hotplug" the bit is already set because the kernel allows only CPUs to be plugged which have been enumerated and associated to a CPU number during early boot. Remove the bogus update of cpu_possible_map. Fixes: 0e53e7b656cf ("x86/cpu/topology: Sanitize the APIC admission logic") Reported-by: Jonathan Cameron <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/87ttkc6kwx.ffs@tglx
2024-04-10LoongArch: Include linux/sizes.h in addrspace.h to prevent build errorsRandy Dunlap1-0/+1
LoongArch's include/asm/addrspace.h uses SZ_32M and SZ_16K, so add <linux/sizes.h> to provide those macros to prevent build errors: In file included from ../arch/loongarch/include/asm/io.h:11, from ../include/linux/io.h:13, from ../include/linux/io-64-nonatomic-lo-hi.h:5, from ../drivers/cxl/pci.c:4: ../include/asm-generic/io.h: In function 'ioport_map': ../arch/loongarch/include/asm/addrspace.h:124:25: error: 'SZ_32M' undeclared (first use in this function); did you mean 'PS_32M'? 124 | #define PCI_IOSIZE SZ_32M Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2024-04-10LoongArch: Update dts for Loongson-2K2000 to support GMAC/GNETHuacai Chen2-3/+42
Current dts file for Loongson-2K2000's GMAC/GNET is incomplete, both irq and phy descriptions are missing. Add them to make GMAC/GNET work. Signed-off-by: Huacai Chen <[email protected]>
2024-04-10LoongArch: Update dts for Loongson-2K2000 to support PCI-MSIHuacai Chen1-0/+3
Current dts file for Loongson-2K2000 misses the interrupt-controller & interrupt-cells descriptions in the msi-controller node, and misses the msi-parent link in the pci root node. Add them to support PCI-MSI. Signed-off-by: Huacai Chen <[email protected]>
2024-04-10LoongArch: Update dts for Loongson-2K2000 to support ISA/LPCHuacai Chen1-1/+8
Some Loongson-2K2000 platforms have ISA/LPC devices such as Super-IO, define an ISA node in the dts file to avoid access error. Also adjust the PCI io resource range to avoid confliction. Signed-off-by: Huacai Chen <[email protected]>
2024-04-10LoongArch: Update dts for Loongson-2K1000 to support ISA/LPCHuacai Chen1-0/+7
Some Loongson-2K1000 platforms have ISA/LPC devices such as Super-IO, define an ISA node in the dts file to avoid access error. Signed-off-by: Huacai Chen <[email protected]>
2024-04-10LoongArch: Make virt_addr_valid()/__virt_addr_valid() work with KFENCEHuacai Chen1-0/+4
When enabling both CONFIG_KFENCE and CONFIG_DEBUG_SG, I get the following backtraces when running LongArch kernels. [ 2.496257] kernel BUG at include/linux/scatterlist.h:187! ... [ 2.501925] Call Trace: [ 2.501950] [<9000000004ad59c4>] sg_init_one+0xac/0xc0 [ 2.502204] [<9000000004a438f8>] do_test_kpp+0x278/0x6e4 [ 2.502353] [<9000000004a43dd4>] alg_test_kpp+0x70/0xf4 [ 2.502494] [<9000000004a41b48>] alg_test+0x128/0x690 [ 2.502631] [<9000000004a3d898>] cryptomgr_test+0x20/0x40 [ 2.502775] [<90000000041b4508>] kthread+0x138/0x158 [ 2.502912] [<9000000004161c48>] ret_from_kernel_thread+0xc/0xa4 The backtrace is always similar but not exactly the same. It is always triggered from cryptomgr_test, but not always from the same test. Analysis shows that with CONFIG_KFENCE active, the address returned from kmalloc() and friends is not always below vm_map_base. It is allocated by kfence_alloc() which at least sometimes seems to get its memory from an address space above vm_map_base. This causes __virt_addr_valid() to return false for the affected objects. Let __virt_addr_valid() return 1 for kfence pool addresses, this make virt_addr_valid()/__virt_addr_valid() work with KFENCE. Reported-by: Guenter Roeck <[email protected]> Suggested-by: Guenter Roeck <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2024-04-10LoongArch: Make {virt, phys, page, pfn} translation work with KFENCEHuacai Chen4-8/+51
KFENCE changes virt_to_page() to be able to translate tlb mapped virtual addresses, but forget to change virt_to_phys()/phys_to_virt() and other translation functions as well. This patch fix it, otherwise some drivers (such as nvme and virtio-blk) cannot work with KFENCE. All {virt, phys, page, pfn} translation functions are updated: 1, virt_to_pfn()/pfn_to_virt(); 2, virt_to_page()/page_to_virt(); 3, virt_to_phys()/phys_to_virt(). DMW/TLB mapped addresses are distinguished by comparing the vaddress with vm_map_base in virt_to_xyz(), and we define WANT_PAGE_VIRTUAL in the KFENCE case for the reverse translations, xyz_to_virt(). Signed-off-by: Huacai Chen <[email protected]>
2024-04-10mm: Move lowmem_page_address() a little laterHuacai Chen1-5/+5
LoongArch will override page_to_virt() which use page_address() in the KFENCE case (by defining WANT_PAGE_VIRTUAL/HASHED_PAGE_VIRTUAL). So move lowmem_page_address() a little later to avoid such build errors: error: implicit declaration of function 'page_address'. Acked-by: Andrew Morton <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2024-04-10tools/power turbostat: v2024.04.10Len Brown3-15/+28
Much of turbostat can now run with perf, rather than using the MSR driver Some of turbostat can now run as a regular non-root user. Add some new output columns for some new GFX hardware. [This patch updates the version, but otherwise changes no function; it touches up some checkpatch issues from previous patches] Signed-off-by: Len Brown <[email protected]>
2024-04-10tools/power/turbostat: Add support for Xe sysfs knobsZhang Rui1-0/+51
Xe graphics driver uses different graphics sysfs knobs including /sys/class/drm/card0/device/tile0/gt0/gtidle/idle_residency_ms /sys/class/drm/card0/device/tile0/gt0/freq0/cur_freq /sys/class/drm/card0/device/tile0/gt0/freq0/act_freq /sys/class/drm/card0/device/tile0/gt1/gtidle/idle_residency_ms /sys/class/drm/card0/device/tile0/gt1/freq0/cur_freq /sys/class/drm/card0/device/tile0/gt1/freq0/act_freq Plus that, /sys/class/drm/card0/device/tile0/gt<n>/gtidle/name returns either gt<n>-rc or gt<n>-mc. rc is for GFX and mc is SA Media. Enhance turbostat to prefer the Xe sysfs knobs when they are available. Export gt<n>-rc via BIC_GFX_rc6/BIC_GFXMHz/BIC_GFXACTMHz. Export gt<n>-mc via BIC_SMA_mc6/BIC_SMAMHz/BIC_SMAACTMHz. Signed-off-by: Zhang Rui <[email protected]>
2024-04-10tools/power/turbostat: Add support for new i915 sysfs knobsZhang Rui1-0/+24
On Meteorlake platform, i915 driver supports the traditional graphics sysfs knobs including /sys/class/drm/card0/power/rc6_residency_ms /sys/class/drm/card0/gt_cur_freq_mhz /sys/class/drm/card0/gt_act_freq_mhz At the same time, it also supports /sys/class/drm/card0/gt/gt0/rc6_residency_ms /sys/class/drm/card0/gt/gt0/rps_cur_freq_mhz /sys/class/drm/card0/gt/gt0/rps_act_freq_mhz /sys/class/drm/card0/gt/gt1/rc6_residency_ms /sys/class/drm/card0/gt/gt1/rps_cur_freq_mhz /sys/class/drm/card0/gt/gt1/rps_act_freq_mhz gt0 is for GFX and gt1 is for SA Media. Enhance turbostat to prefer the i915 new sysfs knobs. Export gt0 via BIC_GFX_rc6/BIC_GFXMHz/BIC_GFXACTMHz. Export gt1 via BIC_SMA_mc6/BIC_SMAMHz/BIC_SMAACTMHz. Signed-off-by: Zhang Rui <[email protected]>
2024-04-10tools/power/turbostat: Introduce BIC_SAM_mc6/BIC_SAMMHz/BIC_SAMACTMHzZhang Rui2-7/+96
Graphics driver (i915/Xe) on mordern platforms splits GFX and SA Media information via different sysfs knobs. Existing BIC_GFX_rc6/BIC_GFXMHz/BIC_GFXACTMHz columns can be reused for GFX. Introduce BIC_SAM_mc6/BIC_SAMMHz/BIC_SAMACTMHz columns for SA Media. Signed-off-by: Zhang Rui <[email protected]>
2024-04-10r8169: fix LED-related deadlock on module removalHeiner Kallweit3-16/+33
Binding devm_led_classdev_register() to the netdev is problematic because on module removal we get a RTNL-related deadlock. Fix this by avoiding the device-managed LED functions. Note: We can safely call led_classdev_unregister() for a LED even if registering it failed, because led_classdev_unregister() detects this and is a no-op in this case. Fixes: 18764b883e15 ("r8169: add support for LED's on RTL8168/RTL8101") Cc: [email protected] Reported-by: Lukas Wunner <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-10timekeeping: Use READ/WRITE_ONCE() for tick_do_timer_cpuThomas Gleixner2-22/+31
tick_do_timer_cpu is used lockless to check which CPU needs to take care of the per tick timekeeping duty. This is done to avoid a thundering herd problem on jiffies_lock. The read and writes are not annotated so KCSAN complains about data races: BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event write to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 26: tick_nohz_idle_stop_tick+0x3b1/0x4a0 do_idle+0x1e3/0x250 read to 0xffffffff8a2bda30 of 4 bytes by task 0 on cpu 16: tick_nohz_next_event+0xe7/0x1e0 tick_nohz_get_sleep_length+0xa7/0xe0 menu_select+0x82/0xb90 cpuidle_select+0x44/0x60 do_idle+0x1c2/0x250 value changed: 0x0000001a -> 0xffffffff Annotate them with READ/WRITE_ONCE() to document the intentional data race. Reported-by: Mirsad Todorovac <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Tested-by: Sean Anderson <[email protected]> Link: https://lore.kernel.org/r/87cyqy7rt3.ffs@tglx
2024-04-10pds_core: Fix pdsc_check_pci_health function to use work threadBrett Creeley4-1/+18
When the driver notices fw_status == 0xff it tries to perform a PCI reset on itself via pci_reset_function() in the context of the driver's health thread. However, pdsc_reset_prepare calls pdsc_stop_health_thread(), which attempts to stop/flush the health thread. This results in a deadlock because the stop/flush will never complete since the driver called pci_reset_function() from the health thread context. Fix by changing the pdsc_check_pci_health_function() to queue a newly introduced pdsc_pci_reset_thread() on the pdsc's work queue. Unloading the driver in the fw_down/dead state uncovered another issue, which can be seen in the following trace: WARNING: CPU: 51 PID: 6914 at kernel/workqueue.c:1450 __queue_work+0x358/0x440 [...] RIP: 0010:__queue_work+0x358/0x440 [...] Call Trace: <TASK> ? __warn+0x85/0x140 ? __queue_work+0x358/0x440 ? report_bug+0xfc/0x1e0 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __queue_work+0x358/0x440 queue_work_on+0x28/0x30 pdsc_devcmd_locked+0x96/0xe0 [pds_core] pdsc_devcmd_reset+0x71/0xb0 [pds_core] pdsc_teardown+0x51/0xe0 [pds_core] pdsc_remove+0x106/0x200 [pds_core] pci_device_remove+0x37/0xc0 device_release_driver_internal+0xae/0x140 driver_detach+0x48/0x90 bus_remove_driver+0x6d/0xf0 pci_unregister_driver+0x2e/0xa0 pdsc_cleanup_module+0x10/0x780 [pds_core] __x64_sys_delete_module+0x142/0x2b0 ? syscall_trace_enter.isra.18+0x126/0x1a0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7fbd9d03a14b [...] Fix this by preventing the devcmd reset if the FW is not running. Fixes: d9407ff11809 ("pds_core: Prevent health thread from running during reset/remove") Reviewed-by: Shannon Nelson <[email protected]> Signed-off-by: Brett Creeley <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-10x86/bugs: Fix return type of spectre_bhi_state()Daniel Sneddon1-1/+1
The definition of spectre_bhi_state() incorrectly returns a const char * const. This causes the a compiler warning when building with W=1: warning: type qualifiers ignored on function return type [-Wignored-qualifiers] 2812 | static const char * const spectre_bhi_state(void) Remove the const qualifier from the pointer. Fixes: ec9404e40e8f ("x86/bhi: Add BHI mitigation knob") Reported-by: Sean Christopherson <[email protected]> Signed-off-by: Daniel Sneddon <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-10Merge branch 'linus' into x86/urgent, to pick up dependent commitsIngo Molnar33-87/+463
Prepare to fix aspects of the new BHI code. Signed-off-by: Ingo Molnar <[email protected]>
2024-04-10perf/x86: Fix out of range dataNamhyung Kim1-0/+1
On x86 each struct cpu_hw_events maintains a table for counter assignment but it missed to update one for the deleted event in x86_pmu_del(). This can make perf_clear_dirty_counters() reset used counter if it's called before event scheduling or enabling. Then it would return out of range data which doesn't make sense. The following code can reproduce the problem. $ cat repro.c #include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <linux/perf_event.h> #include <sys/ioctl.h> #include <sys/mman.h> #include <sys/syscall.h> struct perf_event_attr attr = { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES, .disabled = 1, }; void *worker(void *arg) { int cpu = (long)arg; int fd1 = syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0); int fd2 = syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0); void *p; do { ioctl(fd1, PERF_EVENT_IOC_ENABLE, 0); p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd1, 0); ioctl(fd2, PERF_EVENT_IOC_ENABLE, 0); ioctl(fd2, PERF_EVENT_IOC_DISABLE, 0); munmap(p, 4096); ioctl(fd1, PERF_EVENT_IOC_DISABLE, 0); } while (1); return NULL; } int main(void) { int i; int n = sysconf(_SC_NPROCESSORS_ONLN); pthread_t *th = calloc(n, sizeof(*th)); for (i = 0; i < n; i++) pthread_create(&th[i], NULL, worker, (void *)(long)i); for (i = 0; i < n; i++) pthread_join(th[i], NULL); free(th); return 0; } And you can see the out of range data using perf stat like this. Probably it'd be easier to see on a large machine. $ gcc -o repro repro.c -pthread $ ./repro & $ sudo perf stat -A -I 1000 2>&1 | awk '{ if (length($3) > 15) print }' 1.001028462 CPU6 196,719,295,683,763 cycles # 194290.996 GHz (71.54%) 1.001028462 CPU3 396,077,485,787,730 branch-misses # 15804359784.80% of all branches (71.07%) 1.001028462 CPU17 197,608,350,727,877 branch-misses # 14594186554.56% of all branches (71.22%) 2.020064073 CPU4 198,372,472,612,140 cycles # 194681.113 GHz (70.95%) 2.020064073 CPU6 199,419,277,896,696 cycles # 195720.007 GHz (70.57%) 2.020064073 CPU20 198,147,174,025,639 cycles # 194474.654 GHz (71.03%) 2.020064073 CPU20 198,421,240,580,145 stalled-cycles-frontend # 100.14% frontend cycles idle (70.93%) 3.037443155 CPU4 197,382,689,923,416 cycles # 194043.065 GHz (71.30%) 3.037443155 CPU20 196,324,797,879,414 cycles # 193003.773 GHz (71.69%) 3.037443155 CPU5 197,679,956,608,205 stalled-cycles-backend # 1315606428.66% backend cycles idle (71.19%) 3.037443155 CPU5 198,571,860,474,851 instructions # 13215422.58 insn per cycle It should move the contents in the cpuc->assign as well. Fixes: 5471eea5d3bf ("perf/x86: Reset the dirty counter to prevent the leak for an RDPMC task") Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
2024-04-10drm/amdgpu: differentiate external rev id for gfx 11.5.0Yifan Zhang1-1/+4
This patch to differentiate external rev id for gfx 11.5.0. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Tim Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-04-09drm/amd/display: Adjust dprefclk by down spread percentage.Zhongwei2-4/+54
[Why] OLED panels show no display for large vtotal timings. [How] Check if ss is enabled and read from lut for spread spectrum percentage. Adjust dprefclk as required. DP_DTO adjustment is for edp only. Cc: [email protected] Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Zhongwei <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/display: Set VSC SDP Colorimetry same way for MST and SSTHarry Wentland1-9/+3
The previous check for the is_vsc_sdp_colorimetry_supported flag for MST sink signals did nothing. Simplify the code and use the same check for MST and SST. Cc: [email protected] Reviewed-by: Agustin Gutierrez <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4Harry Wentland1-2/+5
In order for display colorimetry to work correctly on DP displays we need to send the VSC SDP packet. We should only do so for panels with DPCD revision greater or equal to 1.4 as older receivers might have problems with it. Cc: [email protected] Cc: Joshua Ashton <[email protected]> Cc: Xaver Hugl <[email protected]> Cc: Melissa Wen <[email protected]> Cc: Agustin Gutierrez <[email protected]> Reviewed-by: Agustin Gutierrez <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/display: fix disable otg wa logic in DCN316Fudongwang1-7/+12
[Why] Wrong logic cause screen corruption. [How] Port logic from DCN35/314. Cc: [email protected] Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Fudongwang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/display: Do not recursively call manual trigger programmingDillon Varone1-3/+0
[WHY&HOW] We should not be recursively calling the manual trigger programming function when FAMS is not in use. Cc: [email protected] Reviewed-by: Alvin Lee <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Dillon Varone <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/display: always reset ODM mode in context when adding first planeWenjing Liu1-0/+9
[why] In current implemenation ODM mode is only reset when the last plane is removed from dc state. For any dc validate we will always remove all current planes and add new planes. However when switching from no planes to 1 plane, ODM mode is not reset because no planes get removed. This has caused an issue where we kept ODM combine when it should have been remove when a plane is added. The change is to reset ODM mode when adding the first plane. Cc: [email protected] Reviewed-by: Alvin Lee <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Wenjing Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amdgpu: fix incorrect number of active RBs for gfx11Tim Huang1-1/+1
The RB bitmap should be global active RB bitmap & active RB bitmap based on active SA. Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-04-09drm/amd/display: Return max resolution supported by DWBAlex Hung1-4/+2
mode_config's max width x height is 4096x2160 and is higher than DWB's max resolution 3840x2160 which is returned instead. Cc: [email protected] Reviewed-by: Harry Wentland <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09amd/amdkfd: sync all devices to wait all processes being evictedZhigang Luo1-11/+6
If there are more than one device doing reset in parallel, the first device will call kfd_suspend_all_processes() to evict all processes on all devices, this call takes time to finish. other device will start reset and recover without waiting. if the process has not been evicted before doing recover, it will be restored, then caused page fault. Signed-off-by: Zhigang Luo <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amdgpu: clear set_q_mode_offs when VM changedZhenGuo Yin1-0/+1
[Why] set_q_mode_offs don't get cleared after GPU reset, nexting SET_Q_MODE packet to init shadow memory will be skiped, hence there has a page fault. [How] VM flush is needed after GPU reset, clear set_q_mode_offs when emitting VM flush. Fixes: 8bc75586ea01 ("drm/amdgpu: workaround to avoid SET_Q_MODE packets v2") Reviewed-by: Christian König <[email protected]> Signed-off-by: ZhenGuo Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-04-09drm/amdgpu: Fix VCN allocation in CPX partitionLijo Lazar1-4/+11
VCN need not be shared in CPX mode always for all GFX 9.4.3 SOC SKUs. In certain configs, VCN instance can be exclusively allocated to a partition even under CPX mode. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: James Zhu <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/pm: fix the high voltage issue after unloadKenneth Feng4-14/+48
fix the high voltage issue after unload on smu 13.0.10 Signed-off-by: Kenneth Feng <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-09drm/amd/display: Skip on writeback when it's not applicableAlex Hung1-1/+8
[WHY] dynamic memory safety error detector (KASAN) catches and generates error messages "BUG: KASAN: slab-out-of-bounds" as writeback connector does not support certain features which are not initialized. [HOW] Skip them when connector type is DRM_MODE_CONNECTOR_WRITEBACK. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3199 Reviewed-by: Harry Wentland <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Roman Li <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>