aboutsummaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2024-06-14Merge tag 'exynos-drm-fixes-for-v6.10-rc4' of ↵Dave Airlie3-4/+11
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Regression fix - Fix an regression issue by adding 640x480 fallback mode for Exynos HDMI driver. Bug fix - Fix a memory leak by ensuring the duplicated EDID is properly freed in the get_modes function. Code cleanup - Remove redundant driver owner initialization since platform_driver_register() sets it automatically. Signed-off-by: Dave Airlie <[email protected]> From: Inki Dae <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-13net: mvpp2: use slab_build_skb for oversized framesAryan Srivastava1-1/+4
Setting frag_size to 0 to indicate kmalloc has been deprecated, use slab_build_skb directly. Fixes: ce098da1497c ("skbuff: Introduce slab_build_skb()") Signed-off-by: Aryan Srivastava <[email protected]> Reviewed-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13Merge tag 'nvme-6.10-2024-06-13' of git://git.infradead.org/nvme into block-6.10Jens Axboe5-16/+10
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.10 - Discard double free on error conditions (Chunguang) - Target Fixes (Daniel) - Namespace detachment regression fix (Keith)" * tag 'nvme-6.10-2024-06-13' of git://git.infradead.org/nvme: nvme: fix namespace removal list nvmet: always initialize cqe.result nvmet-passthru: propagate status from id override functions nvme: avoid double free special payload
2024-06-13nvme: fix namespace removal listKeith Busch1-4/+5
This function wants to move a subset of a list from one element to the tail into another list. It also needs to use the srcu synchronize instead of the regular rcu version. Do this one element at a time because that's the only to do it. Fixes: be647e2c76b27f4 ("nvme: use srcu for iterating namespace list") Reported-by: Venkat Rao Bagalkote <[email protected]> Tested-by: Venkat Rao Bagalkote <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-06-13Merge tag 'net-6.10-rc4' of ↵Linus Torvalds18-63/+137
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Slim pickings this time, probably a combination of summer, DevConf.cz, and the end of first half of the year at corporations. Current release - regressions: - Revert "igc: fix a log entry using uninitialized netdev", it traded lack of netdev name in a printk() for a crash Previous releases - regressions: - Bluetooth: L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ - geneve: fix incorrectly setting lengths of inner headers in the skb, confusing the drivers and causing mangled packets - sched: initialize noop_qdisc owner to avoid false-positive recursion detection (recursing on CPU 0), which bubbles up to user space as a sendmsg() error, while noop_qdisc should silently drop - netdevsim: fix backwards compatibility in nsim_get_iflink() Previous releases - always broken: - netfilter: ipset: fix race between namespace cleanup and gc in the list:set type" * tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits) bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() af_unix: Read with MSG_PEEK loops if the first unread byte is OOB bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response gve: Clear napi->skb before dev_kfree_skb_any() ionic: fix use after netif_napi_del() Revert "igc: fix a log entry using uninitialized netdev" net: bridge: mst: fix suspicious rcu usage in br_mst_set_state net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state net/ipv6: Fix the RT cache flush via sysctl using a previous delay net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters gve: ignore nonrelevant GSO type bits when processing TSO headers net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP netfilter: Use flowlabel flow key when re-routing mangled packets netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type netfilter: nft_inner: validate mandatory meta and payload tcp: use signed arithmetic in tcp_rtx_probe0_timed_out() mailmap: map Geliang's new email address mptcp: pm: update add_addr counters after connect mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID mptcp: ensure snd_una is properly initialized on connect ...
2024-06-13ice: implement AQ download pkg retryWojciech Drewek1-2/+21
ice_aqc_opc_download_pkg (0x0C40) AQ sporadically returns error due to FW issue. Fix this by retrying five times before moving to Safe Mode. Sleep for 20 ms before retrying. This was tested with the 4.40 firmware. Fixes: c76488109616 ("ice: Implement Dynamic Device Personalization (DDP) download") Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Wojciech Drewek <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2024-06-13ice: fix 200G link speed message logPaul Greenwalt1-0/+3
Commit 24407a01e57c ("ice: Add 200G speed/phy type use") added support for 200G PHY speeds, but did not include 200G link speed message support. As a result the driver incorrectly reports Unknown for 200G link speed. Fix this by adding 200G support to ice_print_link_msg(). Fixes: 24407a01e57c ("ice: Add 200G speed/phy type use") Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Paul Greenwalt <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2024-06-13ice: avoid IRQ collision to fix init failure on ACPI S3 resumeEn-Wei Wu1-1/+6
A bug in https://bugzilla.kernel.org/show_bug.cgi?id=218906 describes that irdma would break and report hardware initialization failed after suspend/resume with Intel E810 NIC (tested on 6.9.0-rc5). The problem is caused due to the collision between the irq numbers requested in irdma and the irq numbers requested in other drivers after suspend/resume. The irq numbers used by irdma are derived from ice's ice_pf->msix_entries which stores mappings between MSI-X index and Linux interrupt number. It's supposed to be cleaned up when suspend and rebuilt in resume but it's not, causing irdma using the old irq numbers stored in the old ice_pf->msix_entries to request_irq() when resume. And eventually collide with other drivers. This patch fixes this problem. On suspend, we call ice_deinit_rdma() to clean up the ice_pf->msix_entries (and free the MSI-X vectors used by irdma if we've dynamically allocated them). On resume, we call ice_init_rdma() to rebuild the ice_pf->msix_entries (and allocate the MSI-X vectors if we would like to dynamically allocate them). Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Tested-by: Cyrus Lien <[email protected]> Signed-off-by: En-Wei Wu <[email protected]> Reviewed-by: Wojciech Drewek <[email protected]> Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]>
2024-06-13bnxt_en: Adjust logging of firmware messages in case of released token in ↵Aleksandr Mishin1-1/+1
__hwrm_send() In case of token is released due to token->state == BNXT_HWRM_DEFERRED, released token (set to NULL) is used in log messages. This issue is expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But this error code is returned by recent firmware. So some firmware may not return it. This may lead to NULL pointer dereference. Adjust this issue by adding token pointer check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 8fa4219dba8e ("bnxt_en: add dynamic debug support for HWRM messages") Suggested-by: Michael Chan <[email protected]> Signed-off-by: Aleksandr Mishin <[email protected]> Reviewed-by: Wojciech Drewek <[email protected]> Reviewed-by: Michael Chan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded responseMichael Chan2-2/+61
Firmware interface 1.10.2.118 has increased the size of HWRM_PORT_PHY_QCFG response beyond the maximum size that can be forwarded. When the VF's link state is not the default auto state, the PF will need to forward the response back to the VF to indicate the forced state. This regression may cause the VF to fail to initialize. Fix it by capping the HWRM_PORT_PHY_QCFG response to the maximum 96 bytes. The SPEEDS2_SUPPORTED flag needs to be cleared because the new speeds2 fields are beyond the legacy structure. Also modify bnxt_hwrm_fwd_resp() to print a warning if the message size exceeds 96 bytes to make this failure more obvious. Fixes: 84a911db8305 ("bnxt_en: Update firmware interface to 1.10.2.118") Reviewed-by: Somnath Kotur <[email protected]> Reviewed-by: Pavan Chebbi <[email protected]> Signed-off-by: Michael Chan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13gve: Clear napi->skb before dev_kfree_skb_any()Ziwei Xiao1-3/+5
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer. Fix this by clearing napi->skb before the skb is freed. Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path") Cc: [email protected] Reported-by: Shailend Chand <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13ionic: fix use after netif_napi_del()Taehee Yoo1-3/+1
When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues and only 3 queues are used for the current configuration, only 3 queues' napi should be registered and enabled. The ionic_qcq_enable() checks whether the .poll pointer is not NULL for enabling only the using queue' napi. Unused queues' napi will not be registered by netif_napi_add(), so the .poll pointer indicates NULL. But it couldn't distinguish whether the napi was unregistered or not because netif_napi_del() doesn't reset the .poll pointer to NULL. So, ionic_qcq_enable() calls napi_enable() for the queue, which was unregistered by netif_napi_del(). Reproducer: ethtool -L <interface name> rx 1 tx 1 combined 0 ethtool -L <interface name> rx 0 tx 0 combined 1 ethtool -L <interface name> rx 0 tx 0 combined 4 Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Shannon Nelson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13Revert "igc: fix a log entry using uninitialized netdev"Sasha Neftin1-3/+2
This reverts commit 86167183a17e03ec77198897975e9fdfbd53cb0b. igc_ptp_init() needs to be called before igc_reset(), otherwise kernel crash could be observed. Following the corresponding discussion [1] and [2] revert this commit. Link: https://lore.kernel.org/all/[email protected]/ [1] Link: https://lore.kernel.org/all/[email protected]/ [2] Fixes: 86167183a17e ("igc: fix a log entry using uninitialized netdev") Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Naama Meir <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13drm/xe: move disable_c6 callRiana Tauro2-6/+7
disable c6 called in guc_pc_fini_hw is unreachable. GuC PC init returns earlier if skip_guc_pc is true and never registers the finish call thus making disable_c6 unreachable. move this call to gt idle. v2: rebase v3: add fixes tag (Himal) Fixes: 975e4a3795d4 ("drm/xe: Manually setup C6 when skip_guc_pc is set") Signed-off-by: Riana Tauro <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]> (cherry picked from commit 6800e63cf97bae62bca56d8e691544540d945f53) Signed-off-by: Thomas Hellström <[email protected]>
2024-06-13drm/xe: flush engine buffers before signalling user fence on all enginesAndrzej Hajda1-2/+16
Tests show that user fence signalling requires kind of write barrier, otherwise not all writes performed by the workload will be available to userspace. It is already done for render and compute, we need it also for the rest: video, gsc, copy. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Andrzej Hajda <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 3ad7d18c5dad75ed38098c7cc3bc9594b4701399) Signed-off-by: Thomas Hellström <[email protected]>
2024-06-13drm/xe/pf: Assert LMEM provisioning is done only on DGFXMichal Wajdeczko1-2/+13
The Local Memory (aka VRAM) is only available on DGFX platforms. We shouldn't attempt to provision VFs with LMEM or attempt to update the LMTT on non-DGFX platforms. Add missing asserts that would enforce that and fix release code that could crash on iGFX due to uninitialized LMTT. Fixes: 0698ff57bf32 ("drm/xe/pf: Update the LMTT when freeing VF GT config") Signed-off-by: Michal Wajdeczko <[email protected]> Cc: Piotr Piórkowski <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit b321cb83a375bcc18cd0a4b62bdeaf6905cca769) Signed-off-by: Thomas Hellström <[email protected]>
2024-06-13drm/xe/xe_gt_idle: use GT forcewake domain assertionRiana Tauro1-1/+1
The rc6 registers used in disable_c6 function belong to the GT forcewake domain. Hence change the forcewake assertion to check GT forcewake domain. v2: add fixes tag (Himal) Fixes: 975e4a3795d4 ("drm/xe: Manually setup C6 when skip_guc_pc is set") Signed-off-by: Riana Tauro <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Reviewed-by: Himal Prasad Ghimiray <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]> (cherry picked from commit 21b708554648177a0078962c31629bce31ef5d83) Signed-off-by: Thomas Hellström <[email protected]>
2024-06-13ACPI: EC: Evaluate orphan _REG under EC deviceRafael J. Wysocki4-5/+62
After starting to install the EC address space handler at the ACPI namespace root, if there is an "orphan" _REG method in the EC device's scope, it will not be evaluated any more. This breaks EC operation regions on some systems, like Asus gu605. To address this, use a wrapper around an existing ACPICA function to look for an "orphan" _REG method in the EC device scope and evaluate it if present. Fixes: 60fa6ae6e6d0 ("ACPI: EC: Install address space handler at the namespace root") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218945 Reported-by: VitaliiT <[email protected]> Tested-by: VitaliiT <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-06-13iommu/amd: Fix panic accessing amd_iommu_enable_faultingDimitri Sivanich1-1/+1
This fixes a bug introduced by commit d74169ceb0d2 ("iommu/vt-d: Allocate DMAR fault interrupts locally"). The panic happens when amd_iommu_enable_faulting is called from CPUHP_AP_ONLINE_DYN context. Fixes: d74169ceb0d2 ("iommu/vt-d: Allocate DMAR fault interrupts locally") Signed-off-by: Dimitri Sivanich <[email protected]> Tested-by: Yi Zhang <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Reviewed-by: Vasant Hegde <[email protected]> Link: https://lore.kernel.org/r/ZljHE/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-06-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds1-2/+9
Pull ARM and clkdev fixes from Russell King: - Fix clkdev - erroring out on long strings causes boot failures, so don't do this. Still warn about the over-sized strings (which will never match and thus their registration with clkdev is useless) - Fix for ftrace with frame pointer unwinder with recent GCC changing the way frames are stacked. * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9405/1: ftrace: Don't assume stack frames are contiguous in memory clkdev: don't fail clkdev_alloc() if over-sized
2024-06-12vfio/pci: Insert full vma on mmap'd MMIO faultAlex Williamson1-2/+17
In order to improve performance of typical scenarios we can try to insert the entire vma on fault. This accelerates typical cases, such as when the MMIO region is DMA mapped by QEMU. The vfio_iommu_type1 driver will fault in the entire DMA mapped range through fixup_user_fault(). In synthetic testing, this improves the time required to walk a PCI BAR mapping from userspace by roughly 1/3rd. This is likely an interim solution until vmf_insert_pfn_{pmd,pud}() gain support for pfnmaps. Suggested-by: Yan Zhao <[email protected]> Link: https://lore.kernel.org/all/Zl6XdUkt%[email protected]/ Reviewed-by: Yan Zhao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2024-06-12nbd: Remove __force castsChristoph Hellwig1-29/+22
Make it again possible for sparse to verify that blk_status_t and Unix error codes are used in the proper context by making nbd_send_cmd() return a blk_status_t instead of an integer. No functionality has been changed. Signed-off-by: Christoph Hellwig <[email protected]> [ bvanassche: added description and made two small formatting changes ] Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-06-12regulator: axp20x: AXP717: fix LDO supply rails and off-by-onesAndre Przywara1-14/+19
The X-Powers AXP717 PMIC has separate input supply pins for each group of LDOs, so they are not all using the same DCDC1 input, as described currently. Replace the "supply" member of each LDO description with the respective group supply name, so that the supply dependencies can be correctly described in the devicetree. Also fix two off-by-ones in the regulator macros, after some double checking the numbers against the datasheet. This uncovered a bug in the datasheet: add a comment to document this. Fixes: d2ac3df75c3a ("regulator: axp20x: add support for the AXP717") Signed-off-by: Andre Przywara <[email protected]> Reviewed-by: John Watts <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-06-12nvmet: always initialize cqe.resultDaniel Wagner3-9/+1
The spec doesn't mandate that the first two double words (aka results) for the command queue entry need to be set to 0 when they are not used (not specified). Though, the target implemention returns 0 for TCP and FC but not for RDMA. Let's make RDMA behave the same and thus explicitly initializing the result field. This prevents leaking any data from the stack. Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-06-12nvmet-passthru: propagate status from id override functionsDaniel Wagner1-3/+3
The id override functions return a status which is not propagated to the caller. Fixes: c1fef73f793b ("nvmet: add passthru code to process commands") Signed-off-by: Daniel Wagner <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-06-12nvme: avoid double free special payloadChunguang Xu1-0/+1
If a discard request needs to be retried, and that retry may fail before a new special payload is added, a double free will result. Clear the RQF_SPECIAL_LOAD when the request is cleaned. Signed-off-by: Chunguang Xu <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-06-12thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse ↵Julien Panis1-1/+5
data This patch prevents from registering thermal entries and letting the driver misbehave if efuse data is invalid. A device is not properly calibrated if the golden temperature is zero. Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver") Signed-off-by: Julien Panis <[email protected]> Reviewed-by: Nicolas Pitre <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Daniel Lezcano <[email protected]>
2024-06-12block: unmap and free user mapped integrity via submitterAnuj Gupta1-4/+11
The user mapped intergity is copied back and unpinned by bio_integrity_free which is a low-level routine. Do it via the submitter rather than doing it in the low-level block layer code, to split the submitter side from the consumer side of the bio. Signed-off-by: Anuj Gupta <[email protected]> Signed-off-by: Kanchan Joshi <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-06-12mailmap: Add my outdated addresses to the map fileAndy Shevchenko1-1/+1
There is a couple of outdated addresses that are still visible in the Git history, add them to .mailmap. While at it, replace one in the comment. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2024-06-12i2c: designware: Fix the functionality flags of the slave-only interfaceJean Delvare1-1/+1
When an I2C adapter acts only as a slave, it should not claim to support I2C master capabilities. Fixes: 5b6d721b266a ("i2c: designware: enable SLAVE in platform module") Signed-off-by: Jean Delvare <[email protected]> Cc: Luis Oliveira <[email protected]> Cc: Jarkko Nikula <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Mika Westerberg <[email protected]> Cc: Jan Dabros <[email protected]> Cc: Andi Shyti <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Acked-by: Jarkko Nikula <[email protected]> Tested-by: Jarkko Nikula <[email protected]> Signed-off-by: Andi Shyti <[email protected]>
2024-06-12i2c: at91: Fix the functionality flags of the slave-only interfaceJean Delvare1-2/+1
When an I2C adapter acts only as a slave, it should not claim to support I2C master capabilities. Fixes: 9d3ca54b550c ("i2c: at91: added slave mode support") Signed-off-by: Jean Delvare <[email protected]> Cc: Juergen Fitschen <[email protected]> Cc: Ludovic Desroches <[email protected]> Cc: Codrin Ciubotariu <[email protected]> Cc: Andi Shyti <[email protected]> Cc: Nicolas Ferre <[email protected]> Cc: Alexandre Belloni <[email protected]> Cc: Claudiu Beznea <[email protected]> Signed-off-by: Andi Shyti <[email protected]>
2024-06-12regulator: bd71815: fix ramp valuesKalle Niemi1-1/+1
Ramp values are inverted. This caused wrong values written to register when ramp values were defined in device tree. Invert values in table to fix this. Signed-off-by: Kalle Niemi <[email protected]> Fixes: 1aad39001e85 ("regulator: Support ROHM BD71815 regulators") Reviewed-by: Matti Vaittinen <[email protected]> Link: https://lore.kernel.org/r/ZmmJXtuVJU6RgQAH@latitude5580 Signed-off-by: Mark Brown <[email protected]>
2024-06-12cpufreq: intel_pstate: Check turbo_is_disabled() in store_no_turbo()Rafael J. Wysocki1-7/+12
After recent changes in intel_pstate, global.turbo_disabled is only set at the initialization time and never changed. However, it turns out that on some systems the "turbo disabled" bit in MSR_IA32_MISC_ENABLE, the initial state of which is reflected by global.turbo_disabled, can be flipped later and there should be a way to take that into account (other than checking that MSR every time the driver runs which is costly and useless overhead on the vast majority of systems). For this purpose, notice that before the changes in question, store_no_turbo() contained a turbo_is_disabled() check that was used for updating global.turbo_disabled if the "turbo disabled" bit in MSR_IA32_MISC_ENABLE had been flipped and that functionality can be restored. Then, users will be able to reset global.turbo_disabled by writing 0 to no_turbo which used to work before on systems with flipping "turbo disabled" bit. This guarantees the driver state to remain in sync, but READ_ONCE() annotations need to be added in two places where global.turbo_disabled is accessed locklessly, so modify the driver to make that happen. Fixes: 0940f1a8011f ("cpufreq: intel_pstate: Do not update global.turbo_disabled after initialization") Closes: https://lore.kernel.org/linux-pm/[email protected] Suggested-by: Srinivas Pandruvada <[email protected]> Reported-by: Xi Ruoyao <[email protected]> Tested-by: Xi Ruoyao <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-06-12wifi: iwlwifi: scan: correctly check if PSC listen period is neededAyala Beker1-1/+1
The flags variable is incorrectly checked while it is still cleared and has not been assigned any value yet. Fix it. Fixes: a615323f7f90 ("wifi: iwlwifi: mvm: always apply 6 GHz probe limitations") Signed-off-by: Ayala Beker <[email protected]> Reviewed-by: Benjamin Berg <[email protected]> Signed-off-by: Miri Korenblit <[email protected]> Link: https://msgid.link/20240605140556.291c33f9a283.Id651fe69828aebce177b49b2316c5780906f1b37@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-06-12wifi: iwlwifi: mvm: fix ROC version checkShaul Triebitz1-1/+1
For using the ROC command, check that the ROC version is *greater or equal* to 3, rather than *equal* to 3. The ROC version was added to the TLV starting from version 3. Fixes: 67ac248e4db0 ("wifi: iwlwifi: mvm: implement ROC version 3") Signed-off-by: Shaul Triebitz <[email protected]> Signed-off-by: Miri Korenblit <[email protected]> Link: https://msgid.link/20240605140327.93d86cd188ad.Iceadef5a2f3cfa4a127e94a0405eba8342ec89c1@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-06-12wifi: iwlwifi: mvm: unlock mvm mutexShaul Triebitz1-0/+2
Unlock the mvm mutex before returning from a function with the mutex locked. Fixes: a1efeb823084 ("wifi: iwlwifi: mvm: Block EMLSR when a p2p/softAP vif is active") Signed-off-by: Shaul Triebitz <[email protected]> Signed-off-by: Miri Korenblit <[email protected]> Link: https://msgid.link/20240605140327.96cb956db4af.Ib468cbad38959910977b5581f6111ab0afae9880@changeid Signed-off-by: Johannes Berg <[email protected]>
2024-06-12drm/mediatek: Call drm_atomic_helper_shutdown() at shutdown timeDouglas Anderson1-0/+8
Based on grepping through the source code this driver appears to be missing a call to drm_atomic_helper_shutdown() at system shutdown time. Among other things, this means that if a panel is in use that it won't be cleanly powered off at system shutdown time. The fact that we should call drm_atomic_helper_shutdown() in the case of OS shutdown/restart comes straight out of the kernel doc "driver instance overview" in drm_drv.c. This driver users the component model and shutdown happens in the base driver. The "drvdata" for this driver will always be valid if shutdown() is called and as of commit 2a073968289d ("drm/atomic-helper: drm_atomic_helper_shutdown(NULL) should be a noop") we don't need to confirm that "drm" is non-NULL. Suggested-by: Maxime Ripard <[email protected]> Reviewed-by: Maxime Ripard <[email protected]> Reviewed-by: Fei Shao <[email protected]> Tested-by: Fei Shao <[email protected]> Signed-off-by: Douglas Anderson <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20240611102744.v2.1.I2b014f90afc4729b6ecc7b5ddd1f6dedcea4625b@changeid
2024-06-12drm: renesas: shmobile: Call drm_atomic_helper_shutdown() at shutdown timeDouglas Anderson1-0/+8
Based on grepping through the source code, this driver appears to be missing a call to drm_atomic_helper_shutdown() at system shutdown time. This is important because drm_atomic_helper_shutdown() will cause panels to get disabled cleanly which may be important for their power sequencing. Future changes will remove any custom powering off in individual panel drivers so the DRM drivers need to start getting this right. The fact that we should call drm_atomic_helper_shutdown() in the case of OS shutdown comes straight out of the kernel doc "driver instance overview" in drm_drv.c. [geert: shmob_drm_remove() already calls drm_atomic_helper_shutdown] Suggested-by: Maxime Ripard <[email protected]> Signed-off-by: Douglas Anderson <[email protected]> Link: https://lore.kernel.org/r/20230901164111.RFT.15.Iaf638a1d4c8b3c307a6192efabb4cbb06b195f15@changeid [geert: s/drm_helper_force_disable_all/drm_atomic_helper_shutdown/] Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Reviewed-by: Sui Jingfeng <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/17c6a5a668e5975f871b77fb1fca6711a0799d9e.1718176895.git.geert+renesas@glider.be
2024-06-12xhci: Handle TD clearing for multiple streams caseHector Martin2-11/+44
When multiple streams are in use, multiple TDs might be in flight when an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for each, to ensure everything is reset properly and the caches cleared. Change the logic so that any N>1 TDs found active for different streams are deferred until after the first one is processed, calling xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to queue another command until we are done with all of them. Also change the error/"should never happen" paths to ensure we at least clear any affected TDs, even if we can't issue a command to clear the hardware cache, and complain loudly with an xhci_warn() if this ever happens. This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.") early on in the XHCI driver's life, when stream support was first added. It was then identified but not fixed nor made into a warning in commit 674f8438c121 ("xhci: split handling halted endpoints into two steps"), which added a FIXME comment for the problem case (without materially changing the behavior as far as I can tell, though the new logic made the problem more obvious). Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs."), it was acknowledged again. [Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.") was a targeted regression fix to the previously mentioned patch. Users reported issues with usb stuck after unmounting/disconnecting UAS devices. This rolled back the TD clearing of multiple streams to its original state.] Apparently the commit author was aware of the problem (yet still chose to submit it): It was still mentioned as a FIXME, an xhci_dbg() was added to log the problem condition, and the remaining issue was mentioned in the commit description. The choice of making the log type xhci_dbg() for what is, at this point, a completely unhandled and known broken condition is puzzling and unfortunate, as it guarantees that no actual users would see the log in production, thereby making it nigh undebuggable (indeed, even if you turn on DEBUG, the message doesn't really hint at there being a problem at all). It took me *months* of random xHC crashes to finally find a reliable repro and be able to do a deep dive debug session, which could all have been avoided had this unhandled, broken condition been actually reported with a warning, as it should have been as a bug intentionally left in unfixed (never mind that it shouldn't have been left in at all). > Another fix to solve clearing the caches of all stream rings with > cancelled TDs is needed, but not as urgent. 3 years after that statement and 14 years after the original bug was introduced, I think it's finally time to fix it. And maybe next time let's not leave bugs unfixed (that are actually worse than the original bug), and let's actually get people to review kernel commits please. Fixes xHC crashes and IOMMU faults with UAS devices when handling errors/faults. Easiest repro is to use `hdparm` to mark an early sector (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop. At least in the case of JMicron controllers, the read errors end up having to cancel two TDs (for two queued requests to different streams) and the one that didn't get cleared properly ends up faulting the xHC entirely when it tries to access DMA pages that have since been unmapped, referred to by the stale TDs. This normally happens quickly (after two or three loops). After this fix, I left the `cat` in a loop running overnight and experienced no xHC failures, with all read errors recovered properly. Repro'd and tested on an Apple M1 Mac Mini (dwc3 host). On systems without an IOMMU, this bug would instead silently corrupt freed memory, making this a security bug (even on systems with IOMMUs this could silently corrupt memory belonging to other USB devices on the same controller, so it's still a security bug). Given that the kernel autoprobes partition tables, I'm pretty sure a malicious USB device pretending to be a UAS device and reporting an error with the right timing could deliberately trigger a UAF and write to freed memory, with no user action. [Mathias: Commit message and code comment edit, original at:] https://lore.kernel.org/linux-usb/[email protected]/ Fixes: e9df17eb1408 ("USB: xhci: Correct assumptions about number of rings per endpoint.") Fixes: 94f339147fc3 ("xhci: Fix failure to give back some cached cancelled URBs.") Fixes: 674f8438c121 ("xhci: split handling halted endpoints into two steps") Cc: [email protected] Cc: [email protected] Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Hector Martin <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-06-12xhci: Apply broken streams quirk to Etron EJ188 xHCI hostKuangyi Chiang1-1/+3
As described in commit 8f873c1ff4ca ("xhci: Blacklist using streams on the Etron EJ168 controller"), EJ188 have the same issue as EJ168, where Streams do not work reliable on EJ188. So apply XHCI_BROKEN_STREAMS quirk to EJ188 as well. Cc: [email protected] Signed-off-by: Kuangyi Chiang <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-06-12xhci: Apply reset resume quirk to Etron EJ188 xHCI hostKuangyi Chiang1-0/+5
As described in commit c877b3b2ad5c ("xhci: Add reset on resume quirk for asrock p67 host"), EJ188 have the same issue as EJ168, where completely dies on resume. So apply XHCI_RESET_ON_RESUME quirk to EJ188 as well. Cc: [email protected] Signed-off-by: Kuangyi Chiang <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-06-12xhci: Set correct transferred length for cancelled bulk transfersMathias Nyman1-3/+2
The transferred length is set incorrectly for cancelled bulk transfer TDs in case the bulk transfer ring stops on the last transfer block with a 'Stop - Length Invalid' completion code. length essentially ends up being set to the requested length: urb->actual_length = urb->transfer_buffer_length Length for 'Stop - Length Invalid' cases should be the sum of all TRB transfer block lengths up to the one the ring stopped on, _excluding_ the one stopped on. Fix this by always summing up TRB lengths for 'Stop - Length Invalid' bulk cases. This issue was discovered by Alan Stern while debugging https://bugzilla.kernel.org/show_bug.cgi?id=218890, but does not solve that bug. Issue is older than 4.10 kernel but fix won't apply to those due to major reworks in that area. Tested-by: Pierre Tomon <[email protected]> Cc: [email protected] # v4.10+ Cc: Alan Stern <[email protected]> Signed-off-by: Mathias Nyman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2024-06-11net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs ↵Xiaolei Wang1-14/+11
parameters The current cbs parameter depends on speed after uplinking, which is not needed and will report a configuration error if the port is not initially connected. The UAPI exposed by tc-cbs requires userspace to recalculate the send slope anyway, because the formula depends on port_transmit_rate (see man tc-cbs), which is not an invariant from tc's perspective. Therefore, we use offload->sendslope and offload->idleslope to derive the original port_transmit_rate from the CBS formula. Fixes: 1f705bc61aee ("net: stmmac: Add support for CBS QDISC") Signed-off-by: Xiaolei Wang <[email protected]> Reviewed-by: Wojciech Drewek <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-11gve: ignore nonrelevant GSO type bits when processing TSO headersJoshua Washington1-15/+5
TSO currently fails when the skb's gso_type field has more than one bit set. TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case. The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type |= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification. This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set. Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Suggested-by: Eric Dumazet <[email protected]> Acked-by: Andrei Vagin <[email protected]> v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs. v3 - Add back unrelated empty line removal. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-11scsi: mpi3mr: Fix ATA NCQ priority supportDamien Le Moal5-28/+87
The function mpi3mr_qcmd() of the mpi3mr driver is able to indicate to the HBA if a read or write command directed at an ATA device should be translated to an NCQ read/write command with the high prioiryt bit set when the request uses the RT priority class and the user has enabled NCQ priority through sysfs. However, unlike the mpt3sas driver, the mpi3mr driver does not define the sas_ncq_prio_supported and sas_ncq_prio_enable sysfs attributes, so the ncq_prio_enable field of struct mpi3mr_sdev_priv_data is never actually set and NCQ Priority cannot ever be used. Fix this by defining these missing atributes to allow a user to check if an ATA device supports NCQ priority and to enable/disable the use of NCQ priority. To do this, lift the function scsih_ncq_prio_supp() out of the mpt3sas driver and make it the generic SCSI SAS transport function sas_ata_ncq_prio_supported(). Nothing in that function is hardware specific, so this function can be used in both the mpt3sas driver and the mpi3mr driver. Reported-by: Scott McCoy <[email protected]> Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing") Cc: [email protected] Signed-off-by: Damien Le Moal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Niklas Cassel <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-06-11scsi: ufs: core: Quiesce request queues before checking pending cmdsZiqi Chen1-3/+3
In ufshcd_clock_scaling_prepare(), after SCSI layer is blocked, ufshcd_pending_cmds() is called to check whether there are pending transactions or not. And only if there are no pending transactions can we proceed to kickstart the clock scaling sequence. ufshcd_pending_cmds() traverses over all SCSI devices and calls sbitmap_weight() on their budget_map. sbitmap_weight() can be broken down to three steps: 1. Calculate the nr outstanding bits set in the 'word' bitmap. 2. Calculate the nr outstanding bits set in the 'cleared' bitmap. 3. Subtract the result from step 1 by the result from step 2. This can lead to a race condition as outlined below: Assume there is one pending transaction in the request queue of one SCSI device, say sda, and the budget token of this request is 0, the 'word' is 0x1 and the 'cleared' is 0x0. 1. When step 1 executes, it gets the result as 1. 2. Before step 2 executes, block layer tries to dispatch a new request to sda. Since the SCSI layer is blocked, the request cannot pass through SCSI but the block layer would do budget_get() and budget_put() to sda's budget map regardless, so the 'word' has become 0x3 and 'cleared' has become 0x2 (assume the new request got budget token 1). 3. When step 2 executes, it gets the result as 1. 4. When step 3 executes, it gets the result as 0, meaning there is no pending transactions, which is wrong. Thread A Thread B ufshcd_pending_cmds() __blk_mq_sched_dispatch_requests() | | sbitmap_weight(word) | | scsi_mq_get_budget() | | | scsi_mq_put_budget() | | sbitmap_weight(cleared) ... When this race condition happens, the clock scaling sequence is started with transactions still in flight, leading to subsequent hibernate enter failure, broken link, task abort and back to back error recovery. Fix this race condition by quiescing the request queues before calling ufshcd_pending_cmds() so that block layer won't touch the budget map when ufshcd_pending_cmds() is working on it. In addition, remove the SCSI layer blocking/unblocking to reduce redundancies and latencies. Fixes: 8d077ede48c1 ("scsi: ufs: Optimize the command queueing code") Co-developed-by: Can Guo <[email protected]> Signed-off-by: Can Guo <[email protected]> Signed-off-by: Ziqi Chen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-06-11scsi: core: Disable CDL by defaultDamien Le Moal1-0/+7
For SCSI devices supporting the Command Duration Limits feature set, the user can enable/disable this feature use through the sysfs device attribute "cdl_enable". This attribute modification triggers a call to scsi_cdl_enable() to enable and disable the feature for ATA devices and set the scsi device cdl_enable field to the user provided bool value. For SCSI devices supporting CDL, the feature set is always enabled and scsi_cdl_enable() is reduced to setting the cdl_enable field. However, for ATA devices, a drive may spin-up with the CDL feature enabled by default. But the SCSI device cdl_enable field is always initialized to false (CDL disabled), regardless of the actual device CDL feature state. For ATA devices managed by libata (or libsas), libata-core always disables the CDL feature set when the device is attached, thus syncing the state of the CDL feature on the device and of the SCSI device cdl_enable field. However, for ATA devices connected to a SAS HBA, the CDL feature is not disabled on scan for ATA devices that have this feature enabled by default, leading to an inconsistent state of the feature on the device with the SCSI device cdl_enable field. Avoid this inconsistency by adding a call to scsi_cdl_enable() in scsi_cdl_check() to make sure that the device-side state of the CDL feature set always matches the scsi device cdl_enable field state. This implies that CDL will always be disabled for ATA devices connected to SAS HBAs, which is consistent with libata/libsas initialization of the device. Reported-by: Scott McCoy <[email protected]> Fixes: 1b22cfb14142 ("scsi: core: Allow enabling and disabling command duration limits") Cc: [email protected] Signed-off-by: Damien Le Moal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Niklas Cassel <[email protected]> Reviewed-by: Igor Pylypiv <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2024-06-11drm/nouveau: remove unused struct 'init_exec'Dr. David Alan Gilbert1-5/+0
'init_exec' is unused since commit cb75d97e9c77 ("drm/nouveau: implement devinit subdev, and new init table parser") Remove it. Signed-off-by: Dr. David Alan Gilbert <[email protected]> Acked-by: Danilo Krummrich <[email protected]> Signed-off-by: Danilo Krummrich <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-11cpufreq: amd-pstate: change cpu freq transition delay for some modelsXiaojian Du1-2/+7
Some of AMD ZEN4 APU/CPU have support for adjusting the CPU core clock more quickly and presicely according to CPU work loading. This is advertised by the Fast CPPC x86 feature. This change will only be effective in the *passive mode* of AMD pstate driver. From the test results of different transition delay values, 600us is chosen to make a balance between performance and power consumption. Some test results on AMD Ryzen 7840HS(Phoenix) APU: 1. Tbench (Energy less is better, Throughput more is better, PPW--Performance per Watt more is better) ============= =================== ============== =============== ============== =============== ============== =============== =============== Trans Delay Tbench governor:schedutil, 3-iterations average ============= =================== ============== =============== ============== =============== ============== =============== =============== 1000us Clients 1 2 4 8 12 16 32 Energy/Joules 2010 2804 8768 17171 16170 15132 15027 Throughput/(MB/s) 114 259 1041 3010 3135 4851 4605 PPW 0.0567 0.0923 0.1187 0.1752 0.1938 0.3205 0.3064 600us Clients 1 2 4 8 12 16 32 Energy/Joules 2115 (5.22%) 2388 (-14.84%) 10700(22.03%) 16716 (-2.65%) 15939 (-1.43%) 15053 (-0.52%) 15083 (0.37% ) Throughput/(MB/s) 122 (7.02%) 234 (-9.65% ) 1188 (14.12%) 3003 (-0.23%) 3143 (0.26% ) 4842 (-0.19%) 4603 (-0.04%) PPW 0.0576(1.59%) 0.0979(6.07% ) 0.111(-6.49%) 0.1796(2.51% ) 0.1971(1.70% ) 0.3216(0.34% ) 0.3051(-0.42%) ============= =================== ============== ================ ============= =============== ============== =============== =============== 2.Dbench (Energy less is better, Throughput more is better, PPW--Performance per Watt more is better) ============= =================== ============== =============== ============== =============== ============== =============== =============== Trans Delay Dbench governor:schedutil, 3-iterations average ============= =================== ============== =============== ============== =============== ============== =============== =============== 1000us Clients 1 2 4 8 12 16 32 Energy/Joules 4890 3779 3567 5157 5611 6500 8163 Throughput/(MB/s) 327 167 220 577 775 938 1397 PPW 0.0668 0.0441 0.0616 0.1118 0.1381 0.1443 0.1711 600us Clients 1 2 4 8 12 16 32 Energy/Joules 4915 (0.51%) 4912 (29.98%) 3506 (-1.71%) 4907 (-4.85% ) 5011 (-10.69%) 5672 (-12.74%) 8141 (-0.27%) Throughput/(MB/s) 348 (6.42%) 284 (70.06%) 220 (0.00% ) 518 (-10.23%) 712 (-8.13% ) 854 (-8.96% ) 1475 (5.58% ) PPW 0.0708(5.99%) 0.0578(31.07%) 0.0627(1.79% ) 0.1055(-5.64% ) 0.142(2.82% ) 0.1505(4.30% ) 0.1811(5.84% ) ============= =================== ============== =============== ============== =============== ============== =============== =============== 3.Hackbench(less time is better) ============= =========================== ========================== hackbench governor:schedutil ============= =========================== ========================== Trans Delay Process Mode Ave time(s) Thread Mode Ave time(s) 1000us 14.484 14.484 600us 14.418(-0.46%) 15.41(+6.39%) ============= =========================== ========================== 4.Perf_sched_bench(less time is better) ============= =================== ============== ============== ============== =============== =============== ============= Trans Delay perf_sched_bench governor:schedutil ============= =================== ============== ============== ============== =============== =============== ============= 1000us Groups 1 2 4 8 12 24 AveTime(s) 1.64 2.851 5.878 11.636 16.093 26.395 600us Groups 1 2 4 8 12 24 AveTime(s) 1.69(3.05%) 2.845(-0.21%) 5.843(-0.60%) 11.576(-0.52%) 16.092(-0.01%) 26.32(-0.28%) ============= ================== ============== ============== ============== =============== =============== ============== 5.Sysbench(higher is better) ============= ================== ============== ================= ============== ================ =============== ================= Sysbench governor:schedutil ============= ================== ============== ================= ============== ================ =============== ================= 1000us Thread 1 2 4 8 12 24 Ave events 6020.98 12273.39 24119.82 46171.57 47074.37 47831.72 600us Thread 1 2 4 8 12 24 Ave events 6154.82(2.22%) 12271.63(-0.01%) 24392.5(1.13%) 46117.64(-0.12%) 46852.19(-0.47%) 47678.92(-0.32%) ============= ================== ============== ================= ============== ================ =============== ================= In conclusion, a shorter transition delay of cpu clock will make a quite positive effect to improve PPW on Dbench test, in the meanwhile, keep stable performance on Tbench, Hackbench, Perf_sched_bench and Sysbench. Signed-off-by: Xiaojian Du <[email protected]> Reviewed-by: Perry Yuan <[email protected]> Acked-by: Mario Limonciello <[email protected]>
2024-06-11thermal: gov_step_wise: Restore passive polling managementRafael J. Wysocki1-0/+17
Consider a thermal zone with one passive trip point, a cooling device with 3 states (0, 1, 2) bound to it, passive polling enabled (nonzero passive_delay_jiffies) and no regular polling (polling_delay_jiffies equal to 0) that is managed by the Step-Wise governor. Suppose that the initial state of the cooling device is 0 and the zone temperature is below the trip point to start with. When the trip point is crossed, tz->passive is incremented by the thermal core and the governor's .manage() callback is invoked. It sets 'throttle' to 'true' for the trip in question and get_target_state() returns 1 for the instance corresponding to the cooling device (say that 'upper' and 'lower' are set to 2 and 0 for it, respectively), so its state changes to 1. Passive polling is still active for the zone, so next time the temperature is updated, the governor's .manage() callback will be invoked again. If the temperature is still rising, it will change the state of the cooling device to 2. Now suppose that next time the zone temperature is updated, it falls below the trip point, so tz->passive is decremented for the zone (say it becomes 0 then) and the governor's .manage() callbacks runs. It finds that the temperature trend for the zone is 'falling' and 'throttle' will be set to 'false' for the trip in question, so the cooling device's state will be changed to 1. However, because tz->polling is 0 for the zone, the governor's .manage() callback may not be invoked again for a long time and the cooling device's state will not be reset back to 0. This can happen because commit 042a3d80f118 ("thermal: core: Move passive polling management to the core") removed passive polling management from the Step-Wise governor. Before that change, thermal_zone_trip_update() would bump up tz->passive when changing the target state for a thermal instance from "no target" to a specific value and it would drop tz->passive when changing it back to "no target" which would cause passive polling to be active for the zone until the governor has reset the states of all cooling devices. In particular, in the example above tz->passive would be incremented when changing the state of the cooling device from 0 to 1 and then it would be still nonzero when the state of the cooling device was changed from 2 to 1. To prevent this problem from occurring, restore the passive polling management in the Step-Wise governor by partially reverting the commit in question and update the comment in the restored code to explain its role more clearly. Fixes: 042a3d80f118 ("thermal: core: Move passive polling management to the core") Closes: https://lore.kernel.org/linux-pm/[email protected] Reported-by: Johan Hovold <[email protected]> Tested-by: Johan Hovold <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>