aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-03-06md/raid5: use the atomic queue limit update APIsChristoph Hellwig1-65/+65
Build the queue limits outside the queue and apply them using queue_limits_set. To make the code more obvious also split the queue limits handling into separate helpers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06md/raid1: use the atomic queue limit update APIsChristoph Hellwig1-9/+16
Build the queue limits outside the queue and apply them using queue_limits_set. To make the code more obvious also split the queue limits handling into a separate helper function. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06md/raid0: use the atomic queue limit update APIsChristoph Hellwig1-15/+20
Build the queue limits outside the queue and apply them using queue_limits_set. To make the code more obvious also split the queue limits handling into a separate helper function. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06md: add queue limit helpersChristoph Hellwig2-0/+48
Add a few helpers that wrap the block queue limits API for use in MD. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06md: add a mddev_is_dm helperChristoph Hellwig6-35/+38
Add a helper to check for a DM-mapped MD device instead of using the obfuscated ->gendisk or ->queue NULL checks. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06md: add a mddev_add_trace_msg helperChristoph Hellwig6-29/+28
Add a small wrapper around blk_add_trace_msg that hides some argument dereferences and the check for a DM-mapped MD device. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06md: add a mddev_trace_remap helperChristoph Hellwig6-37/+17
Add a helper to trace bio remapping that hides some argument dereferences and the check for a DM-mapped MD device. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed--by: Song Liu <[email protected]> Tested-by: Song Liu <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06Merge tag 'vfs-6.8-release.fixes' of ↵Linus Torvalds4-55/+64
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Get rid of copy_mc flag in iov_iter which really only makes sense for the core dumping code so move it out of the generic iov iter code and make it coredump's problem. See the detailed commit description. - Revert fs/aio: Make io_cancel() generate completions again The initial fix here was predicated on the assumption that calling ki_cancel() didn't complete aio requests. However, that turned out to be wrong since the two drivers that actually make use of this set a cancellation function that performs the cancellation correctly. So revert this change. - Ensure that the test for IOCB_AIO_RW always happens before the read from ki_ctx. * tag 'vfs-6.8-release.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iov_iter: get rid of 'copy_mc' flag fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion Revert "fs/aio: Make io_cancel() generate completions again"
2024-03-06Merge tag 'arm-fixes-6.8-3' of ↵Linus Torvalds16-77/+35
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These should be the final fixes for the soc tree for 6.8, as usual they mostly deal wtih dts files: - Qualcomm fixes for pcie4 on sc8280xp, a revert of msm8996 mpm support, sm6115 interconnect and sm8650 gpio. - Two fixes for Tegra234 ethernet - A Makefile fix to actually build the allwinner based orange pi zero 2w device tree - Fixes for clocks and reset on imx8mp and a DSI display regression on imx7. The non-DT fixes are: - Firmware fixes addressing a kernel panic in op-tee and a minor regression in microchip/riscv. - A defconfig change to bring back backlight support after a Kconfig change" * tag 'arm-fixes-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: firmware: microchip: Fix over-requested allocation size tee: optee: Fix kernel panic caused by incorrect error handling Revert "arm64: dts: qcom: msm8996: Hook up MPM" arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed arm64: dts: imx8mp: Fix LDB clocks property arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM MAINTAINERS: Use a proper mailinglist for NXP i.MX development ARM: dts: imx7: remove DSI port endpoints arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE arm64: tegra: Fix Tegra234 MGBE power-domains arm64: tegra: Set the correct PHY mode for MGBE arm64: dts: qcom: sm6115: Fix missing interconnect-names arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio
2024-03-06Merge tag 'v6.8-p6' of ↵Linus Torvalds2-19/+19
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "Fix potential use-after-frees in rk3288 and sun8i-ce" * tag 'v6.8-p6' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: rk3288 - Fix use after free in unprepare crypto: sun8i-ce - Fix use after free in unprepare
2024-03-06bcache: move calculation of stripe_size and io_opt into bcache_device_initChristoph Hellwig1-6/+5
bcache currently calculates the stripe size for the non-cached_dev case directly in bcache_device_init, but for the cached_dev case it does it in the caller. Consolidate it in one places, which also enables setting the io_opt queue_limit before allocating the gendisk so that it can be passed in instead of changing the limit just after the allocation. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Coly Li <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06virtio_blk: Do not use disk_set_max_open/active_zones()Damien Le Moal1-2/+2
In virtblk_read_zoned_limits(), setting a zoned block device maximum number of open and active zones using the functions disk_set_max_open_zones() and disk_set_max_active_zones() is incorrect as setting the limits for the request queue is now done atomically when the gendisk is created (with blk_mq_alloc_disk()). The value set by the disk_set_max_open/active_zones() functions will be overwritten. Fix this by setting the maximum number of open and active zones directly in the queue_limits structure passed to virtblk_read_zoned_limits(). Fixes: 8b837256560c ("virtio_blk: pass queue_limits to blk_mq_alloc_disk") Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06aoe: fix the potential use-after-free problem in aoecmd_cfg_pktsChun-Yi Lee2-6/+7
This patch is against CVE-2023-6270. The description of cve is: A flaw was found in the ATA over Ethernet (AoE) driver in the Linux kernel. The aoecmd_cfg_pkts() function improperly updates the refcnt on `struct net_device`, and a use-after-free can be triggered by racing between the free on the struct and the access through the `skbtxq` global queue. This could lead to a denial of service condition or potential code execution. In aoecmd_cfg_pkts(), it always calls dev_put(ifp) when skb initial code is finished. But the net_device ifp will still be used in later tx()->dev_queue_xmit() in kthread. Which means that the dev_put(ifp) should NOT be called in the success path of skb initial code in aoecmd_cfg_pkts(). Otherwise tx() may run into use-after-free because the net_device is freed. This patch removed the dev_put(ifp) in the success path in aoecmd_cfg_pkts(), and added dev_put() after skb xmit in tx(). Link: https://nvd.nist.gov/vuln/detail/CVE-2023-6270 Fixes: 7562f876cd93 ("[NET]: Rework dev_base via list_head (v3)") Signed-off-by: Chun-Yi Lee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06block: move capacity validation to blkpg_do_ioctl()Li Lingfeng2-12/+8
Commit 6d4e80db4ebe ("block: add capacity validation in bdev_add_partition()") add check of partition's start and end sectors to prevent exceeding the size of the disk when adding partitions. However, there is still no check for resizing partitions now. Move the check to blkpg_do_ioctl() to cover resizing partitions. Signed-off-by: Li Lingfeng <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06block: prevent division by zero in blk_rq_stat_sum()Roman Smirnov1-1/+1
The expression dst->nr_samples + src->nr_samples may have zero value on overflow. It is necessary to add a check to avoid division by zero. Found by Linux Verification Center (linuxtesting.org) with Svace. Signed-off-by: Roman Smirnov <[email protected]> Reviewed-by: Sergey Shtylyov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: atomically update queue limits in drbd_reconsider_queue_parametersChristoph Hellwig1-73/+46
Switch drbd_reconsider_queue_parameters to set up the queue parameters in an on-stack queue_limits structure and apply the atomically. Remove various helpers that have become so trivial that they can be folded into drbd_reconsider_queue_parameters. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: split out a drbd_discard_supported helperChristoph Hellwig1-8/+17
Add a helper to check if discard is supported for a given connection / backing device combination. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Philipp Reisner <[email protected]> Reviewed-by: Lars Ellenberg <[email protected]> Tested-by: Christoph Böhmwalder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: don't set max_write_zeroes_sectors in decide_on_discard_supportChristoph Hellwig1-1/+0
fixup_write_zeroes always overrides the max_write_zeroes_sectors value a little further down the callchain, so don't bother to setup a limit in decide_on_discard_support. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Philipp Reisner <[email protected]> Reviewed-by: Lars Ellenberg <[email protected]> Tested-by: Christoph Böhmwalder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: merge drbd_setup_queue_param into drbd_reconsider_queue_parametersChristoph Hellwig1-34/+22
drbd_setup_queue_param is only called by drbd_reconsider_queue_parameters and there is no really clear boundary of responsibilities between the two. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Philipp Reisner <[email protected]> Reviewed-by: Lars Ellenberg <[email protected]> Tested-by: Christoph Böhmwalder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: refactor the backing dev max_segments calculationChristoph Hellwig1-8/+17
Factor out a drbd_backing_dev_max_segments helper that checks the backing device limitation. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Philipp Reisner <[email protected]> Reviewed-by: Lars Ellenberg <[email protected]> Tested-by: Christoph Böhmwalder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: refactor drbd_reconsider_queue_parametersChristoph Hellwig1-35/+49
Split out a drbd_max_peer_bio_size helper for the peer I/O size, and condense the various checks to a nested min3(..., max())) instead of using a lot of local variables. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06drbd: pass the max_hw_sectors limit to blk_alloc_diskChristoph Hellwig1-4/+9
Pass a queue_limits structure with the max_hw_sectors limit to blk_alloc_disk instead of updating the limit on the allocated gendisk. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06sed-opal: Remove the ret variable from the functionLi kunyu1-4/+2
The ret variable in the function has not yet been effective and can be removed. Signed-off-by: Li kunyu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06sed-opal: Remove unnecessary ‘0’ values from retLi kunyu1-2/+2
ret is assigned first, so it does not need to initialize the assignment. Signed-off-by: Li kunyu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06sed-opal: Remove unnecessary ‘0’ values from errLi zeming1-1/+1
err is assigned first, so it does not need to initialize the assignment. Signed-off-by: Li zeming <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06sed-opal: Remove unnecessary ‘0’ values from errorLi zeming1-2/+2
error is assigned first, so it does not need to initialize the assignment. Signed-off-by: Li zeming <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06block: make block_class constantRicardo B. Marliere3-3/+3
Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the block_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Cc: Greg Kroah-Hartman <[email protected]> Suggested-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Ricardo B. Marliere <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06Merge tag 'md-6.9-20240305' of ↵Jens Axboe4-40/+208
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.9/block Pull MD fixes from Song: "This set fixes two issues: 1. dmraid regression since 6.7 kernels. This issue was initially reported in [1]. This set of fix has been reviewed and tested by md and dm folks. 2. raid5 hang since 6.7 kernel, reported in [2]. We haven't got a better fix for this issue yet. This revert is a workaround. It has been applied to 6.7 stable kernels [3], and proved to be affective. We will look more into this issue for a better fix. [1] https://lore.kernel.org/linux-raid/[email protected]/ [2] https://lore.kernel.org/linux-raid/[email protected]/ [3] 87165c64fe1a in linux-6.7.y branch." * tag 'md-6.9-20240305' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: dm-raid: fix lockdep waring in "pers->hot_add_disk" dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape dm-raid: add a new helper prepare_suspend() in md_personality md/dm-raid: don't call md_reap_sync_thread() directly dm-raid: really frozen sync_thread during suspend md: add a new helper reshape_interrupted() md: export helper md_is_rdwr() md: export helpers to stop sync_thread md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
2024-03-06dasd: use the atomic queue limits APIChristoph Hellwig2-18/+24
Pass the constant limits directly to blk_mq_alloc_disk, set the nonrot flag there as well, and then use the commit API to change the transfer size and logical block size dependent values. This relies on the assumption that no I/O can be pending before the devices moves into the ready state and doesn't need extra freezing for changes to the queue limits. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06dasd: move queue setup to common codeChristoph Hellwig5-77/+42
Most of the code in setup_blk_queue is shared between all disciplines. Move it to common code and leave a method to query the maximum number of transferable blocks, and a flag to indicate discard support. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06dasd: cleamup dasd_state_basic_to_readyChristoph Hellwig1-28/+26
Reflow dasd_state_basic_to_ready a bit to make it easier to modify. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06block: Fix page refcounts for unaligned buffers in __bio_release_pages()Tony Battersby1-3/+4
Fix an incorrect number of pages being released for buffers that do not start at the beginning of a page. Fixes: 1b151e2435fc ("block: Remove special-casing of compound pages") Cc: [email protected] Signed-off-by: Tony Battersby <[email protected]> Tested-by: Greg Edwards <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-03-06Revert "drm/udl: Add ARGB8888 as a format"Douglas Anderson1-1/+0
This reverts commit 95bf25bb9ed5dedb7fb39f76489f7d6843ab0475. Apparently there was a previous discussion about emulation of formats and it was decided XRGB8888 was the only format to support for legacy userspace [1]. Remove ARGB8888. Userspace needs to be fixed to accept XRGB8888. [1] https://lore.kernel.org/r/[email protected] Acked-by: Thomas Zimmermann <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Signed-off-by: Douglas Anderson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20240306063721.1.I4a32475190334e1fa4eef4700ecd2787a43c94b5@changeid
2024-03-06phy: qcom-qmp-combo: fix type-c switch registrationJohan Hovold1-4/+4
Due to a long-standing issue in driver core, drivers may not probe defer after having registered child devices to avoid triggering a probe deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")). Move registration of the typec switch to after looking up clocks and other resources. Note that PHY creation can in theory also trigger a probe deferral when a 'phy' supply is used. This does not seem to affect the QMP PHY driver but the PHY subsystem should be reworked to address this (i.e. by separating initialisation and registration of the PHY). Fixes: 2851117f8f42 ("phy: qcom-qmp-combo: Introduce orientation switching") Cc: [email protected] # 6.5 Cc: Bjorn Andersson <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Acked-by: Vinod Koul <[email protected]> Acked-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2024-03-06phy: qcom-qmp-combo: fix drm bridge registrationJohan Hovold1-4/+4
Due to a long-standing issue in driver core, drivers may not probe defer after having registered child devices to avoid triggering a probe deferral loop (see fbc35b45f9f6 ("Add documentation on meaning of -EPROBE_DEFER")). This could potentially also trigger a bug in the DRM bridge implementation which does not expect bridges to go away even if device links may avoid triggering this (when enabled). Move registration of the DRM aux bridge to after looking up clocks and other resources. Note that PHY creation can in theory also trigger a probe deferral when a 'phy' supply is used. This does not seem to affect the QMP PHY driver but the PHY subsystem should be reworked to address this (i.e. by separating initialisation and registration of the PHY). Fixes: 35921910bbd0 ("phy: qcom: qmp-combo: switch to DRM_AUX_BRIDGE") Fixes: 1904c3f578dc ("phy: qcom-qmp-combo: Introduce drm_bridge") Cc: [email protected] # 6.5 Cc: Bjorn Andersson <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Reviewed-by: Neil Armstrong <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Acked-by: Vinod Koul <[email protected]> Acked-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2024-03-06nvme: clear caller pointer on identify failureKeith Busch1-1/+4
The memory allocated for the identification is freed on failure. Set it to NULL so the caller doesn't have a pointer to that freed address. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-03-06nvme: host: fix double-free of struct nvme_id_ns in ns_update_nuse()Shin'ichiro Kawasaki1-5/+2
When nvme_identify_ns() fails, it frees the pointer to the struct nvme_id_ns before it returns. However, ns_update_nuse() calls kfree() for the pointer even when nvme_identify_ns() fails. This results in KASAN double-free, which was observed with blktests nvme/045 with proposed patches [1] on the kernel v6.8-rc7. Fix the double-free by skipping kfree() when nvme_identify_ns() fails. Link: https://lore.kernel.org/linux-block/[email protected]/ [1] Fixes: a1a825ab6a60 ("nvme: add csi, ms and nuse to sysfs") Signed-off-by: Shin'ichiro Kawasaki <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Daniel Wagner <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Keith Busch <[email protected]>
2024-03-06timer/migration: Fix quick check reporting late expiryFrederic Weisbecker1-9/+15
When a CPU is the last active in the hierarchy and it tries to enter into idle, the quick check looking up the next event towards cpuidle heuristics may report a too late expiry, such as in the following scenario: [GRP1:0] migrator = NONE active = NONE nextevt = T0:0, T0:1 / \ [GRP0:0] [GRP0:1] migrator = NONE migrator = NONE active = NONE active = NONE nextevt = T0, T1 nextevt = T2 / \ / \ 0 1 2 3 idle idle idle idle 0) The whole system is idle, and CPU 0 was the last migrator. CPU 0 has a timer (T0), CPU 1 has a timer (T1) and CPU 2 has a timer (T2). The expire order is T0 < T1 < T2. [GRP1:0] migrator = GRP0:0 active = GRP0:0 nextevt = T0:0(i), T0:1 / \ [GRP0:0] [GRP0:1] migrator = CPU0 migrator = NONE active = CPU0 active = NONE nextevt = T0(i), T1 nextevt = T2 / \ / \ 0 1 2 3 active idle idle idle 1) CPU 0 becomes active. The (i) means a now ignored timer. [GRP1:0] migrator = GRP0:0 active = GRP0:0 nextevt = T0:1 / \ [GRP0:0] [GRP0:1] migrator = CPU0 migrator = NONE active = CPU0 active = NONE nextevt = T1 nextevt = T2 / \ / \ 0 1 2 3 active idle idle idle 2) CPU 0 handles remote. No timer actually expired but ignored timers have been cleaned out and their sibling's timers haven't been propagated. As a result the top level's next event is T2 and not T1. 3) CPU 0 tries to enter idle without any global timer enqueued and calls tmigr_quick_check(). The expiry of T2 is returned instead of the expiry of T1. When the quick check returns an expiry that is too late, the cpuidle governor may pick up a C-state that is too deep. This may be result into undesired CPU wake up latency if the next timer is actually close enough. Fix this with assuming that expiries aren't sorted top-down while performing the quick check. Pick up instead the earliest encountered one while walking up the hierarchy. 7ee988770326 ("timers: Implement the hierarchical pull model") Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06drm/i915/panelreplay: Move out psr_init_dpcd() from init_connector()Animesh Manna2-3/+3
Move psr_init_dpcd() from init-connector to connector-detect function. The dpcd probe for checking panel replay capability for external dp connector is causing delay during boot which can be optimized by moving dpcd probe to connector specific detect(). v1: Initial version. v2: Add details in commit description. [Jani] Suggested-by: Ville Syrjälä <[email protected]> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10284 Signed-off-by: Animesh Manna <[email protected]> Fixes: cceeaa312d39 ("drm/i915/panelreplay: Enable panel replay dpcd initialization for DP") Reviewed-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 1cca19bf296fae0636a637b48d195ac6b4d430c9) Signed-off-by: Joonas Lahtinen <[email protected]>
2024-03-06x86/topology: Ignore non-present APIC IDs in a present packageThomas Gleixner1-9/+30
Borislav reported that one of his systems has a broken MADT table which advertises eight present APICs and 24 non-present APICs in the same package. The non-present ones are considered hot-pluggable by the topology evaluation code, which is obviously bogus as there is no way to hot-plug within the same package. As the topology evaluation code accounts for hot-pluggable CPUs in a package, the maximum number of cores per package is computed wrong, which in turn causes the uncore performance counter driver to access non-existing MSRs. It will probably confuse other entities which rely on the maximum number of cores and threads per package too. Cure this by ignoring hot-pluggable APIC IDs within a present package. In theory it would be reasonable to just do this unconditionally, but then there is this thing called reality^Wvirtualization which ruins everything. Virtualization is the only existing user of "physical" hotplug and the virtualization tools allow the above scenario. Whether that is actually in use or not is unknown. As it can be argued that the virtualization case is not affected by the issues which exposed the reported problem, allow the bogosity if the kernel determined that it is running in a VM for now. Fixes: 89b0f15f408f ("x86/cpu/topology: Get rid of cpuinfo::x86_max_cores") Reported-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/87a5nbvccx.ffs@tglx
2024-03-06firewire: ohci: prevent leak of left-over IRQ on unbindEdmund Raile1-0/+2
Commit 5a95f1ded28691e6 ("firewire: ohci: use devres for requested IRQ") also removed the call to free_irq() in pci_remove(), leading to a leftover irq of devm_request_irq() at pci_disable_msi() in pci_remove() when unbinding the driver from the device remove_proc_entry: removing non-empty directory 'irq/136', leaking at least 'firewire_ohci' Call Trace: ? remove_proc_entry+0x19c/0x1c0 ? __warn+0x81/0x130 ? remove_proc_entry+0x19c/0x1c0 ? report_bug+0x171/0x1a0 ? console_unlock+0x78/0x120 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? remove_proc_entry+0x19c/0x1c0 unregister_irq_proc+0xf4/0x120 free_desc+0x3d/0xe0 ? kfree+0x29f/0x2f0 irq_free_descs+0x47/0x70 msi_domain_free_locked.part.0+0x19d/0x1d0 msi_domain_free_irqs_all_locked+0x81/0xc0 pci_free_msi_irqs+0x12/0x40 pci_disable_msi+0x4c/0x60 pci_remove+0x9d/0xc0 [firewire_ohci 01b483699bebf9cb07a3d69df0aa2bee71db1b26] pci_device_remove+0x37/0xa0 device_release_driver_internal+0x19f/0x200 unbind_store+0xa1/0xb0 remove irq with devm_free_irq() before pci_disable_msi() also remove it in fail_msi: of pci_probe() as this would lead to an identical leak Cc: [email protected] Fixes: 5a95f1ded28691e6 ("firewire: ohci: use devres for requested IRQ") Signed-off-by: Edmund Raile <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-03-06drm/i915/dp: Fix connector DSC HW state readoutImre Deak5-7/+29
The DSC HW state of DP connectors is read out during driver loading and system resume in intel_modeset_update_connector_atomic_state(). This function is called for all connectors though and so the state of DSI connectors will also get updated incorrectly, triggering a WARN there wrt. the DSC decompression AUX device. Fix the above by moving the DSC state readout to a new DP connector specific sync_state() hook. This is anyway the logical place to update the connector object's state vs. the connector's atomic state. Fixes: b2608c6b3212 ("drm/i915/dp_mst: Enable MST DSC decompression for all streams") Reported-and-tested-by: Drew Davenport <[email protected]> Closes: https://lore.kernel.org/all/[email protected] Reviewed-by: Ankit Nautiyal <[email protected]> Signed-off-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit a62e145981500996ea76af3d740ce0c0d74c5be0) Signed-off-by: Joonas Lahtinen <[email protected]>
2024-03-06drm/i915/selftests: Fix dependency of some timeouts on HZJanusz Krzysztofik1-2/+4
Third argument of i915_request_wait() accepts a timeout value in jiffies. Most users pass either a simple HZ based expression, or a result of msecs_to_jiffies(), or MAX_SCHEDULE_TIMEOUT, or a very small number not exceeding 4 if applicable as that value. However, there is one user -- intel_selftest_wait_for_rq() -- that passes a WAIT_FOR_RESET_TIME symbol, defined as a large constant value that most probably represents a desired timeout in ms. While that usage results in the intended value of timeout on usual x86_64 kernel configurations, it is not portable across different architectures and custom kernel configs. Rename the symbol to clearly indicate intended units and convert it to jiffies before use. Fixes: 3a4bfa091c46 ("drm/i915/selftest: Fix workarounds selftest for GuC submission") Signed-off-by: Janusz Krzysztofik <[email protected]> Cc: Rahul Kumar Singh <[email protected]> Cc: John Harrison <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 6ee3f54b880c91ab2e244eb4ffd4bfed37832b25) Signed-off-by: Joonas Lahtinen <[email protected]>
2024-03-06net/rds: fix WARNING in rds_conn_connect_if_downEdward Adam Davis2-5/+4
If connection isn't established yet, get_mr() will fail, trigger connection after get_mr(). Fixes: 584a8279a44a ("RDS: RDMA: return appropriate error on rdma map failures") Reported-and-tested-by: [email protected] Signed-off-by: Edward Adam Davis <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-03-06libceph: init the cursor when preparing sparse read in msgr2Xiubo Li1-0/+3
The cursor is no longer initialized in the OSD client, causing the sparse read state machine to fall into an infinite loop. The cursor should be initialized in IN_S_PREPARE_SPARSE_DATA state. [ idryomov: use msg instead of con->in_msg, changelog ] Link: https://tracker.ceph.com/issues/64607 Fixes: 8e46a2d068c9 ("libceph: just wait for more data to be available on the socket") Signed-off-by: Xiubo Li <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Tested-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2024-03-06Merge branch '100GbE' of ↵David S. Miller9-22/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-03-05 (idpf, ice, i40e, igc, e1000e) This series contains updates to idpf, ice, i40e, igc and e1000e drivers. Emil disables local BH on NAPI schedule for proper handling of softirqs on idpf. Jake stops reporting of virtchannel RSS option which in unsupported on ice. Rand Deeb adds null check to prevent possible null pointer dereference on ice. Michal Schmidt moves DPLL mutex initialization to resolve uninitialized mutex usage for ice. Jesse fixes incorrect variable usage for calculating Tx stats on ice. Ivan Vecera corrects logic for firmware equals check on i40e. Florian Kauer prevents memory corruption for XDP_REDIRECT on igc. Sasha reverts an incorrect use of FIELD_GET which caused a regression for Wake on LAN on e1000e. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-03-06iov_iter: get rid of 'copy_mc' flagLinus Torvalds3-42/+42
This flag is only set by one single user: the magical core dumping code that looks up user pages one by one, and then writes them out using their kernel addresses (by using a BVEC_ITER). That actually ends up being a huge problem, because while we do use copy_mc_to_kernel() for this case and it is able to handle the possible machine checks involved, nothing else is really ready to handle the failures caused by the machine check. In particular, as reported by Tong Tiangen, we don't actually support fault_in_iov_iter_readable() on a machine check area. As a result, the usual logic for writing things to a file under a filesystem lock, which involves doing a copy with page faults disabled and then if that fails trying to fault pages in without holding the locks with fault_in_iov_iter_readable() does not work at all. We could decide to always just make the MC copy "succeed" (and filling the destination with zeroes), and that would then create a core dump file that just ignores any machine checks. But honestly, this single special case has been problematic before, and means that all the normal iov_iter code ends up slightly more complex and slower. See for example commit c9eec08bac96 ("iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()") where David Howells re-organized the code just to avoid having to check the 'copy_mc' flags inside the inner iov_iter loops. So considering that we have exactly one user, and that one user is a non-critical special case that doesn't actually ever trigger in real life (Tong found this with manual error injection), the sane solution is to just decide that the onus on handling the machine check lines on that user instead. Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying the user data to a stable kernel page before writing it out. Fixes: f1982740f5e7 ("iov_iter: Convert iterate*() to inline funcs") Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Tong Tiangen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Tested-by: David Howells <[email protected]> Reviewed-by: David Howells <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Reported-by: Tong Tiangen <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-03-06Merge branch 'Improve packet offload for dual stack'Steffen Klassert2-2/+8
Mike Yu says: ==================== In the XFRM stack, whether a packet is forwarded to the IPv4 or IPv6 stack depends on the family field of the matched SA. This does not completely work for IPsec packet offload in some scenario, for example, sending an IPv6 packet that will be encrypted and encapsulated as an IPv4 packet in HW. Here are the patches to make IPsec packet offload work on the mentioned scenario. ==================== Signed-off-by: Steffen Klassert <[email protected]>
2024-03-06RAS/AMD/FMPM: Fix off by one when unwinding on errorDan Carpenter1-1/+1
Decrement the index variable i before the first iteration when freeing the remaining elements on error. Depending on where this fails it could free something from one element beyond the end of the fru_records[] array. [ bp: Massage commit message. ] Fixes: 6f15e617cc99 ("RAS: Introduce a FRU memory poison manager") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-06x86/nmi: Drop unused declaration of proc_nmi_enabled()Thomas Weißschuh1-3/+0
The declaration is unused as the definition got deleted. Fixes: 5f2b0ba4d94b ("x86, nmi_watchdog: Remove the old nmi_watchdog"). Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]