aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-08-24madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio ↵Yin Fengwei1-1/+1
for sharing check Commit fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio") replaced the page_mapcount() with folio_mapcount() to check whether the folio is shared by other mapping. It's not correct for large folios. folio_mapcount() returns the total mapcount of large folio which is not suitable to detect whether the folio is shared. Use folio_estimated_sharers() which returns a estimated number of shares. That means it's not 100% correct. It should be OK for madvise case here. User-visible effects is that the THP is skipped when user call madvise. But the correct behavior is THP should be split and processed then. NOTE: this change is a temporary fix to reduce the user-visible effects before the long term fix from David is ready. Link: https://lkml.kernel.org/r/[email protected] Fixes: fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio") Signed-off-by: Yin Fengwei <[email protected]> Reviewed-by: Yu Zhao <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Vishal Moola (Oracle) <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-24madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against ↵Yin Fengwei1-2/+2
large folio for sharing check Patch series "don't use mapcount() to check large folio sharing", v2. In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(), folio_mapcount() is used to check whether the folio is shared. But it's not correct as folio_mapcount() returns total mapcount of large folio. Use folio_estimated_sharers() here as the estimated number is enough. This patchset will fix the cases: User space application call madvise() with MADV_FREE, MADV_COLD and MADV_PAGEOUT for specific address range. There are THP mapped to the range. Without the patchset, the THP is skipped. With the patch, the THP will be split and handled accordingly. David reported the cow self test skip some cases because of MADV_PAGEOUT skip THP: https://lore.kernel.org/linux-mm/[email protected]/T/#mbf0f2ec7fbe45da47526de1d7036183981691e81 and I confirmed this patchset make it work again. This patch (of 3): Commit 07e8c82b5eff ("madvise: convert madvise_cold_or_pageout_pte_range() to use folios") replaced the page_mapcount() with folio_mapcount() to check whether the folio is shared by other mapping. It's not correct for large folio. folio_mapcount() returns the total mapcount of large folio which is not suitable to detect whether the folio is shared. Use folio_estimated_sharers() which returns a estimated number of shares. That means it's not 100% correct. It should be OK for madvise case here. User-visible effects is that the THP is skipped when user call madvise. But the correct behavior is THP should be split and processed then. NOTE: this change is a temporary fix to reduce the user-visible effects before the long term fix from David is ready. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 07e8c82b5eff ("madvise: convert madvise_cold_or_pageout_pte_range() to use folios") Signed-off-by: Yin Fengwei <[email protected]> Reviewed-by: Yu Zhao <[email protected]> Reviewed-by: Ryan Roberts <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Vishal Moola (Oracle) <[email protected]> Cc: Yang Shi <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-08-24Merge tag 'nfsd-6.5-5' of ↵Linus Torvalds2-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Two last-minute one-liners for v6.5-rc. One got lost in the shuffle, and the other was reported just this morning" - Close race window when handling FREE_STATEID operations - Fix regression in /proc/fs/nfsd/v4_end_grace introduced in v6.5-rc" * tag 'nfsd-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Fix a thinko introduced by recent trace point changes nfsd: Fix race to FREE_STATEID and cl_revoked
2023-08-24Merge tag 'spi-fix-v6.5-rc7' of ↵Linus Torvalds2-10/+15
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A couple more small driver specific fixes for v6.5. The device mode for Cadence had been broken by some recent updates done for host mode and large transfers for multi-byte words on stm32 had been broken by an API update in what I think was a rebasing incident" * tag 'spi-fix-v6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-cadence: Fix data corruption issues in slave mode spi: stm32: fix accidential revert to byte-sized transfer splitting
2023-08-24riscv: Fix build errors using binutils2.37 toolchainsMingzheng Xing1-4/+4
When building the kernel with binutils 2.37 and GCC-11.1.0/GCC-11.2.0, the following error occurs: Assembler messages: Error: cannot find default versions of the ISA extension `zicsr' Error: cannot find default versions of the ISA extension `zifencei' The above error originated from this commit of binutils[0], which has been resolved and backported by GCC-12.1.0[1] and GCC-11.3.0[2]. So fix this by change the GCC version in CONFIG_TOOLCHAIN_NEEDS_OLD_ISA_SPEC to GCC-11.3.0. Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f0bae2552db1dd4f1995608fbf6648fcee4e9e0c [0] Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ca2bbb88f999f4d3cc40e89bc1aba712505dd598 [1] Link: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d29f5d6ab513c52fd872f532c492e35ae9fd6671 [2] Fixes: ca09f772ccca ("riscv: Handle zicsr/zifencei issue between gcc and binutils") Reported-by: Conor Dooley <[email protected]> Cc: <[email protected]> Signed-off-by: Mingzheng Xing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Closes: https://lore.kernel.org/all/20230823-captive-abdomen-befd942a4a73@wendy/ Reviewed-by: Conor Dooley <[email protected]> Tested-by: Conor Dooley <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2023-08-25Merge tag 'drm-misc-fixes-2023-08-24' of ↵Dave Airlie11-55/+67
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A samsung-dsim initialization fix, a devfreq fix for panfrost, a DP DSC define fix, a recursive lock fix for dma-buf, a shader validation fix and a reference counting fix for vmwgfx Signed-off-by: Dave Airlie <[email protected]> From: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/amy26vu5xbeeikswpx7nt6rddwfocdidshrtt2qovipihx5poj@y45p3dtzrloc
2023-08-24Merge tag 'net-6.5-rc8' of ↵Linus Torvalds79-356/+664
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from wifi, can and netfilter. Fixes to fixes: - nf_tables: - GC transaction race with abort path - defer gc run if previous batch is still pending Previous releases - regressions: - ipv4: fix data-races around inet->inet_id - phy: fix deadlocking in phy_error() invocation - mdio: fix C45 read/write protocol - ipvlan: fix a reference count leak warning in ipvlan_ns_exit() - ice: fix NULL pointer deref during VF reset - i40e: fix potential NULL pointer dereferencing of pf->vf in i40e_sync_vsi_filters() - tg3: use slab_build_skb() when needed - mtk_eth_soc: fix NULL pointer on hw reset Previous releases - always broken: - core: validate veth and vxcan peer ifindexes - sched: fix a qdisc modification with ambiguous command request - devlink: add missing unregister linecard notification - wifi: mac80211: limit reorder_buf_filtered to avoid UBSAN warning - batman: - do not get eth header before batadv_check_management_packet - fix batadv_v_ogm_aggr_send memory leak - bonding: fix macvlan over alb bond support - mlxsw: set time stamp fields also when its type is MIRROR_UTC" * tag 'net-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits) selftests: bonding: add macvlan over bond testing selftest: bond: add new topo bond_topo_2d1c.sh bonding: fix macvlan over alb bond support rtnetlink: Reject negative ifindexes in RTM_NEWLINK netfilter: nf_tables: defer gc run if previous batch is still pending netfilter: nf_tables: fix out of memory error handling netfilter: nf_tables: use correct lock to protect gc_list netfilter: nf_tables: GC transaction race with abort path netfilter: nf_tables: flush pending destroy work before netlink notifier netfilter: nf_tables: validate all pending tables ibmveth: Use dcbf rather than dcbfl i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters() net/sched: fix a qdisc modification with ambiguous command request igc: Fix the typo in the PTM Control macro batman-adv: Hold rtnl lock during MTU update via netlink igb: Avoid starting unnecessary workqueues can: raw: add missing refcount for memory leak fix can: isotp: fix support for transmission of SF without flow control bnx2x: new flag for track HW resource allocation sfc: allocate a big enough SKB for loopback selftest packet ...
2023-08-24NFSD: Fix a thinko introduced by recent trace point changesChuck Lever1-0/+1
The fixed commit erroneously removed a call to nfsd_end_grace(), which makes calls to write_v4_end_grace() a no-op. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected] Fixes: 39d432fc7630 ("NFSD: trace nfsctl operations") Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2023-08-24ASoC: amd: yc: Fix a non-functional mic on Lenovo 82SJMario Limonciello1-1/+1
Lenovo 82SJ doesn't have DMIC connected like 82V2 does. Narrow the match down to only cover 82V2. Reported-by: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217063 Fixes: 2232b2dd8cd4 ("ASoC: amd: yc: Add Lenovo Yoga Slim 7 Pro X to quirks table") Signed-off-by: Mario Limonciello <[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]
2023-08-24x86/fpu: Set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4Feng Tang1-0/+7
0-Day found a 34.6% regression in stress-ng's 'af-alg' test case, and bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()"), which optimizes the FPU init order, and moves the CR4_OSXSAVE enabling into a later place: arch_cpu_finalize_init identify_boot_cpu identify_cpu generic_identify get_cpu_cap --> setup cpu capability ... fpu__init_cpu fpu__init_cpu_xstate cr4_set_bits(X86_CR4_OSXSAVE); As the FPU is not yet initialized the CPU capability setup fails to set X86_FEATURE_OSXSAVE. Many security module like 'camellia_aesni_avx_x86_64' depend on this feature and therefore fail to load, causing the regression. Cure this by setting X86_FEATURE_OSXSAVE feature right after OSXSAVE enabling. [ tglx: Moved it into the actual BSP FPU initialization code and added a comment ] Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()") Reported-by: kernel test robot <[email protected]> Signed-off-by: Feng Tang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/lkml/[email protected] Link: https://lore.kernel.org/lkml/[email protected]
2023-08-24x86/fpu: Invalidate FPU state correctly on exec()Rick Edgecombe2-3/+2
The thread flag TIF_NEED_FPU_LOAD indicates that the FPU saved state is valid and should be reloaded when returning to userspace. However, the kernel will skip doing this if the FPU registers are already valid as determined by fpregs_state_valid(). The logic embedded there considers the state valid if two cases are both true: 1: fpu_fpregs_owner_ctx points to the current tasks FPU state 2: the last CPU the registers were live in was the current CPU. This is usually correct logic. A CPU’s fpu_fpregs_owner_ctx is set to the current FPU during the fpregs_restore_userregs() operation, so it indicates that the registers have been restored on this CPU. But this alone doesn’t preclude that the task hasn’t been rescheduled to a different CPU, where the registers were modified, and then back to the current CPU. To verify that this was not the case the logic relies on the second condition. So the assumption is that if the registers have been restored, AND they haven’t had the chance to be modified (by being loaded on another CPU), then they MUST be valid on the current CPU. Besides the lazy FPU optimizations, the other cases where the FPU registers might not be valid are when the kernel modifies the FPU register state or the FPU saved buffer. In this case the operation modifying the FPU state needs to let the kernel know the correspondence has been broken. The comment in “arch/x86/kernel/fpu/context.h” has: /* ... * If the FPU register state is valid, the kernel can skip restoring the * FPU state from memory. * * Any code that clobbers the FPU registers or updates the in-memory * FPU state for a task MUST let the rest of the kernel know that the * FPU registers are no longer valid for this task. * * Either one of these invalidation functions is enough. Invalidate * a resource you control: CPU if using the CPU for something else * (with preemption disabled), FPU for the current task, or a task that * is prevented from running by the current task. */ However, this is not completely true. When the kernel modifies the registers or saved FPU state, it can only rely on __fpu_invalidate_fpregs_state(), which wipes the FPU’s last_cpu tracking. The exec path instead relies on fpregs_deactivate(), which sets the CPU’s FPU context to NULL. This was observed to fail to restore the reset FPU state to the registers when returning to userspace in the following scenario: 1. A task is executing in userspace on CPU0 - CPU0’s FPU context points to tasks - fpu->last_cpu=CPU0 2. The task exec()’s 3. While in the kernel the task is preempted - CPU0 gets a thread executing in the kernel (such that no other FPU context is activated) - Scheduler sets task’s fpu->last_cpu=CPU0 when scheduling out 4. Task is migrated to CPU1 5. Continuing the exec(), the task gets to fpu_flush_thread()->fpu_reset_fpregs() - Sets CPU1’s fpu context to NULL - Copies the init state to the task’s FPU buffer - Sets TIF_NEED_FPU_LOAD on the task 6. The task reschedules back to CPU0 before completing the exec() and returning to userspace - During the reschedule, scheduler finds TIF_NEED_FPU_LOAD is set - Skips saving the registers and updating task’s fpu→last_cpu, because TIF_NEED_FPU_LOAD is the canonical source. 7. Now CPU0’s FPU context is still pointing to the task’s, and fpu->last_cpu is still CPU0. So fpregs_state_valid() returns true even though the reset FPU state has not been restored. So the root cause is that exec() is doing the wrong kind of invalidate. It should reset fpu->last_cpu via __fpu_invalidate_fpregs_state(). Further, fpu__drop() doesn't really seem appropriate as the task (and FPU) are not going away, they are just getting reset as part of an exec. So switch to __fpu_invalidate_fpregs_state(). Also, delete the misleading comment that says that either kind of invalidate will be enough, because it’s not always the case. Fixes: 33344368cb08 ("x86/fpu: Clean up the fpu__clear() variants") Reported-by: Lei Wang <[email protected]> Signed-off-by: Rick Edgecombe <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Lijun Pan <[email protected]> Reviewed-by: Sohil Mehta <[email protected]> Acked-by: Lijun Pan <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
2023-08-24Merge tag 'nf-23-08-23' of ↵Paolo Abeni5-11/+37
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net This PR contains nf_tables updates for your *net* tree. First patch fixes table validation, I broke this in 6.4 when tracking validation state per table, reported by Pablo, fixup from myself. Second patch makes sure objects waiting for memory release have been released, this was broken in 6.1, patch from Pablo Neira Ayuso. Patch three is a fix-for-fix from previous PR: In case a transaction gets aborted, gc sequence counter needs to be incremented so pending gc requests are invalidated, from Pablo. Same for patch 4: gc list needs to use gc list lock, not destroy lock, also from Pablo. Patch 5 fixes a UaF in a set backend, but this should only occur when failslab is enabled for GFP_KERNEL allocations, broken since feature was added in 5.6, from myself. Patch 6 fixes a double-free bug that was also added via previous PR: We must not schedule gc work if the previous batch is still queued. netfilter pull request 2023-08-23 * tag 'nf-23-08-23' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: defer gc run if previous batch is still pending netfilter: nf_tables: fix out of memory error handling netfilter: nf_tables: use correct lock to protect gc_list netfilter: nf_tables: GC transaction race with abort path netfilter: nf_tables: flush pending destroy work before netlink notifier netfilter: nf_tables: validate all pending tables ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-08-24Merge branch 'fix-macvlan-over-alb-bond-support'Paolo Abeni7-127/+272
Hangbin Liu says: ==================== fix macvlan over alb bond support Currently, the macvlan over alb bond is broken after commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds"). Fix this and add relate tests. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-08-24selftests: bonding: add macvlan over bond testingHangbin Liu2-1/+101
Add a macvlan over bonding test with mode active-backup, balance-tlb and balance-alb. ]# ./bond_macvlan.sh TEST: active-backup: IPv4: client->server [ OK ] TEST: active-backup: IPv6: client->server [ OK ] TEST: active-backup: IPv4: client->macvlan_1 [ OK ] TEST: active-backup: IPv6: client->macvlan_1 [ OK ] TEST: active-backup: IPv4: client->macvlan_2 [ OK ] TEST: active-backup: IPv6: client->macvlan_2 [ OK ] TEST: active-backup: IPv4: macvlan_1->macvlan_2 [ OK ] TEST: active-backup: IPv6: macvlan_1->macvlan_2 [ OK ] TEST: active-backup: IPv4: server->client [ OK ] TEST: active-backup: IPv6: server->client [ OK ] TEST: active-backup: IPv4: macvlan_1->client [ OK ] TEST: active-backup: IPv6: macvlan_1->client [ OK ] TEST: active-backup: IPv4: macvlan_2->client [ OK ] TEST: active-backup: IPv6: macvlan_2->client [ OK ] TEST: active-backup: IPv4: macvlan_2->macvlan_2 [ OK ] TEST: active-backup: IPv6: macvlan_2->macvlan_2 [ OK ] [...] TEST: balance-alb: IPv4: client->server [ OK ] TEST: balance-alb: IPv6: client->server [ OK ] TEST: balance-alb: IPv4: client->macvlan_1 [ OK ] TEST: balance-alb: IPv6: client->macvlan_1 [ OK ] TEST: balance-alb: IPv4: client->macvlan_2 [ OK ] TEST: balance-alb: IPv6: client->macvlan_2 [ OK ] TEST: balance-alb: IPv4: macvlan_1->macvlan_2 [ OK ] TEST: balance-alb: IPv6: macvlan_1->macvlan_2 [ OK ] TEST: balance-alb: IPv4: server->client [ OK ] TEST: balance-alb: IPv6: server->client [ OK ] TEST: balance-alb: IPv4: macvlan_1->client [ OK ] TEST: balance-alb: IPv6: macvlan_1->client [ OK ] TEST: balance-alb: IPv4: macvlan_2->client [ OK ] TEST: balance-alb: IPv6: macvlan_2->client [ OK ] TEST: balance-alb: IPv4: macvlan_2->macvlan_2 [ OK ] TEST: balance-alb: IPv6: macvlan_2->macvlan_2 [ OK ] Signed-off-by: Hangbin Liu <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-08-24selftest: bond: add new topo bond_topo_2d1c.shHangbin Liu4-113/+167
Add a new testing topo bond_topo_2d1c.sh which is used more commonly. Make bond_topo_3d1c.sh just source bond_topo_2d1c.sh and add the extra link. Signed-off-by: Hangbin Liu <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-08-24bonding: fix macvlan over alb bond supportHangbin Liu2-13/+4
The commit 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") aims to enable the use of macvlans on top of rlb bond mode. However, the current rlb bond mode only handles ARP packets to update remote neighbor entries. This causes an issue when a macvlan is on top of the bond, and remote devices send packets to the macvlan using the bond's MAC address as the destination. After delivering the packets to the macvlan, the macvlan will rejects them as the MAC address is incorrect. Consequently, this commit makes macvlan over bond non-functional. To address this problem, one potential solution is to check for the presence of a macvlan port on the bond device using netif_is_macvlan_port(bond->dev) and return NULL in the rlb_arp_xmit() function. However, this approach doesn't fully resolve the situation when a VLAN exists between the bond and macvlan. So let's just do a partial revert for commit 14af9963ba1e in rlb_arp_xmit(). As the comment said, Don't modify or load balance ARPs that do not originate locally. Fixes: 14af9963ba1e ("bonding: Support macvlans on top of tlb/rlb mode bonds") Reported-by: [email protected] Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2117816 Signed-off-by: Hangbin Liu <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-08-24rtnetlink: Reject negative ifindexes in RTM_NEWLINKIdo Schimmel1-0/+3
Negative ifindexes are illegal, but the kernel does not validate the ifindex in the ancillary header of RTM_NEWLINK messages, resulting in the kernel generating a warning [1] when such an ifindex is specified. Fix by rejecting negative ifindexes. [1] WARNING: CPU: 0 PID: 5031 at net/core/dev.c:9593 dev_index_reserve+0x1a2/0x1c0 net/core/dev.c:9593 [...] Call Trace: <TASK> register_netdevice+0x69a/0x1490 net/core/dev.c:10081 br_dev_newlink+0x27/0x110 net/bridge/br_netlink.c:1552 rtnl_newlink_create net/core/rtnetlink.c:3471 [inline] __rtnl_newlink+0x115e/0x18c0 net/core/rtnetlink.c:3688 rtnl_newlink+0x67/0xa0 net/core/rtnetlink.c:3701 rtnetlink_rcv_msg+0x439/0xd30 net/core/rtnetlink.c:6427 netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2545 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0x536/0x810 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x93c/0xe40 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:728 [inline] sock_sendmsg+0xd9/0x180 net/socket.c:751 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2538 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2592 __sys_sendmsg+0x117/0x1e0 net/socket.c:2621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 38f7b870d4a6 ("[RTNETLINK]: Link creation API") Reported-by: [email protected] Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2023-08-24Merge tag 'asoc-fix-v6.5-rc7' of ↵Takashi Iwai8-29/+84
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.5 A relatively large but generally not super urgent set of fixes for ASoC, including some quirks and a MAINTAINERS update. There's also an update to cs35l56 to change the firmware ABI, there are no current shipping systems which use the current interface and the sooner we get the new interface in the less likely it is that something will start. It'd be nice if these landed for v6.5 but not the end of the world if they wait till v6.6.
2023-08-23Merge tag 'acpi-6.5-rc8' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Make an existing ACPI IRQ override quirk for PCSpecialist Elimina Pro 16 M work as intended (Hans de Goede)" * tag 'acpi-6.5-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: Fix IRQ override quirk for PCSpecialist Elimina Pro 16 M
2023-08-23drm/i915: Fix HPD polling, reenabling the output poll work as neededImre Deak1-2/+2
After the commit in the Fixes: line below, HPD polling stopped working on i915, since after that change calling drm_kms_helper_poll_enable() doesn't restart drm_mode_config::output_poll_work if the work was stopped (no connectors needing polling) and enabling polling for a connector (during runtime suspend or detecting an HPD IRQ storm). After the above change calling drm_kms_helper_poll_enable() is a nop after it's been called already and polling for some connectors was disabled/re-enabled. Fix this by calling drm_kms_helper_poll_reschedule() added in the previous patch instead, which reschedules the work whenever expected. Fixes: d33a54e3991d ("drm/probe_helper: sort out poll_running vs poll_enabled") CC: [email protected] # 6.4+ Cc: Dmitry Baryshkov <[email protected]> Cc: [email protected] Reviewed-by: Jouni Högander <[email protected]> Signed-off-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 50452f2f76852322620b63e62922b85e955abe94) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-08-23drm: Add an HPD poll helper to reschedule the poll workImre Deak2-22/+47
Add a helper to reschedule drm_mode_config::output_poll_work after polling has been enabled for a connector (and needing a reschedule, since previously polling was disabled for all connectors and hence output_poll_work was not running). This is needed by the next patch fixing HPD polling on i915. CC: [email protected] # 6.4+ Cc: Dmitry Baryshkov <[email protected]> Cc: [email protected] Reviewed-by: Jouni Högander <[email protected]> Reviewed-by: Dmitry Baryshkov <[email protected]> Signed-off-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit fe2352fd64029918174de4b460dfe6df0c6911cd) Signed-off-by: Rodrigo Vivi <[email protected]>
2023-08-23Merge patch series "riscv: fix ptrace and export VLENB"Palmer Dabbelt4-71/+3
Andy Chiu <[email protected]> says: We add a vlenb field in Vector context and save it with the riscv_vstate_save() macro. It should not cause performance regression as VLENB is a design-time constant and is frequently used by hardware. Also, adding this field into the __sc_riscv_v_state may benifit us on a future compatibility issue becuse a hardware may have writable VLENB. Adding and saving VLENB have an immediate benifit as it gives ptrace a better view of the Vector extension and makes it possible to reconstruct Vector register files from the dump without doing an additional csr read. This patchset also sync the number of note types between us and gdb for riscv to solve a conflicting note. This is not an ABI break given that 6.5 has not been released yet. * b4-shazam-merge: RISC-V: vector: export VLENB csr in __sc_riscv_v_state RISC-V: Remove ptrace support for vectors Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2023-08-23gpio: sim: pass the GPIO device's software node to irq domainBartosz Golaszewski1-1/+1
Associate the swnode of the GPIO device's (which is the interrupt controller here) with the irq domain. Otherwise the interrupt-controller device attribute is a no-op. Fixes: cb8c474e79be ("gpio: sim: new testing module") Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]>
2023-08-23gpio: sim: dispose of irq mappings before destroying the irq_sim domainBartosz Golaszewski1-0/+13
If a GPIO simulator device is unbound with interrupts still requested, we will hit a use-after-free issue in __irq_domain_deactivate_irq(). The owner of the irq domain must dispose of all mappings before destroying the domain object. Fixes: cb8c474e79be ("gpio: sim: new testing module") Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]>
2023-08-23drm/vmwgfx: Fix possible invalid drm gem put callsZack Rusin6-16/+16
vmw_bo_unreference sets the input buffer to null on exit, resulting in null ptr deref's on the subsequent drm gem put calls. This went unnoticed because only very old userspace would be exercising those paths but it wouldn't be hard to hit on old distros with brand new kernels. Introduce a new function that abstracts unrefing of user bo's to make the code cleaner and more explicit. Signed-off-by: Zack Rusin <[email protected]> Reported-by: Ian Forbes <[email protected]> Fixes: 9ef8d83e8e25 ("drm/vmwgfx: Do not drop the reference to the handle too soon") Cc: <[email protected]> # v6.4+ Reviewed-by: Maaz Mombasawala<[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-08-23drm/vmwgfx: Fix shader stage validationZack Rusin2-18/+23
For multiple commands the driver was not correctly validating the shader stages resulting in possible kernel oopses. The validation code was only. if ever, checking the upper bound on the shader stages but never a lower bound (valid shader stages start at 1 not 0). Fixes kernel oopses ending up in vmw_binding_add, e.g.: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 2443 Comm: testcase Not tainted 6.3.0-rc4-vmwgfx #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 RIP: 0010:vmw_binding_add+0x4c/0x140 [vmwgfx] Code: 7e 30 49 83 ff 0e 0f 87 ea 00 00 00 4b 8d 04 7f 89 d2 89 cb 48 c1 e0 03 4c 8b b0 40 3d 93 c0 48 8b 80 48 3d 93 c0 49 0f af de <48> 03 1c d0 4c 01 e3 49 8> RSP: 0018:ffffb8014416b968 EFLAGS: 00010206 RAX: ffffffffc0933ec0 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 00000000ffffffff RSI: ffffb8014416b9c0 RDI: ffffb8014316f000 RBP: ffffb8014416b998 R08: 0000000000000003 R09: 746f6c735f726564 R10: ffffffffaaf2bda0 R11: 732e676e69646e69 R12: ffffb8014316f000 R13: ffffb8014416b9c0 R14: 0000000000000040 R15: 0000000000000006 FS: 00007fba8c0af740(0000) GS:ffff8a1277c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000007c0933eb8 CR3: 0000000118244001 CR4: 00000000003706e0 Call Trace: <TASK> vmw_view_bindings_add+0xf5/0x1b0 [vmwgfx] ? ___drm_dbg+0x8a/0xb0 [drm] vmw_cmd_dx_set_shader_res+0x8f/0xc0 [vmwgfx] vmw_execbuf_process+0x590/0x1360 [vmwgfx] vmw_execbuf_ioctl+0x173/0x370 [vmwgfx] ? __drm_dev_dbg+0xb4/0xe0 [drm] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx] drm_ioctl_kernel+0xbc/0x160 [drm] drm_ioctl+0x2d2/0x580 [drm] ? __pfx_vmw_execbuf_ioctl+0x10/0x10 [vmwgfx] ? do_fault+0x1a6/0x420 vmw_generic_ioctl+0xbd/0x180 [vmwgfx] vmw_unlocked_ioctl+0x19/0x20 [vmwgfx] __x64_sys_ioctl+0x96/0xd0 do_syscall_64+0x5d/0x90 ? handle_mm_fault+0xe4/0x2f0 ? debug_smp_processor_id+0x1b/0x30 ? fpregs_assert_state_consistent+0x2e/0x50 ? exit_to_user_mode_prepare+0x40/0x180 ? irqentry_exit_to_user_mode+0xd/0x20 ? irqentry_exit+0x3f/0x50 ? exc_page_fault+0x8b/0x180 entry_SYSCALL_64_after_hwframe+0x72/0xdc Signed-off-by: Zack Rusin <[email protected]> Cc: [email protected] Reported-by: Ziming Zhang <[email protected]> Testcase-found-by: Niels De Graef <[email protected]> Fixes: d80efd5cb3de ("drm/vmwgfx: Initial DX support") Cc: <[email protected]> # v4.3+ Reviewed-by: Maaz Mombasawala<[email protected]> Reviewed-by: Martin Krastev <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-08-23ALSA: ymfpci: Fix the missing snd_card_free() call at probe errorTakashi Iwai1-2/+8
Like a few other drivers, YMFPCI driver needs to clean up with snd_card_free() call at an error path of the probe; otherwise the other devres resources are released before the card and it results in the UAF. This patch uses the helper for handling the probe error gracefully. Fixes: f33fc1576757 ("ALSA: ymfpci: Create card with device-managed snd_devm_card_new()") Cc: <[email protected]> Reported-and-tested-by: Takashi Yano <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-08-23Merge tag 'platform-drivers-x86-v6.5-5' of ↵Linus Torvalds3-0/+13
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Final set of three small fixes for 6.5" * tag 'platform-drivers-x86-v6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/mellanox: Fix mlxbf-tmfifo not handling all virtio CONSOLE notifications platform/x86: ideapad-laptop: Add support for new hotkeys found on ThinkBook 14s Yoga ITL platform/x86: lenovo-ymc: Add Lenovo Yoga 7 14ACN6 to ec_trigger_quirk_dmi_table
2023-08-23platform/mellanox: Fix mlxbf-tmfifo not handling all virtio CONSOLE ↵Shih-Yi Chen1-0/+1
notifications rshim console does not show all entries of dmesg. Fixed by setting MLXBF_TM_TX_LWM_IRQ for every CONSOLE notification. Signed-off-by: Shih-Yi Chen <[email protected]> Reviewed-by: Liming Sung <[email protected]> Reviewed-by: David Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2023-08-23netfilter: nf_tables: defer gc run if previous batch is still pendingFlorian Westphal3-0/+11
Don't queue more gc work, else we may queue the same elements multiple times. If an element is flagged as dead, this can mean that either the previous gc request was invalidated/discarded by a transaction or that the previous request is still pending in the system work queue. The latter will happen if the gc interval is set to a very low value, e.g. 1ms, and system work queue is backlogged. The sets refcount is 1 if no previous gc requeusts are queued, so add a helper for this and skip gc run if old requests are pending. Add a helper for this and skip the gc run in this case. Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by: Florian Westphal <[email protected]> Reviewed-by: Pablo Neira Ayuso <[email protected]>
2023-08-23netfilter: nf_tables: fix out of memory error handlingFlorian Westphal1-3/+10
Several instances of pipapo_resize() don't propagate allocation failures, this causes a crash when fault injection is enabled for gfp_kernel slabs. Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Florian Westphal <[email protected]> Reviewed-by: Stefano Brivio <[email protected]>
2023-08-23netfilter: nf_tables: use correct lock to protect gc_listPablo Neira Ayuso1-2/+2
Use nf_tables_gc_list_lock spinlock, not nf_tables_destroy_list_lock to protect the gc list. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-23netfilter: nf_tables: GC transaction race with abort pathPablo Neira Ayuso1-1/+5
Abort path is missing a synchronization point with GC transactions. Add GC sequence number hence any GC transaction losing race will be discarded. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-23netfilter: nf_tables: flush pending destroy work before netlink notifierPablo Neira Ayuso1-1/+1
Destroy work waits for the RCU grace period then it releases the objects with no mutex held. All releases objects follow this path for transactions, therefore, order is guaranteed and references to top-level objects in the hierarchy remain valid. However, netlink notifier might interfer with pending destroy work. rcu_barrier() is not correct because objects are not release via RCU callback. Flush destroy work before releasing objects from netlink notifier path. Fixes: d4bc8271db21 ("netfilter: nf_tables: netlink notifier might race to release objects") Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-23netfilter: nf_tables: validate all pending tablesFlorian Westphal2-4/+8
We have to validate all tables in the transaction that are in VALIDATE_DO state, the blamed commit below did not move the break statement to its right location so we only validate one table. Moreover, we can't init table->validate to _SKIP when a table object is allocated. If we do, then if a transcaction creates a new table and then fails the transaction, nfnetlink will loop and nft will hang until user cancels the command. Add back the pernet state as a place to stash the last state encountered. This is either _DO (we hit an error during commit validation) or _SKIP (transaction passed all checks). Fixes: 00c320f9b755 ("netfilter: nf_tables: make validation state per table") Reported-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2023-08-23ASoC: cs35l41: Correct amp_gain_tlv valuesCharles Keepax1-1/+1
The current analog gain TLV seems to have completely incorrect values in it. The gain starts at 0.5dB, proceeds in 1dB steps, and has no mute value, correct the control to match. Signed-off-by: Charles Keepax <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-08-23ibmveth: Use dcbf rather than dcbflMichael Ellerman1-1/+1
When building for power4, newer binutils don't recognise the "dcbfl" extended mnemonic. dcbfl RA, RB is equivalent to dcbf RA, RB, 1. Switch to "dcbf" to avoid the build error. Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-23i40e: fix potential NULL pointer dereferencing of pf->vf i40e_sync_vsi_filters()Andrii Staikov1-2/+3
Add check for pf->vf not being NULL before dereferencing pf->vf[vsi->vf_id] in updating VSI filter sync. Add a similar check before dereferencing !pf->vf[vsi->vf_id].trusted in the condition for clearing promisc mode bit. Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning") Signed-off-by: Andrii Staikov <[email protected]> Signed-off-by: Aleksandr Loktionov <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-23net/sched: fix a qdisc modification with ambiguous command requestJamal Hadi Salim1-13/+40
When replacing an existing root qdisc, with one that is of the same kind, the request boils down to essentially a parameterization change i.e not one that requires allocation and grafting of a new qdisc. syzbot was able to create a scenario which resulted in a taprio qdisc replacing an existing taprio qdisc with a combination of NLM_F_CREATE, NLM_F_REPLACE and NLM_F_EXCL leading to create and graft scenario. The fix ensures that only when the qdisc kinds are different that we should allow a create and graft, otherwise it goes into the "change" codepath. While at it, fix the code and comments to improve readability. While syzbot was able to create the issue, it did not zone on the root cause. Analysis from Vladimir Oltean <[email protected]> helped narrow it down. v1->V2 changes: - remove "inline" function definition (Vladmir) - remove extrenous braces in branches (Vladmir) - change inline function names (Pedro) - Run tdc tests (Victor) v2->v3 changes: - dont break else/if (Simon) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: [email protected] Closes: https://lore.kernel.org/netdev/20230816225759.g25x76kmgzya2gei@skbuf/T/ Tested-by: Vladimir Oltean <[email protected]> Tested-by: Victor Nogueira <[email protected]> Reviewed-by: Pedro Tammela <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-08-23dma-buf/sw_sync: Avoid recursive lock during fence signalRob Clark1-9/+9
If a signal callback releases the sw_sync fence, that will trigger a deadlock as the timeline_fence_release recurses onto the fence->lock (used both for signaling and the the timeline tree). To avoid that, temporarily hold an extra reference to the signalled fences until after we drop the lock. (This is an alternative implementation of https://patchwork.kernel.org/patch/11664717/ which avoids some potential UAF issues with the original patch.) v2: Remove now obsolete comment, use list_move_tail() and list_del_init() Reported-by: Bas Nieuwenhuizen <[email protected]> Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free") Signed-off-by: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
2023-08-23media: vcodec: Fix potential array out-of-bounds in encoder queue_setupWei Chen1-0/+2
variable *nplanes is provided by user via system call argument. The possible value of q_data->fmt->num_planes is 1-3, while the value of *nplanes can be 1-8. The array access by index i can cause array out-of-bounds. Fix this bug by checking *nplanes against the array size. Fixes: 4e855a6efa54 ("[media] vcodec: mediatek: Add Mediatek V4L2 Video Encoder Driver") Signed-off-by: Wei Chen <[email protected]> Cc: [email protected] Reviewed-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Hans Verkuil <[email protected]>
2023-08-22igc: Fix the typo in the PTM Control macroSasha Neftin1-1/+1
The IGC_PTM_CTRL_SHRT_CYC defines the time between two consecutive PTM requests. The bit resolution of this field is six bits. That bit five was missing in the mask. This patch comes to correct the typo in the IGC_PTM_CTRL_SHRT_CYC macro. Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()") Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Naama Meir <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Kalesh AP <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22batman-adv: Hold rtnl lock during MTU update via netlinkSven Eckelmann1-0/+3
The automatic recalculation of the maximum allowed MTU is usually triggered by code sections which are already rtnl lock protected by callers outside of batman-adv. But when the fragmentation setting is changed via batman-adv's own batadv genl family, then the rtnl lock is not yet taken. But dev_set_mtu requires that the caller holds the rtnl lock because it uses netdevice notifiers. And this code will then fail the check for this lock: RTNL: assertion failed at net/core/dev.c (1953) Cc: [email protected] Reported-by: [email protected] Fixes: c6a953cce8d0 ("batman-adv: Trigger events for auto adjusted MTU") Signed-off-by: Sven Eckelmann <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/20230821-batadv-missing-mtu-rtnl-lock-v1-1-1c5a7bfe861e@narfation.org Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22igb: Avoid starting unnecessary workqueuesAlessio Igor Bogani1-12/+12
If ptp_clock_register() fails or CONFIG_PTP isn't enabled, avoid starting PTP related workqueues. In this way we can fix this: BUG: unable to handle page fault for address: ffffc9000440b6f8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 100000067 P4D 100000067 PUD 1001e0067 PMD 107dc5067 PTE 0 Oops: 0000 [#1] PREEMPT SMP [...] Workqueue: events igb_ptp_overflow_check RIP: 0010:igb_rd32+0x1f/0x60 [...] Call Trace: igb_ptp_read_82580+0x20/0x50 timecounter_read+0x15/0x60 igb_ptp_overflow_check+0x1a/0x50 process_one_work+0x1cb/0x3c0 worker_thread+0x53/0x3f0 ? rescuer_thread+0x370/0x370 kthread+0x142/0x160 ? kthread_associate_blkcg+0xc0/0xc0 ret_from_fork+0x1f/0x30 Fixes: 1f6e8178d685 ("igb: Prevent dropped Tx timestamps via work items and interrupts.") Fixes: d339b1331616 ("igb: add PTP Hardware Clock code") Signed-off-by: Alessio Igor Bogani <[email protected]> Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22Merge branch '100GbE' of ↵Jakub Kicinski5-33/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-08-21 (ice) This series contains updates to ice driver only. Jesse fixes an issue on calculating buffer size. Petr Oros reverts a commit that does not fully resolve VF reset issues and implements one that provides a fuller fix. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix NULL pointer deref during VF reset Revert "ice: Fix ice VF reset during iavf initialization" ice: fix receive buffer size miscalculation ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22Merge branch 'can-fixes-for-6-5-rc7'Jakub Kicinski2-24/+33
Oliver Hartkopp says: ==================== CAN fixes for 6.5-rc7 The isotp fix removes an unnecessary check which leads to delays and/or a wrong error notification. The fix for the CAN_RAW socket solves the last issue that has been introduced with commit ee8b94c8510c ("can: raw: fix receiver memory leak") in this upstream cycle (detected by Eric Dumazet). ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22can: raw: add missing refcount for memory leak fixOliver Hartkopp1-9/+26
Commit ee8b94c8510c ("can: raw: fix receiver memory leak") introduced a new reference to the CAN netdevice that has assigned CAN filters. But this new ro->dev reference did not maintain its own refcount which lead to another KASAN use-after-free splat found by Eric Dumazet. This patch ensures a proper refcount for the CAN nedevice. Fixes: ee8b94c8510c ("can: raw: fix receiver memory leak") Reported-by: Eric Dumazet <[email protected]> Cc: Ziyang Xuan <[email protected]> Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22can: isotp: fix support for transmission of SF without flow controlOliver Hartkopp1-15/+7
The original implementation had a very simple handling for single frame transmissions as it just sent the single frame without a timeout handling. With the new echo frame handling the echo frame was also introduced for single frames but the former exception ('simple without timers') has been maintained by accident. This leads to a 1 second timeout when closing the socket and to an -ECOMM error when CAN_ISOTP_WAIT_TX_DONE is selected. As the echo handling is always active (also for single frames) remove the wrong extra condition for single frames. Fixes: 9f39d36530e5 ("can: isotp: add support for transmission without flow control") Signed-off-by: Oliver Hartkopp <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22bnx2x: new flag for track HW resource allocationThinh Tran4-28/+44
While injecting PCIe errors to the upstream PCIe switch of a BCM57810 NIC, system hangs/crashes were observed. After several calls to bnx2x_tx_timout() complete, bnx2x_nic_unload() is called to free up HW resources and bnx2x_napi_disable() is called to release NAPI objects. Later, when the EEH driver calls bnx2x_io_slot_reset() to complete the recovery process, bnx2x attempts to disable NAPI again by calling bnx2x_napi_disable() and freeing resources which have already been freed, resulting in a hang or crash. Introduce a new flag to track the HW resource and NAPI allocation state, refactor duplicated code into a single function, check page pool allocation status before freeing, and reduces debug output when a TX timeout event occurs. Reviewed-by: Manish Chopra <[email protected]> Tested-by: Abdul Haleem <[email protected]> Tested-by: David Christensen <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Venkata Sai Duggi <[email protected]> Signed-off-by: Thinh Tran <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-08-22clk: Fix slab-out-of-bounds error in devm_clk_release()Andrey Skvortsov1-6/+7
Problem can be reproduced by unloading snd_soc_simple_card, because in devm_get_clk_from_child() devres data is allocated as `struct clk`, but devm_clk_release() expects devres data to be `struct devm_clk_state`. KASAN report: ================================================================== BUG: KASAN: slab-out-of-bounds in devm_clk_release+0x20/0x54 Read of size 8 at addr ffffff800ee09688 by task (udev-worker)/287 Call trace: dump_backtrace+0xe8/0x11c show_stack+0x1c/0x30 dump_stack_lvl+0x60/0x78 print_report+0x150/0x450 kasan_report+0xa8/0xf0 __asan_load8+0x78/0xa0 devm_clk_release+0x20/0x54 release_nodes+0x84/0x120 devres_release_all+0x144/0x210 device_unbind_cleanup+0x1c/0xac really_probe+0x2f0/0x5b0 __driver_probe_device+0xc0/0x1f0 driver_probe_device+0x68/0x120 __driver_attach+0x140/0x294 bus_for_each_dev+0xec/0x160 driver_attach+0x38/0x44 bus_add_driver+0x24c/0x300 driver_register+0xf0/0x210 __platform_driver_register+0x48/0x54 asoc_simple_card_init+0x24/0x1000 [snd_soc_simple_card] do_one_initcall+0xac/0x340 do_init_module+0xd0/0x300 load_module+0x2ba4/0x3100 __do_sys_init_module+0x2c8/0x300 __arm64_sys_init_module+0x48/0x5c invoke_syscall+0x64/0x190 el0_svc_common.constprop.0+0x124/0x154 do_el0_svc+0x44/0xdc el0_svc+0x14/0x50 el0t_64_sync_handler+0xec/0x11c el0t_64_sync+0x14c/0x150 Allocated by task 287: kasan_save_stack+0x38/0x60 kasan_set_track+0x28/0x40 kasan_save_alloc_info+0x20/0x30 __kasan_kmalloc+0xac/0xb0 __kmalloc_node_track_caller+0x6c/0x1c4 __devres_alloc_node+0x44/0xb4 devm_get_clk_from_child+0x44/0xa0 asoc_simple_parse_clk+0x1b8/0x1dc [snd_soc_simple_card_utils] simple_parse_node.isra.0+0x1ec/0x230 [snd_soc_simple_card] simple_dai_link_of+0x1bc/0x334 [snd_soc_simple_card] __simple_for_each_link+0x2ec/0x320 [snd_soc_simple_card] asoc_simple_probe+0x468/0x4dc [snd_soc_simple_card] platform_probe+0x90/0xf0 really_probe+0x118/0x5b0 __driver_probe_device+0xc0/0x1f0 driver_probe_device+0x68/0x120 __driver_attach+0x140/0x294 bus_for_each_dev+0xec/0x160 driver_attach+0x38/0x44 bus_add_driver+0x24c/0x300 driver_register+0xf0/0x210 __platform_driver_register+0x48/0x54 asoc_simple_card_init+0x24/0x1000 [snd_soc_simple_card] do_one_initcall+0xac/0x340 do_init_module+0xd0/0x300 load_module+0x2ba4/0x3100 __do_sys_init_module+0x2c8/0x300 __arm64_sys_init_module+0x48/0x5c invoke_syscall+0x64/0x190 el0_svc_common.constprop.0+0x124/0x154 do_el0_svc+0x44/0xdc el0_svc+0x14/0x50 el0t_64_sync_handler+0xec/0x11c el0t_64_sync+0x14c/0x150 The buggy address belongs to the object at ffffff800ee09600 which belongs to the cache kmalloc-256 of size 256 The buggy address is located 136 bytes inside of 256-byte region [ffffff800ee09600, ffffff800ee09700) The buggy address belongs to the physical page: page:000000002d97303b refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4ee08 head:000000002d97303b order:1 compound_mapcount:0 compound_pincount:0 flags: 0x10200(slab|head|zone=0) raw: 0000000000010200 0000000000000000 dead000000000122 ffffff8002c02480 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffff800ee09580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffff800ee09600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffff800ee09680: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffffff800ee09700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffff800ee09780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Fixes: abae8e57e49a ("clk: generalize devm_clk_get() a bit") Signed-off-by: Andrey Skvortsov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Stephen Boyd <[email protected]>