aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-07-24can: j1939: j1939_session_deactivate(): clarify lifetime of session objectOleksij Rempel1-2/+7
The j1939_session_deactivate() is decrementing the session ref-count and potentially can free() the session. This would cause use-after-free situation. However, the code calling j1939_session_deactivate() does always hold another reference to the session, so that it would not be free()ed in this code path. This patch adds a comment to make this clear and a WARN_ON, to ensure that future changes will not violate this requirement. Further this patch avoids dereferencing the session pointer as a precaution to avoid use-after-free if the session is actually free()ed. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Link: https://lore.kernel.org/r/[email protected] Reported-by: Xiaochen Zou <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2021-07-24can: raw: raw_setsockopt(): fix raw_rcv panic for sock UAFZiyang Xuan1-2/+18
We get a bug during ltp can_filter test as following. =========================================== [60919.264984] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [60919.265223] PGD 8000003dda726067 P4D 8000003dda726067 PUD 3dda727067 PMD 0 [60919.265443] Oops: 0000 [#1] SMP PTI [60919.265550] CPU: 30 PID: 3638365 Comm: can_filter Kdump: loaded Tainted: G W 4.19.90+ #1 [60919.266068] RIP: 0010:selinux_socket_sock_rcv_skb+0x3e/0x200 [60919.293289] RSP: 0018:ffff8d53bfc03cf8 EFLAGS: 00010246 [60919.307140] RAX: 0000000000000000 RBX: 000000000000001d RCX: 0000000000000007 [60919.320756] RDX: 0000000000000001 RSI: ffff8d5104a8ed00 RDI: ffff8d53bfc03d30 [60919.334319] RBP: ffff8d9338056800 R08: ffff8d53bfc29d80 R09: 0000000000000001 [60919.347969] R10: ffff8d53bfc03ec0 R11: ffffb8526ef47c98 R12: ffff8d53bfc03d30 [60919.350320] perf: interrupt took too long (3063 > 2500), lowering kernel.perf_event_max_sample_rate to 65000 [60919.361148] R13: 0000000000000001 R14: ffff8d53bcf90000 R15: 0000000000000000 [60919.361151] FS: 00007fb78b6b3600(0000) GS:ffff8d53bfc00000(0000) knlGS:0000000000000000 [60919.400812] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [60919.413730] CR2: 0000000000000010 CR3: 0000003e3f784006 CR4: 00000000007606e0 [60919.426479] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [60919.439339] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [60919.451608] PKRU: 55555554 [60919.463622] Call Trace: [60919.475617] <IRQ> [60919.487122] ? update_load_avg+0x89/0x5d0 [60919.498478] ? update_load_avg+0x89/0x5d0 [60919.509822] ? account_entity_enqueue+0xc5/0xf0 [60919.520709] security_sock_rcv_skb+0x2a/0x40 [60919.531413] sk_filter_trim_cap+0x47/0x1b0 [60919.542178] ? kmem_cache_alloc+0x38/0x1b0 [60919.552444] sock_queue_rcv_skb+0x17/0x30 [60919.562477] raw_rcv+0x110/0x190 [can_raw] [60919.572539] can_rcv_filter+0xbc/0x1b0 [can] [60919.582173] can_receive+0x6b/0xb0 [can] [60919.591595] can_rcv+0x31/0x70 [can] [60919.600783] __netif_receive_skb_one_core+0x5a/0x80 [60919.609864] process_backlog+0x9b/0x150 [60919.618691] net_rx_action+0x156/0x400 [60919.627310] ? sched_clock_cpu+0xc/0xa0 [60919.635714] __do_softirq+0xe8/0x2e9 [60919.644161] do_softirq_own_stack+0x2a/0x40 [60919.652154] </IRQ> [60919.659899] do_softirq.part.17+0x4f/0x60 [60919.667475] __local_bh_enable_ip+0x60/0x70 [60919.675089] __dev_queue_xmit+0x539/0x920 [60919.682267] ? finish_wait+0x80/0x80 [60919.689218] ? finish_wait+0x80/0x80 [60919.695886] ? sock_alloc_send_pskb+0x211/0x230 [60919.702395] ? can_send+0xe5/0x1f0 [can] [60919.708882] can_send+0xe5/0x1f0 [can] [60919.715037] raw_sendmsg+0x16d/0x268 [can_raw] It's because raw_setsockopt() concurrently with unregister_netdevice_many(). Concurrent scenario as following. cpu0 cpu1 raw_bind raw_setsockopt unregister_netdevice_many unlist_netdevice dev_get_by_index raw_notifier raw_enable_filters ...... can_rx_register can_rcv_list_find(..., net->can.rx_alldev_list) ...... sock_close raw_release(sock_a) ...... can_receive can_rcv_filter(net->can.rx_alldev_list, ...) raw_rcv(skb, sock_a) BUG After unlist_netdevice(), dev_get_by_index() return NULL in raw_setsockopt(). Function raw_enable_filters() will add sock and can_filter to net->can.rx_alldev_list. Then the sock is closed. Followed by, we sock_sendmsg() to a new vcan device use the same can_filter. Protocol stack match the old receiver whose sock has been released on net->can.rx_alldev_list in can_rcv_filter(). Function raw_rcv() uses the freed sock. UAF BUG is triggered. We can find that the key issue is that net_device has not been protected in raw_setsockopt(). Use rtnl_lock to protect net_device in raw_setsockopt(). Fixes: c18ce101f2e4 ("[CAN]: Add raw protocol") Link: https://lore.kernel.org/r/[email protected] Cc: linux-stable <[email protected]> Signed-off-by: Ziyang Xuan <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2021-07-24arm64: dts: imx8mp: remove fallback compatible string for FlexCANJoakim Zhang1-2/+2
FlexCAN on i.MX8MP is not derived from i.MX6Q, instead reuses from i.MX8QM with extra ECC added and default is enabled, so that the FlexCAN would be put into freeze mode without FLEXCAN_QUIRK_DISABLE_MECR quirk. This patch removes "fsl,imx6q-flexcan" fallback compatible string since it's not compatible with the i.MX6Q. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joakim Zhang <[email protected]> Reviewed-by: Fabio Estevam <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2021-07-23riscv: __asm_copy_to-from_user: Fix: Typos in commentsAkira Tsukamoto1-9/+9
Fixing typos and grammar mistakes and using more intuitive label name. Signed-off-by: Akira Tsukamoto <[email protected]> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <[email protected]>
2021-07-23riscv: __asm_copy_to-from_user: Remove unnecessary size checkAkira Tsukamoto1-1/+0
Clean up: The size of 0 will be evaluated in the next step. Not required here. Signed-off-by: Akira Tsukamoto <[email protected]> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <[email protected]>
2021-07-23riscv: __asm_copy_to-from_user: Fix: fail on RV32Akira Tsukamoto1-1/+1
Had a bug when converting bytes to bits when the cpu was rv32. The a3 contains the number of bytes and multiple of 8 would be the bits. The LGREG is holding 2 for RV32 and 3 for RV32, so to achieve multiple of 8 it must always be constant 3. The 2 was mistakenly used for rv32. Signed-off-by: Akira Tsukamoto <[email protected]> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <[email protected]>
2021-07-23riscv: __asm_copy_to-from_user: Fix: overrun copyAkira Tsukamoto1-3/+3
There were two causes for the overrun memory access. The threshold size was too small. The aligning dst require one SZREG and unrolling word copy requires 8*SZREG, total have to be at least 9*SZREG. Inside the unrolling copy, the subtracting -(8*SZREG-1) would make iteration happening one extra loop. Proper value is -(8*SZREG). Signed-off-by: Akira Tsukamoto <[email protected]> Fixes: ca6eaaa210de ("riscv: __asm_copy_to-from_user: Optimize unaligned memory access and pipeline stall") Signed-off-by: Palmer Dabbelt <[email protected]>
2021-07-23hugetlbfs: fix mount mode command line processingMike Kravetz1-1/+1
In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of the mount mode string was changed from match_octal() to fsparam_u32. This changed existing behavior as match_octal does not require octal values to have a '0' prefix, but fsparam_u32 does. Use fsparam_u32oct which provides the same behavior as match_octal. Link: https://lkml.kernel.org/r/[email protected] Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context") Signed-off-by: Mike Kravetz <[email protected]> Reported-by: Dennis Camera <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Cc: David Howells <[email protected]> Cc: Al Viro <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23mm: fix the deadlock in finish_fault()Qi Zheng1-1/+10
Commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback") fix the following ABBA deadlock by pre-allocating the pte page table without holding the page lock. lock_page(A) SetPageWriteback(A) unlock_page(A) lock_page(B) lock_page(B) pte_alloc_one shrink_page_list wait_on_page_writeback(A) SetPageWriteback(B) unlock_page(B) # flush A, B to clear the writeback Commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") reworked the relevant code but ignored this race. This will cause the deadlock above to appear again, so fix it. Link: https://lkml.kernel.org/r/[email protected] Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Qi Zheng <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Muchun Song <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23mm: mmap_lock: fix disabling preemption directlyMuchun Song1-2/+2
Commit 832b50725373 ("mm: mmap_lock: use local locks instead of disabling preemption") fixed a bug by using local locks. But commit d01079f3d0c0 ("mm/mmap_lock: remove dead code for !CONFIG_TRACING configurations") changed those lines back to the original version. I guess it was introduced by fixing conflicts. Link: https://lkml.kernel.org/r/[email protected] Fixes: d01079f3d0c0 ("mm/mmap_lock: remove dead code for !CONFIG_TRACING configurations") Signed-off-by: Muchun Song <[email protected]> Acked-by: Mel Gorman <[email protected]> Reviewed-by: Yang Shi <[email protected]> Reviewed-by: Pankaj Gupta <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23mm/secretmem: wire up ->set_page_dirtyMike Rapoport1-0/+1
Make secretmem up to date with the changes done in commit 0af573780b0b ("mm: require ->set_page_dirty to be explicitly wired up") so that unconditional call to this method won't cause crashes. Link: https://lkml.kernel.org/r/[email protected] Fixes: 0af573780b0b ("mm: require ->set_page_dirty to be explicitly wired up") Signed-off-by: Mike Rapoport <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23writeback, cgroup: do not reparent dax inodesRoman Gushchin1-0/+3
The inode switching code is not suited for dax inodes. An attempt to switch a dax inode to a parent writeback structure (as a part of a writeback cleanup procedure) results in a panic like this: run fstests generic/270 at 2021-07-15 05:54:02 XFS (pmem0p2): EXPERIMENTAL big timestamp feature in use. Use at your own risk! XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use at your own risk XFS (pmem0p2): EXPERIMENTAL inode btree counters feature in use. Use at your own risk! XFS (pmem0p2): Mounting V5 Filesystem XFS (pmem0p2): Ending clean mount XFS (pmem0p2): Quotacheck needed: Please wait. XFS (pmem0p2): Quotacheck: Done. XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks) XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks) XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks) BUG: unable to handle page fault for address: 0000000005b0f669 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 13 PID: 10479 Comm: kworker/13:16 Not tainted 5.14.0-rc1-master-8096acd7442e+ #8 Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 09/13/2016 Workqueue: inode_switch_wbs inode_switch_wbs_work_fn RIP: 0010:inode_do_switch_wbs+0xaf/0x470 Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85 RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002 RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0 RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228 R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130 R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0 FS: 0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0 Call Trace: inode_switch_wbs_work_fn+0xb6/0x2a0 process_one_work+0x1e6/0x380 worker_thread+0x53/0x3d0 kthread+0x10f/0x130 ret_from_fork+0x22/0x30 Modules linked in: xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables nfnetlink bridge stp llc rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ipmi_ssif kvm mgag200 i2c_algo_bit iTCO_wdt irqbypass drm_kms_helper iTCO_vendor_support acpi_ipmi rapl syscopyarea sysfillrect intel_cstate ipmi_si sysimgblt ioatdma dax_pmem_compat fb_sys_fops ipmi_devintf device_dax i2c_i801 pcspkr intel_uncore hpilo nd_pmem cec dax_pmem_core dca i2c_smbus acpi_tad lpc_ich ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sd_mod t10_pi crct10dif_pclmul crc32_pclmul crc32c_intel tg3 ghash_clmulni_intel serio_raw hpsa hpwdt scsi_transport_sas wmi dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000005b0f669 ---[ end trace ed2105faff8384f3 ]--- RIP: 0010:inode_do_switch_wbs+0xaf/0x470 Code: 00 30 0f 85 c1 03 00 00 0f 1f 44 00 00 31 d2 48 c7 c6 ff ff ff ff 48 8d 7c 24 08 e8 eb 49 1a 00 48 85 c0 74 4a bb ff ff ff ff <48> 8b 50 08 48 8d 4a ff 83 e2 01 48 0f 45 c1 48 8b 00 a8 08 0f 85 RSP: 0018:ffff9c66691abdc8 EFLAGS: 00010002 RAX: 0000000005b0f661 RBX: 00000000ffffffff RCX: ffff89e6a21382b0 RDX: 0000000000000001 RSI: ffff89e350230248 RDI: ffffffffffffffff RBP: ffff89e681d19400 R08: 0000000000000000 R09: 0000000000000228 R10: ffffffffffffffff R11: ffffffffffffffc0 R12: ffff89e6a2138130 R13: ffff89e316af7400 R14: ffff89e316af6e78 R15: ffff89e6a21382b0 FS: 0000000000000000(0000) GS:ffff89ee5fb40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005b0f669 CR3: 0000000cb2410004 CR4: 00000000001706e0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x15200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]--- The crash happens on an attempt to iterate over attached pagecache pages and check the dirty flag: a dax inode's xarray contains pfn's instead of generic struct page pointers. This happens for DAX and not for other kinds of non-page entries in the inodes because it's a tagged iteration, and shadow/swap entries are never tagged; only DAX entries get tagged. Fix the problem by bailing out (with the false return value) of inode_prepare_sbs_switch() if a dax inode is passed. [[email protected]: changelog addition] Link: https://lkml.kernel.org/r/[email protected] Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes") Signed-off-by: Roman Gushchin <[email protected]> Reported-by: Murphy Zhou <[email protected]> Reported-by: Darrick J. Wong <[email protected]> Tested-by: Darrick J. Wong <[email protected]> Tested-by: Murphy Zhou <[email protected]> Acked-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Jan Kara <[email protected]> Cc: Dave Chinner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23writeback, cgroup: remove wb from offline list before releasing refcntRoman Gushchin1-1/+1
Boyang reported that the commit c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes") causes the kernel to crash while running xfstests generic/256 on ext4 on aarch64 and ppc64le. run fstests generic/256 at 2021-07-12 05:41:40 EXT4-fs (vda3): mounted filesystem with ordered data mode. Opts: . Quota mode: none. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x96000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005 CM = 0, WnR = 0 user pgtable: 64k pages, 48-bit VAs, pgdp=00000000b0502000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 Internal error: Oops: 96000005 [#1] SMP Modules linked in: dm_flakey dm_snapshot dm_bufio dm_zero dm_mod loop tls rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc ext4 vfat fat mbcache jbd2 drm fuse xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 sha1_ce virtio_blk virtio_net net_failover virtio_console failover virtio_mmio aes_neon_bs [last unloaded: scsi_debug] CPU: 0 PID: 408468 Comm: kworker/u8:5 Tainted: G X --------- --- 5.14.0-0.rc1.15.bx.el9.aarch64 #1 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 Workqueue: events_unbound cleanup_offline_cgwbs_workfn pstate: 004000c5 (nzcv daIF +PAN -UAO -TCO BTYPE=--) pc : cleanup_offline_cgwbs_workfn+0x320/0x394 lr : cleanup_offline_cgwbs_workfn+0xe0/0x394 sp : ffff80001554fd10 x29: ffff80001554fd10 x28: 0000000000000000 x27: 0000000000000001 x26: 0000000000000000 x25: 00000000000000e0 x24: ffffd2a2fbe671a8 x23: ffff80001554fd88 x22: ffffd2a2fbe67198 x21: ffffd2a2fc25a730 x20: ffff210412bc3000 x19: ffff210412bc3280 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000040 x11: ffff210481572238 x10: ffff21048157223a x9 : ffffd2a2fa276c60 x8 : ffff210484106b60 x7 : 0000000000000000 x6 : 000000000007d18a x5 : ffff210416a86400 x4 : ffff210412bc0280 x3 : 0000000000000000 x2 : ffff80001554fd88 x1 : ffff210412bc0280 x0 : 0000000000000003 Call trace: cleanup_offline_cgwbs_workfn+0x320/0x394 process_one_work+0x1f4/0x4b0 worker_thread+0x184/0x540 kthread+0x114/0x120 ret_from_fork+0x10/0x18 Code: d63f0020 97f99963 17ffffa6 f8588263 (f9400061) ---[ end trace e250fe289272792a ]--- Kernel panic - not syncing: Oops: Fatal exception SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 0-2 Kernel Offset: 0x52a2e9fa0000 from 0xffff800010000000 PHYS_OFFSET: 0xfff0defca0000000 CPU features: 0x00200251,23200840 Memory Limit: none ---[ end Kernel panic - not syncing: Oops: Fatal exception ]--- The problem happens when cgwb_release_workfn() races with cleanup_offline_cgwbs_workfn(): wb_tryget() in cleanup_offline_cgwbs_workfn() can be called after percpu_ref_exit() is cgwb_release_workfn(), which is basically a use-after-free error. Fix the problem by making removing the writeback structure from the offline list before releasing the percpu reference counter. It will guarantee that cleanup_offline_cgwbs_workfn() will not see and not access writeback structures which are about to be released. Link: https://lkml.kernel.org/r/[email protected] Fixes: c22d70a162d3 ("writeback, cgroup: release dying cgwbs by switching attached inodes") Signed-off-by: Roman Gushchin <[email protected]> Reported-by: Boyang Xue <[email protected]> Suggested-by: Jan Kara <[email protected]> Tested-by: Darrick J. Wong <[email protected]> Cc: Will Deacon <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Murphy Zhou <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regionsMike Rapoport2-3/+4
Commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") didn't take into account that when there is movable_node parameter in the kernel command line, for_each_mem_range() would skip ranges marked with MEMBLOCK_HOTPLUG. The page table setup code in POWER uses for_each_mem_range() to create the linear mapping of the physical memory and since the regions marked as MEMORY_HOTPLUG are skipped, they never make it to the linear map. A later access to the memory in those ranges will fail: BUG: Unable to handle kernel data access on write at 0xc000000400000000 Faulting instruction address: 0xc00000000008a3c0 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 53 Comm: kworker/u2:0 Not tainted 5.13.0 #7 NIP: c00000000008a3c0 LR: c0000000003c1ed8 CTR: 0000000000000040 REGS: c000000008a57770 TRAP: 0300 Not tainted (5.13.0) MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 84222202 XER: 20040000 CFAR: c0000000003c1ed4 DAR: c000000400000000 DSISR: 42000000 IRQMASK: 0 GPR00: c0000000003c1ed8 c000000008a57a10 c0000000019da700 c000000400000000 GPR04: 0000000000000280 0000000000000180 0000000000000400 0000000000000200 GPR08: 0000000000000100 0000000000000080 0000000000000040 0000000000000300 GPR12: 0000000000000380 c000000001bc0000 c0000000001660c8 c000000006337e00 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000040000000 0000000020000000 c000000001a81990 c000000008c30000 GPR24: c000000008c20000 c000000001a81998 000fffffffff0000 c000000001a819a0 GPR28: c000000001a81908 c00c000001000000 c000000008c40000 c000000008a64680 NIP clear_user_page+0x50/0x80 LR __handle_mm_fault+0xc88/0x1910 Call Trace: __handle_mm_fault+0xc44/0x1910 (unreliable) handle_mm_fault+0x130/0x2a0 __get_user_pages+0x248/0x610 __get_user_pages_remote+0x12c/0x3e0 get_arg_page+0x54/0xf0 copy_string_kernel+0x11c/0x210 kernel_execve+0x16c/0x220 call_usermodehelper_exec_async+0x1b0/0x2f0 ret_from_kernel_thread+0x5c/0x70 Instruction dump: 79280fa4 79271764 79261f24 794ae8e2 7ca94214 7d683a14 7c893a14 7d893050 7d4903a6 60000000 60000000 60000000 <7c001fec> 7c091fec 7c081fec 7c051fec ---[ end trace 490b8c67e6075e09 ]--- Making for_each_mem_range() include MEMBLOCK_HOTPLUG regions in the traversal fixes this issue. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976100 Link: https://lkml.kernel.org/r/[email protected] Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") Signed-off-by: Mike Rapoport <[email protected]> Tested-by: Greg Kurz <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: <[email protected]> [5.10+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interactionSergei Trofimovich1-13/+16
To reproduce the failure we need the following system: - kernel command: page_poison=1 init_on_free=0 init_on_alloc=0 - kernel config: * CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y * CONFIG_INIT_ON_FREE_DEFAULT_ON=y * CONFIG_PAGE_POISONING=y Resulting in: 0000000085629bdd: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000000022861832: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000c597f5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ CPU: 11 PID: 15195 Comm: bash Kdump: loaded Tainted: G U O 5.13.1-gentoo-x86_64 #1 Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2801 01/13/2021 Call Trace: dump_stack+0x64/0x7c __kernel_unpoison_pages.cold+0x48/0x84 post_alloc_hook+0x60/0xa0 get_page_from_freelist+0xdb8/0x1000 __alloc_pages+0x163/0x2b0 __get_free_pages+0xc/0x30 pgd_alloc+0x2e/0x1a0 mm_init+0x185/0x270 dup_mm+0x6b/0x4f0 copy_process+0x190d/0x1b10 kernel_clone+0xba/0x3b0 __do_sys_clone+0x8f/0xb0 do_syscall_64+0x68/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Before commit 51cba1ebc60d ("init_on_alloc: Optimize static branches") init_on_alloc never enabled static branch by default. It could only be enabed explicitly by init_mem_debugging_and_hardening(). But after commit 51cba1ebc60d, a static branch could already be enabled by default. There was no code to ever disable it. That caused page_poison=1 / init_on_free=1 conflict. This change extends init_mem_debugging_and_hardening() to also disable static branch disabling. Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Fixes: 51cba1ebc60d ("init_on_alloc: Optimize static branches") Signed-off-by: Sergei Trofimovich <[email protected]> Signed-off-by: Kees Cook <[email protected]> Co-developed-by: Kees Cook <[email protected]> Reported-by: Mikhail Morfikov <[email protected]> Reported-by: <[email protected]> Tested-by: <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23mm: use kmap_local_page in memzero_pageChristoph Hellwig1-2/+2
The commit message introducing the global memzero_page explicitly mentions switching to kmap_local_page in the commit log but doesn't actually do that. Link: https://lkml.kernel.org/r/[email protected] Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()Christoph Hellwig1-0/+2
memcpy_to_page and memzero_page can write to arbitrary pages, which could be in the page cache or in high memory, so call flush_kernel_dcache_pages to flush the dcache. This is a problem when using these helpers on dcache challeneged architectures. Right now there are just a few users, chances are no one used the PC floppy driver, the aha1542 driver for an ISA SCSI HBA, and a few advanced and optional btrfs and ext4 features on those platforms yet since the conversion. Link: https://lkml.kernel.org/r/[email protected] Fixes: bb90d4bc7b6a ("mm/highmem: Lift memcpy_[to|from]_page to core") Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h") Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: Chaitanya Kulkarni <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23kfence: skip all GFP_ZONEMASK allocationsAlexander Potapenko1-0/+9
Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot be fulfilled by KFENCE, because KFENCE memory pool is located in a zone different from the requested one. Because callers of kmem_cache_alloc() may actually rely on the allocation to reside in the requested zone (e.g. memory allocations done with __GFP_DMA must be DMAable), skip all allocations done with GFP_ZONEMASK and/or respective SLAB flags (SLAB_CACHE_DMA and SLAB_CACHE_DMA32). Link: https://lkml.kernel.org/r/[email protected] Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure") Signed-off-by: Alexander Potapenko <[email protected]> Reviewed-by: Marco Elver <[email protected]> Acked-by: Souptick Joarder <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Souptick Joarder <[email protected]> Cc: <[email protected]> [5.12+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23kfence: move the size check to the beginning of __kfence_alloc()Alexander Potapenko1-3/+7
Check the allocation size before toggling kfence_allocation_gate. This way allocations that can't be served by KFENCE will not result in waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating anything. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexander Potapenko <[email protected]> Suggested-by: Marco Elver <[email protected]> Reviewed-by: Marco Elver <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: <[email protected]> [5.12+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23kfence: defer kfence_test_init to ensure that kunit debugfs is createdWeizhao Ouyang1-1/+1
kfence_test_init and kunit_init both use the same level late_initcall, which means if kfence_test_init linked ahead of kunit_init, kfence_test_init will get a NULL debugfs_rootdir as parent dentry, then kfence_test_init and kfence_debugfs_init both create a debugfs node named "kfence" under debugfs_mount->mnt_root, and it will throw out "debugfs: Directory 'kfence' with parent '/' already present!" with EEXIST. So kfence_test_init should be deferred. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Weizhao Ouyang <[email protected]> Tested-by: Marco Elver <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23selftest: use mmap instead of posix_memalign to allocate memoryPeter Collingbourne1-2/+4
This test passes pointers obtained from anon_allocate_area to the userfaultfd and mremap APIs. This causes a problem if the system allocator returns tagged pointers because with the tagged address ABI the kernel rejects tagged addresses passed to these APIs, which would end up causing the test to fail. To make this test compatible with such system allocators, stop using the system allocator to allocate memory in anon_allocate_area, and instead just use mmap. Link: https://lkml.kernel.org/r/[email protected] Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b8fc241 Fixes: c47174fc362a ("userfaultfd: selftest") Co-developed-by: Lokesh Gidra <[email protected]> Signed-off-by: Lokesh Gidra <[email protected]> Signed-off-by: Peter Collingbourne <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Dave Martin <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Alistair Delva <[email protected]> Cc: William McVicker <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Mitch Phillips <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: <[email protected]> [5.4] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23userfaultfd: do not untag user pointersPeter Collingbourne2-22/+30
Patch series "userfaultfd: do not untag user pointers", v5. If a user program uses userfaultfd on ranges of heap memory, it may end up passing a tagged pointer to the kernel in the range.start field of the UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable allocator, or on Android if using the Tagged Pointers feature for MTE readiness [1]. When a fault subsequently occurs, the tag is stripped from the fault address returned to the application in the fault.address field of struct uffd_msg. However, from the application's perspective, the tagged address *is* the memory address, so if the application is unaware of memory tags, it may get confused by receiving an address that is, from its point of view, outside of the bounds of the allocation. We observed this behavior in the kselftest for userfaultfd [2] but other applications could have the same problem. Address this by not untagging pointers passed to the userfaultfd ioctls. Instead, let the system call fail. Also change the kselftest to use mmap so that it doesn't encounter this problem. [1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c This patch (of 2): Do not untag pointers passed to the userfaultfd ioctls. Instead, let the system call fail. This will provide an early indication of problems with tag-unaware userspace code instead of letting the code get confused later, and is consistent with how we decided to handle brk/mmap/mremap in commit dcde237319e6 ("mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()"), as well as being consistent with the existing tagged address ABI documentation relating to how ioctl arguments are handled. The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag user pointers") plus some fixups to some additional calls to validate_range that have appeared since then. [1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a25501b Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI") Signed-off-by: Peter Collingbourne <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Cc: Alistair Delva <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Dave Martin <[email protected]> Cc: Evgenii Stepanov <[email protected]> Cc: Lokesh Gidra <[email protected]> Cc: Mitch Phillips <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Will Deacon <[email protected]> Cc: William McVicker <[email protected]> Cc: <[email protected]> [5.4] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-07-23riscv: stacktrace: pin the task's stack in get_wchanJisheng Zhang1-1/+5
Pin the task's stack before calling walk_stackframe() in get_wchan(). This can fix the panic as reported by Andreas when CONFIG_VMAP_STACK=y: [ 65.609696] Unable to handle kernel paging request at virtual address ffffffd0003bbde8 [ 65.610460] Oops [#1] [ 65.610626] Modules linked in: virtio_blk virtio_mmio rtc_goldfish btrfs blake2b_generic libcrc32c xor raid6_pq sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua efivarfs [ 65.611670] CPU: 2 PID: 1 Comm: systemd Not tainted 5.14.0-rc1-1.g34fe32a-default #1 openSUSE Tumbleweed (unreleased) c62f7109153e5a0897ee58ba52393ad99b070fd2 [ 65.612334] Hardware name: riscv-virtio,qemu (DT) [ 65.613008] epc : get_wchan+0x5c/0x88 [ 65.613334] ra : get_wchan+0x42/0x88 [ 65.613625] epc : ffffffff800048a4 ra : ffffffff8000488a sp : ffffffd00021bb90 [ 65.614008] gp : ffffffff817709f8 tp : ffffffe07fe91b80 t0 : 00000000000001f8 [ 65.614411] t1 : 0000000000020000 t2 : 0000000000000000 s0 : ffffffd00021bbd0 [ 65.614818] s1 : ffffffd0003bbdf0 a0 : 0000000000000001 a1 : 0000000000000002 [ 65.615237] a2 : ffffffff81618008 a3 : 0000000000000000 a4 : 0000000000000000 [ 65.615637] a5 : ffffffd0003bc000 a6 : 0000000000000002 a7 : ffffffe27d370000 [ 65.616022] s2 : ffffffd0003bbd90 s3 : ffffffff8071a81e s4 : 0000000000003fff [ 65.616407] s5 : ffffffffffffc000 s6 : 0000000000000000 s7 : ffffffff81618008 [ 65.616845] s8 : 0000000000000001 s9 : 0000000180000040 s10: 0000000000000000 [ 65.617248] s11: 000000000000016b t3 : 000000ff00000000 t4 : 0c6aec92de5e3fd7 [ 65.617672] t5 : fff78f60608fcfff t6 : 0000000000000078 [ 65.618088] status: 0000000000000120 badaddr: ffffffd0003bbde8 cause: 000000000000000d [ 65.618621] [<ffffffff800048a4>] get_wchan+0x5c/0x88 [ 65.619008] [<ffffffff8022da88>] do_task_stat+0x7a2/0xa46 [ 65.619325] [<ffffffff8022e87e>] proc_tgid_stat+0xe/0x16 [ 65.619637] [<ffffffff80227dd6>] proc_single_show+0x46/0x96 [ 65.619979] [<ffffffff801ccb1e>] seq_read_iter+0x190/0x31e [ 65.620341] [<ffffffff801ccd70>] seq_read+0xc4/0x104 [ 65.620633] [<ffffffff801a6bfe>] vfs_read+0x6a/0x112 [ 65.620922] [<ffffffff801a701c>] ksys_read+0x54/0xbe [ 65.621206] [<ffffffff801a7094>] sys_read+0xe/0x16 [ 65.621474] [<ffffffff8000303e>] ret_from_syscall+0x0/0x2 [ 65.622169] ---[ end trace f24856ed2b8789c5 ]--- [ 65.622832] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b Signed-off-by: Jisheng Zhang <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2021-07-23io_uring: explicitly catch any illegal async queue attemptJens Axboe2-1/+17
Catch an illegal case to queue async from an unrelated task that got the ring fd passed to it. This should not be possible to hit, but better be proactive and catch it explicitly. io-wq is extended to check for early IO_WQ_WORK_CANCEL being set on a work item as well, so it can run the request through the normal cancelation path. Signed-off-by: Jens Axboe <[email protected]>
2021-07-23io_uring: never attempt iopoll reissue from release pathJens Axboe1-7/+7
There are two reasons why this shouldn't be done: 1) Ring is exiting, and we're canceling requests anyway. Any request should be canceled anyway. In theory, this could iterate for a number of times if someone else is also driving the target block queue into request starvation, however the likelihood of this happening is miniscule. 2) If the original task decided to pass the ring to another task, then we don't want to be reissuing from this context as it may be an unrelated task or context. No assumptions should be made about the context in which ->release() is run. This can only happen for pure read/write, and we'll get -EFAULT on them anyway. Link: https://lore.kernel.org/io-uring/[email protected]/ Reported-by: Al Viro <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2021-07-23Merge branch 'ionic-fixes'David S. Miller4-127/+132
Shannon Nelson says: ==================== ionic: bug fixes Fix a thread race in rx_mode, remove unnecessary log message, fix dynamic coalescing issues, and count all csum_none cases. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-07-23ionic: count csum_none when offload enabledShannon Nelson1-6/+5
Be sure to count the csum_none cases when csum offload is enabled. Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23ionic: fix up dim accounting for tx and rxShannon Nelson1-7/+21
We need to count the correct Tx and/or Rx packets for dynamic interrupt moderation, depending on which we're processing on the queue interrupt. Fixes: 04a834592bf5 ("ionic: dynamic interrupt moderation") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23ionic: remove intr coalesce update from napiShannon Nelson2-5/+13
Move the interrupt coalesce value update out of the napi thread and into the dim_work thread and set it only when it has actually changed. Fixes: 04a834592bf5 ("ionic: dynamic interrupt moderation") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23ionic: catch no ptp support earlierShannon Nelson2-8/+9
If PTP configuration is attempted on ports that don't support it, such as VF ports, the driver will return an error status -95, or EOPNOSUPP and print an error message enp98s0: hwstamp set failed: -95 Because some daemons can retry every few seconds, this can end up filling the dmesg log and pushing out other more useful messages. We can catch this issue earlier in our handling and return the error without a log message. Fixes: 829600ce5e4e ("ionic: add ts_config replay") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23ionic: make all rx_mode work threadsafeShannon Nelson2-102/+85
Move the bulk of the code from ionic_set_rx_mode(), which can be called from atomic context, into ionic_lif_rx_mode() which is a safe context. A call from the stack will get pushed off into a work thread, but it is also possible to simultaneously have a call driven by a queue reconfig request from an ethtool command or fw recovery event. We add a mutex around the rx_mode work to be sure they don't collide. Fixes: 81dbc24147f9 ("ionic: change set_rx_mode from_ndo to can_sleep") Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23Merge branch '40GbE' of ↵David S. Miller4-25/+94
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-07-23 This series contains updates to i40e driver only. Arkadiusz corrects the order of calls for disabling queues to resolve a false error message and adds a better message to the user when transitioning FW LLDP back on while the firmware is still processing the off request. Lukasz adds additional information regarding possible incorrect cable use when a PHY type error occurs. Jedrzej adds ndo_select_queue support to resolve incorrect queue selection when SW DCB is used and adds a warning when there are not enough queues for desired TC configuration. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-07-23Merge tag 'for-5.14-rc2-tag' of ↵Linus Torvalds12-47/+79
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few fixes and one patch to help some block layer API cleanups: - skip missing device when running fstrim - fix unpersisted i_size on fsync after expanding truncate - fix lock inversion problem when doing qgroup extent tracing - replace bdgrab/bdput usage, replace gendisk by block_device" * tag 'for-5.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: store a block_device in struct btrfs_ordered_extent btrfs: fix lock inversion problem when doing qgroup extent tracing btrfs: check for missing device in btrfs_trim_fs btrfs: fix unpersisted i_size on fsync after expanding truncate
2021-07-23Merge tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-clientLinus Torvalds2-21/+14
Pull ceph fixes from Ilya Dryomov: "A subtle deadlock on lock_rwsem (marked for stable) and rbd fixes for a -rc1 regression. Also included a rare WARN condition tweak" * tag 'ceph-for-5.14-rc3' of git://github.com/ceph/ceph-client: rbd: resurrect setting of disk->private_data in rbd_init_disk() ceph: don't WARN if we're still opening a session to an MDS rbd: don't hold lock_rwsem while running_list is being drained rbd: always kick acquire on "acquired" and "released" notifications
2021-07-23Merge tag 'trace-v5.14-rc2' of ↵Linus Torvalds8-20/+53
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fix deadloop in ring buffer because of using stale "read" variable - Fix synthetic event use of field_pos as boolean and not an index - Fixed histogram special var "cpu" overriding event fields called "cpu" - Cleaned up error prone logic in alloc_synth_event() - Removed call to synchronize_rcu_tasks_rude() when not needed - Removed redundant initialization of a local variable "ret" - Fixed kernel crash when updating tracepoint callbacks of different priorities. * tag 'trace-v5.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracepoints: Update static_call before tp_funcs when adding a tracepoint ftrace: Remove redundant initialization of variable ret ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary tracing: Clean up alloc_synth_event() tracing/histogram: Rename "cpu" to "common_cpu" tracing: Synthetic event field_pos is an index not a boolean tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.
2021-07-23Merge tag 'm68k-for-v5.14-tag2' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fix from Geert Uytterhoeven: - Fix a Mac defconfig regression due to the IDE -> ATA switch * tag 'm68k-for-v5.14-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: MAC should select HAVE_PATA_PLATFORM
2021-07-23Merge tag 'acpi-5.14-rc3' of ↵Linus Torvalds5-15/+6
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix a recently broken Kconfig dependency and ACPI device reference counting in an iterator macro. Specifics: - Fix recently broken Kconfig dependency for the ACPI table override via built-in initrd (Robert Richter) - Fix ACPI device reference counting in the for_each_acpi_dev_match() helper macro to avoid use-after-free (Andy Shevchenko)" * tag 'acpi-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: utils: Fix reference counting in for_each_acpi_dev_match() ACPI: Kconfig: Fix table override from built-in initrd
2021-07-23Merge tag 'driver-core-5.14-rc3' of ↵Linus Torvalds2-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two small driver core fixes to resolve some reported problems for 5.14-rc3. They include: - aux bus memory leak fix - unneeded warning message removed when removing a device link. Both have been in linux-next with no reported problems" * tag 'driver-core-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: driver core: Prevent warning when removing a device link from unregistered consumer driver core: auxiliary bus: Fix memory leak when driver_register() fail
2021-07-23Merge tag 'char-misc-5.14-rc3' of ↵Linus Torvalds4-16/+58
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are some small char/misc driver fixes for 5.14-rc3. Included in here are: - MAINTAINERS file updates for two changes in different driver subsystems - mhi bus bugfixes - nds32 bugfix that resolves a reported problem All have been in linux-next with no reported problems" * tag 'char-misc-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: nds32: fix up stack guard gap MAINTAINERS: Change ACRN HSM driver maintainer MAINTAINERS: Update for VMCI driver bus: mhi: pci_generic: Fix inbound IPCR channel bus: mhi: core: Validate channel ID when processing command completions bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean
2021-07-23Merge tag 'usb-5.14-rc3' of ↵Linus Torvalds36-127/+264
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB fixes for 5.14-rc3 to resolve a bunch of tiny problems reported. Included in here are: - dtsi revert to resolve a problem which broke android systems that relied on the dts name to find the USB controller device. People are still working out the "real" solution for this, but for now the revert is needed. - core USB fix for pipe calculation found by syzbot - typec fixes - gadget driver fixes - new usb-serial device ids - new USB quirks - xhci fixes - usb hub fixes for power management issues reported - other tiny fixes All have been in linux-next with no reported problems" * tag 'usb-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (27 commits) USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick Revert "USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem" usb: cdc-wdm: fix build error when CONFIG_WWAN_CORE is not set Revert "arm64: dts: qcom: Harmonize DWC USB3 DT nodes name" usb: dwc2: gadget: Fix sending zero length packet in DDMA mode. usb: dwc2: Skip clock gating on Samsung SoCs usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop() usb: dwc2: gadget: Fix GOUTNAK flow for Slave mode. usb: phy: Fix page fault from usb_phy_uevent usb: xhci: avoid renesas_usb_fw.mem when it's unusable usb: gadget: u_serial: remove WARN_ON on null port usb: dwc3: avoid NULL access of usb_gadget_driver usb: max-3421: Prevent corruption of freed memory usb: gadget: Fix Unbalanced pm_runtime_enable in tegra_xudc_probe MAINTAINERS: repair reference in USB IP DRIVER FOR HISILICON KIRIN 970 usb: typec: stusb160x: Don't block probing of consumer of "connector" nodes usb: typec: stusb160x: register role switch before interrupt registration USB: usb-storage: Add LaCie Rugged USB3-FW to IGNORE_UAS usb: ehci: Prevent missed ehci interrupts with edge-triggered MSI usb: hub: Disable USB 3 device initiated lpm if exit latency is too high ...
2021-07-23Merge tag 'sound-5.14-rc3' of ↵Linus Torvalds23-96/+195
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes, mostly covering device-specific regressions and bugs over ASoC, HD-audio and USB-audio, while the ALSA PCM core received a few additional fixes for the possible (new and old) regressions" * tag 'sound-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (29 commits) ALSA: usb-audio: Add registration quirk for JBL Quantum headsets ALSA: hda/hdmi: Add quirk to force pin connectivity on NUC10 ALSA: pcm: Fix mmap without buffer preallocation ALSA: pcm: Fix mmap capability check ALSA: hda: intel-dsp-cfg: add missing ElkhartLake PCI ID ASoC: ti: j721e-evm: Check for not initialized parent_clk_id ASoC: ti: j721e-evm: Fix unbalanced domain activity tracking during startup ALSA: hda/realtek: Fix pop noise and 2 Front Mic issues on a machine ALSA: hdmi: Expose all pins on MSI MS-7C94 board ALSA: sb: Fix potential ABBA deadlock in CSP driver ASoC: rt5682: Fix the issue of garbled recording after powerd_dbus_suspend ASoC: amd: reverse stop sequence for stoneyridge platform ASoC: soc-pcm: add a flag to reverse the stop sequence ASoC: codecs: wcd938x: setup irq during component bind ASoC: dt-bindings: renesas: rsnd: Fix incorrect 'port' regex schema ALSA: usb-audio: Add missing proc text entry for BESPOKEN type ASoC: codecs: wcd938x: make sdw dependency explicit in Kconfig ASoC: SOF: Intel: Update ADL descriptor to use ACPI power states ASoC: rt5631: Fix regcache sync errors on resume ALSA: pcm: Call substream ack() method upon compat mmap commit ...
2021-07-23Merge tag 'mac80211-for-net-2021-07-23' of ↵David S. Miller8-53/+95
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Couple of fixes: * fix aggregation on mesh * fix late enabling of 4-addr mode * leave monitor SKBs with some headroom * limit band information for old applications * fix virt-wifi WARN_ON * fix memory leak in cfg80211 BSS list maintenance
2021-07-23NIU: fix incorrect error return, missed in previous revertPaul Jakma1-1/+2
Commit 7930742d6, reverting 26fd962, missed out on reverting an incorrect change to a return value. The niu_pci_vpd_scan_props(..) == 1 case appears to be a normal path - treating it as an error and return -EINVAL was breaking VPD_SCAN and causing the driver to fail to load. Fix, so my Neptune card works again. Cc: Kangjie Lu <[email protected]> Cc: Shannon Nelson <[email protected]> Cc: David S. Miller <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: stable <[email protected]> Fixes: 7930742d ('Revert "niu: fix missing checks of niu_pci_eeprom_read"') Signed-off-by: Paul Jakma <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23net: qrtr: fix memory leaksPavel Skripkin1-1/+5
Syzbot reported memory leak in qrtr. The problem was in unputted struct sock. qrtr_local_enqueue() function calls qrtr_port_lookup() which takes sock reference if port was found. Then there is the following check: if (!ipc || &ipc->sk == skb->sk) { ... return -ENODEV; } Since we should drop the reference before returning from this function and ipc can be non-NULL inside this if, we should add qrtr_port_put() inside this if. The similar corner case is in qrtr_endpoint_post() as Manivannan reported. In case of sock_queue_rcv_skb() failure we need to put port reference to avoid leaking struct sock pointer. Fixes: e04df98adf7d ("net: qrtr: Remove receive worker") Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Reported-and-tested-by: [email protected] Signed-off-by: Pavel Skripkin <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller6-10/+41
Pablo Neira Ayusosays: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Memleak in commit audit error path, from Dongliang Mu. 2) Avoid possible false sharing for flowtable timeout updates and nft_last use. 3) Adjust conntrack timestamp due to garbage collection delay, from Florian Westphal. 4) Fix nft_nat without layer 3 address for the inet family. 5) Fix compilation warning in nfnl_hook when ingress support is disabled, from Arnd Bergmann. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-07-23octeontx2-af: Fix uninitialized variables in rvu_switchSubbaraya Sundeep2-7/+10
Get the number of VFs of a PF correctly by calling rvu_get_pf_numvfs in rvu_switch_disable function. Also hwvf is not required hence remove it. Fixes: 23109f8dd06d ("octeontx2-af: Introduce internal packet switching") Reported-by: kernel test robot <[email protected]> Reported-by: Colin Ian King <[email protected]> Signed-off-by: Subbaraya Sundeep <[email protected]> Signed-off-by: Sunil Goutham <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23loop: reintroduce global lock for safe loop_validate_file() traversalTetsuo Handa1-31/+97
Commit 6cc8e7430801fa23 ("loop: scale loop device by introducing per device lock") re-opened a race window for NULL pointer dereference at loop_validate_file() where commit 310ca162d779efee ("block/loop: Use global lock for ioctl() operation.") has closed. Although we need to guarantee that other loop devices will not change during traversal, we can't take remote "struct loop_device"->lo_mutex inside loop_validate_file() in order to avoid AB-BA deadlock. Therefore, introduce a global lock dedicated for loop_validate_file() which is conditionally taken before local "struct loop_device"->lo_mutex is taken. Signed-off-by: Tetsuo Handa <[email protected]> Fixes: 6cc8e7430801fa23 ("loop: scale loop device by introducing per device lock") Signed-off-by: Jens Axboe <[email protected]>
2021-07-23wwan: core: Fix missing RTM_NEWLINK event for default linkLoic Poulain1-0/+2
A wwan link created via the wwan_create_default_link procedure is never notified to the user (RTM_NEWLINK), causing issues with user tools relying on such event to track network links (NetworkManager). This is because the procedure misses a call to rtnl_configure_link(), which sets the link as initialized and notifies the new link (cf proper usage in __rtnl_newlink()). Cc: [email protected] Fixes: ca374290aaad ("wwan: core: support default netdev creation") Suggested-by: Sergey Ryazanov <[email protected]> Signed-off-by: Loic Poulain <[email protected]> Acked-by: Sergey Ryazanov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23net: dsa: mv88e6xxx: silently accept the deletion of VID 0 tooVladimir Oltean1-1/+1
The blamed commit modified the driver to accept the addition of VID 0 without doing anything, but deleting that VID still fails: [ 32.080780] mv88e6085 d0032004.mdio-mii:10 lan8: failed to kill vid 0081/0 Modify mv88e6xxx_port_vlan_leave() to do the same thing as the addition. Fixes: b8b79c414eca ("net: dsa: mv88e6xxx: Fix adding vlan 0") Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-07-23ipv6: decrease hop limit counter in ip6_forward()Kangmin Park1-2/+3
Decrease hop limit counter when deliver skb to ndp proxy. Signed-off-by: Kangmin Park <[email protected]> Signed-off-by: David S. Miller <[email protected]>