aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-06-17docs/LoongArch: Fix notes rendering by using reST directivesYanteng Si2-15/+22
Notes are better expressed with reST admonitions. Fixes: 0ea8ce61cb2c ("Documentation: LoongArch: Add basic documentations") Reviewed-by: WANG Xuerui <[email protected]> Signed-off-by: Yanteng Si <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2022-06-17LoongArch: vmlinux.lds.S: Add missing ELF_DETAILSYouling Tang1-0/+1
Commit c604abc3f6e ("vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG") splits ELF_DETAILS from STABS_DEBUG, resulting in missing ELF_DETAILS information in LoongArch architecture, so add it. Fixes: c604abc3f6e ("vmlinux.lds.h: Split ELF_DETAILS from STABS_DEBUG") Signed-off-by: Youling Tang <[email protected]> Signed-off-by: Huacai Chen <[email protected]>
2022-06-17block: freeze the queue earlier in del_gendiskChristoph Hellwig1-2/+1
Freeze the queue earlier in del_gendisk so that the state does not change while we remove debugfs and sysfs files. Ming mentioned that being able to observer request in debugfs might be useful while the queue is being frozen in del_gendisk, which is made possible by this change. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-17block: remove per-disk debugfs files in blk_unregister_queueChristoph Hellwig4-25/+8
The block debugfs files are created in blk_register_queue, which is called by add_disk and use a naming scheme based on the disk_name. After del_gendisk returns that name can be reused and thus we must not leave these debugfs files around, otherwise the kernel is unhappy and spews messages like: Directory XXXXX with parent 'block' already present! and the newly created devices will not have working debugfs files. Move the unregistration to blk_unregister_queue instead (which matches the sysfs unregistration) to make sure the debugfs life time rules match those of the disk name. As part of the move also make sure the whole debugfs unregistration is inside a single debugfs_mutex critical section. Note that this breaks blktests block/002, which checks that the debugfs directory has not been removed while blktests is running, but that particular check should simply be removed from the test case. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-17block: serialize all debugfs operations using q->debugfs_mutexChristoph Hellwig8-29/+52
Various places like I/O schedulers or the QOS infrastructure try to register debugfs files on demans, which can race with creating and removing the main queue debugfs directory. Use the existing debugfs_mutex to serialize all debugfs operations that rely on q->debugfs_dir or the directories hanging off it. To make the teardown code a little simpler declare all debugfs dentry pointers and not just the main one uncoditionally in blkdev.h. Move debugfs_mutex next to the dentries that it protects and document what it is used for. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-17block: disable the elevator int del_gendiskChristoph Hellwig2-41/+11
The elevator is only used for file system requests, which are stopped in del_gendisk. Move disabling the elevator and freeing the scheduler tags to the end of del_gendisk instead of doing that work in disk_release and blk_cleanup_queue to avoid a use after free on q->tag_set from disk_release as the tag_set might not be alive at that point. Move the blk_qos_exit call as well, as it just depends on the elevator exit and would be the only reason to keep the not exactly cheap queue freeze in disk_release. Fixes: e155b0c238b2 ("blk-mq: Use shared tags for shared sbitmap support") Reported-by: [email protected] Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2022-06-17riscv: Fix ALT_THEAD_PMA's asm parametersNathan Chancellor1-7/+7
After commit a35707c3d850 ("riscv: add memory-type errata for T-Head"), builds with LLVM's integrated assembler fail like: In file included from arch/riscv/kernel/asm-offsets.c:10: In file included from ./include/linux/mm.h:29: In file included from ./include/linux/pgtable.h:6: In file included from ./arch/riscv/include/asm/pgtable.h:114: ./arch/riscv/include/asm/pgtable-64.h:210:2: error: invalid input constraint '0' in asm ALT_THEAD_PMA(prot_val); ^ ./arch/riscv/include/asm/errata_list.h:88:4: note: expanded from macro 'ALT_THEAD_PMA' : "0"(_val), \ ^ This was reported upstream to LLVM where Jessica pointed out a couple of issues with the existing implementation of ALT_THEAD_PMA: * t3 is modified but not listed in the clobbers list. * "+r"(_val) marks _val as both an input and output of the asm but then "0"(_val) marks _val as an input matching constraint, which does not make much sense in this situation, as %1 is not actually used in the asm and matching constraints are designed to be used for different inputs that need to use the same register. Drop the matching contraint and shift all the operands by one, as %1 is unused, and mark t3 as clobbered. This resolves the build error and goes not cause any problems with GNU as. Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head") Link: https://github.com/ClangBuiltLinux/linux/issues/1641 Link: https://github.com/llvm/llvm-project/issues/55514 Link: https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html Suggested-by: Jessica Clarke <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Reviewed-by: Heiko Stuebner <[email protected]> Tested-by: Heiko Stuebner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-06-17io_uring: recycle provided buffer if we punt to io-wqJens Axboe1-0/+1
io_arm_poll_handler() will recycle the buffer appropriately if we end up arming poll (or if we're ready to retry), but not for the io-wq case if we have attempted poll first. Explicitly recycle the buffer to avoid both hanging on to it too long, but also to avoid multiple reads grabbing the same one. This can happen for ring mapped buffers, since it hasn't necessarily been committed. Fixes: c7fb19428d67 ("io_uring: add support for ring mapped supplied buffers") Link: https://github.com/axboe/liburing/issues/605 Signed-off-by: Jens Axboe <[email protected]>
2022-06-17ipv4: ping: fix bind address validity checkRiccardo Paolo Bestetti2-3/+40
Commit 8ff978b8b222 ("ipv4/raw: support binding to nonlocal addresses") introduced a helper function to fold duplicated validity checks of bind addresses into inet_addr_valid_or_nonlocal(). However, this caused an unintended regression in ping_check_bind_addr(), which previously would reject binding to multicast and broadcast addresses, but now these are both incorrectly allowed as reported in [1]. This patch restores the original check. A simple reordering is done to improve readability and make it evident that multicast and broadcast addresses should not be allowed. Also, add an early exit for INADDR_ANY which replaces lost behavior added by commit 0ce779a9f501 ("net: Avoid unnecessary inet_addr_type() call when addr is INADDR_ANY"). Furthermore, this patch introduces regression selftests to catch these specific cases. [1] https://lore.kernel.org/netdev/CANP3RGdkAcDyAZoT1h8Gtuu0saq+eOrrTiWbxnOs+5zn+cpyKg@mail.gmail.com/ Fixes: 8ff978b8b222 ("ipv4/raw: support binding to nonlocal addresses") Cc: Miaohe Lin <[email protected]> Reported-by: Maciej Żenczykowski <[email protected]> Signed-off-by: Carlos Llamas <[email protected]> Signed-off-by: Riccardo Paolo Bestetti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-17hamradio: 6pack: fix array-index-out-of-bounds in decode_std_command()Xu Jia1-1/+8
Hulk Robot reports incorrect sp->rx_count_cooked value in decode_std_command(). This should be caused by the subtracting from sp->rx_count_cooked before. It seems that sp->rx_count_cooked value is changed to 0, which bypassed the previous judgment. The situation is shown below: (Thread 1) | (Thread 2) decode_std_command() | resync_tnc() ... | if (rest == 2) | sp->rx_count_cooked -= 2; | else if (rest == 3) | ... | sp->rx_count_cooked = 0; sp->rx_count_cooked -= 1; | for (i = 0; i < sp->rx_count_cooked; i++) // report error checksum += sp->cooked_buf[i]; sp->rx_count_cooked is a shared variable but is not protected by a lock. The same applies to sp->rx_count. This patch adds a lock to fix the bug. The fail log is shown below: ======================================================================= UBSAN: array-index-out-of-bounds in drivers/net/hamradio/6pack.c:925:31 index 400 is out of range for type 'unsigned char [400]' CPU: 3 PID: 7433 Comm: kworker/u10:1 Not tainted 5.18.0-rc5-00163-g4b97bac0756a #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Workqueue: events_unbound flush_to_ldisc Call Trace: <TASK> dump_stack_lvl+0xcd/0x134 ubsan_epilogue+0xb/0x50 __ubsan_handle_out_of_bounds.cold+0x62/0x6c sixpack_receive_buf+0xfda/0x1330 tty_ldisc_receive_buf+0x13e/0x180 tty_port_default_receive_buf+0x6d/0xa0 flush_to_ldisc+0x213/0x3f0 process_one_work+0x98f/0x1620 worker_thread+0x665/0x1080 kthread+0x2e9/0x3a0 ret_from_fork+0x1f/0x30 ... Reported-by: Hulk Robot <[email protected]> Signed-off-by: Xu Jia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-17tipc: fix use-after-free Read in tipc_named_reinitHoang Le1-2/+1
syzbot found the following issue on: ================================================================== BUG: KASAN: use-after-free in tipc_named_reinit+0x94f/0x9b0 net/tipc/name_distr.c:413 Read of size 8 at addr ffff88805299a000 by task kworker/1:9/23764 CPU: 1 PID: 23764 Comm: kworker/1:9 Not tainted 5.18.0-rc4-syzkaller-00878-g17d49e6e8012 #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: events tipc_net_finalize_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0xeb/0x495 mm/kasan/report.c:313 print_report mm/kasan/report.c:429 [inline] kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491 tipc_named_reinit+0x94f/0x9b0 net/tipc/name_distr.c:413 tipc_net_finalize+0x234/0x3d0 net/tipc/net.c:138 process_one_work+0x996/0x1610 kernel/workqueue.c:2289 worker_thread+0x665/0x1080 kernel/workqueue.c:2436 kthread+0x2e9/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298 </TASK> [...] ================================================================== In the commit d966ddcc3821 ("tipc: fix a deadlock when flushing scheduled work"), the cancel_work_sync() function just to make sure ONLY the work tipc_net_finalize_work() is executing/pending on any CPU completed before tipc namespace is destroyed through tipc_exit_net(). But this function is not guaranteed the work is the last queued. So, the destroyed instance may be accessed in the work which will try to enqueue later. In order to completely fix, we re-order the calling of cancel_work_sync() to make sure the work tipc_net_finalize_work() was last queued and it must be completed by calling cancel_work_sync(). Reported-by: [email protected] Fixes: d966ddcc3821 ("tipc: fix a deadlock when flushing scheduled work") Acked-by: Jon Maloy <[email protected]> Signed-off-by: Ying Xue <[email protected]> Signed-off-by: Hoang Le <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-17veth: Add updating of trans_startJay Vosburgh1-0/+4
Since commit 21a75f0915dd ("bonding: Fix ARP monitor validation"), the bonding ARP / ND link monitors depend on the trans_start time to determine link availability. NETIF_F_LLTX drivers must update trans_start directly, which veth does not do. This prevents use of the ARP or ND link monitors with veth interfaces in a bond. Resolve this by having veth_xmit update the trans_start time. Reported-by: Jonathan Toppins <[email protected]> Tested-by: Jonathan Toppins <[email protected]> Signed-off-by: Jay Vosburgh <[email protected]> Fixes: 21a75f0915dd ("bonding: Fix ARP monitor validation") Link: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: David S. Miller <[email protected]>
2022-06-17net: fix data-race in dev_isalive()Eric Dumazet2-10/+16
dev_isalive() is called under RTNL or dev_base_lock protection. This means that changes to dev->reg_state should be done with both locks held. syzbot reported: BUG: KCSAN: data-race in register_netdevice / type_show write to 0xffff888144ecf518 of 1 bytes by task 20886 on cpu 0: register_netdevice+0xb9f/0xdf0 net/core/dev.c:10050 lapbeth_new_device drivers/net/wan/lapbether.c:414 [inline] lapbeth_device_event+0x4a0/0x6c0 drivers/net/wan/lapbether.c:456 notifier_call_chain kernel/notifier.c:87 [inline] raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:455 __dev_notify_flags+0x1d6/0x3a0 dev_change_flags+0xa2/0xc0 net/core/dev.c:8607 do_setlink+0x778/0x2230 net/core/rtnetlink.c:2780 __rtnl_newlink net/core/rtnetlink.c:3546 [inline] rtnl_newlink+0x114c/0x16a0 net/core/rtnetlink.c:3593 rtnetlink_rcv_msg+0x811/0x8c0 net/core/rtnetlink.c:6089 netlink_rcv_skb+0x13e/0x240 net/netlink/af_netlink.c:2501 rtnetlink_rcv+0x18/0x20 net/core/rtnetlink.c:6107 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x58a/0x660 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x661/0x750 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] __sys_sendto+0x21e/0x2c0 net/socket.c:2119 __do_sys_sendto net/socket.c:2131 [inline] __se_sys_sendto net/socket.c:2127 [inline] __x64_sys_sendto+0x74/0x90 net/socket.c:2127 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 read to 0xffff888144ecf518 of 1 bytes by task 20423 on cpu 1: dev_isalive net/core/net-sysfs.c:38 [inline] netdev_show net/core/net-sysfs.c:50 [inline] type_show+0x24/0x90 net/core/net-sysfs.c:112 dev_attr_show+0x35/0x90 drivers/base/core.c:2095 sysfs_kf_seq_show+0x175/0x240 fs/sysfs/file.c:59 kernfs_seq_show+0x75/0x80 fs/kernfs/file.c:162 seq_read_iter+0x2c3/0x8e0 fs/seq_file.c:230 kernfs_fop_read_iter+0xd1/0x2f0 fs/kernfs/file.c:235 call_read_iter include/linux/fs.h:2052 [inline] new_sync_read fs/read_write.c:401 [inline] vfs_read+0x5a5/0x6a0 fs/read_write.c:482 ksys_read+0xe8/0x1a0 fs/read_write.c:620 __do_sys_read fs/read_write.c:630 [inline] __se_sys_read fs/read_write.c:628 [inline] __x64_sys_read+0x3e/0x50 fs/read_write.c:628 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 value changed: 0x00 -> 0x01 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 20423 Comm: udevd Tainted: G W 5.19.0-rc2-syzkaller-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-06-17KVM: arm64: Add Oliver as a reviewerMarc Zyngier1-0/+1
Oliver Upton has agreed to help with reviewing the KVM/arm64 patches, and has been doing so for a while now, so adding him as to the reviewer list. Note that Oliver is using a different email address for this purpose, rather than the one his been using for his other contributions. Signed-off-by: Marc Zyngier <[email protected]> Acked-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-06-17KVM: arm64: Prevent kmemleak from accessing pKVM memoryQuentin Perret1-3/+3
Commit a7259df76702 ("memblock: make memblock_find_in_range method private") changed the API using which memory is reserved for the pKVM hypervisor. However, memblock_phys_alloc() differs from the original API in terms of kmemleak semantics -- the old one didn't report the reserved regions to kmemleak while the new one does. Unfortunately, when protected KVM is enabled, all kernel accesses to pKVM-private memory result in a fatal exception, which can now happen because of kmemleak scans: $ echo scan > /sys/kernel/debug/kmemleak [ 34.991354] kvm [304]: nVHE hyp BUG at: [<ffff800008fa3750>] __kvm_nvhe_handle_host_mem_abort+0x270/0x290! [ 34.991580] kvm [304]: Hyp Offset: 0xfffe8be807e00000 [ 34.991813] Kernel panic - not syncing: HYP panic: [ 34.991813] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 34.991813] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 34.991813] VCPU:0000000000000000 [ 34.993660] CPU: 0 PID: 304 Comm: bash Not tainted 5.19.0-rc2 #102 [ 34.994059] Hardware name: linux,dummy-virt (DT) [ 34.994452] Call trace: [ 34.994641] dump_backtrace.part.0+0xcc/0xe0 [ 34.994932] show_stack+0x18/0x6c [ 34.995094] dump_stack_lvl+0x68/0x84 [ 34.995276] dump_stack+0x18/0x34 [ 34.995484] panic+0x16c/0x354 [ 34.995673] __hyp_pgtable_total_pages+0x0/0x60 [ 34.995933] scan_block+0x74/0x12c [ 34.996129] scan_gray_list+0xd8/0x19c [ 34.996332] kmemleak_scan+0x2c8/0x580 [ 34.996535] kmemleak_write+0x340/0x4a0 [ 34.996744] full_proxy_write+0x60/0xbc [ 34.996967] vfs_write+0xc4/0x2b0 [ 34.997136] ksys_write+0x68/0xf4 [ 34.997311] __arm64_sys_write+0x20/0x2c [ 34.997532] invoke_syscall+0x48/0x114 [ 34.997779] el0_svc_common.constprop.0+0x44/0xec [ 34.998029] do_el0_svc+0x2c/0xc0 [ 34.998205] el0_svc+0x2c/0x84 [ 34.998421] el0t_64_sync_handler+0xf4/0x100 [ 34.998653] el0t_64_sync+0x18c/0x190 [ 34.999252] SMP: stopping secondary CPUs [ 35.000034] Kernel Offset: disabled [ 35.000261] CPU features: 0x800,00007831,00001086 [ 35.000642] Memory Limit: none [ 35.001329] ---[ end Kernel panic - not syncing: HYP panic: [ 35.001329] PS:600003c9 PC:0000f418011a3750 ESR:00000000f2000800 [ 35.001329] FAR:ffff000439200000 HPFAR:0000000004792000 PAR:0000000000000000 [ 35.001329] VCPU:0000000000000000 ]--- Fix this by explicitly excluding the hypervisor's memory pool from kmemleak like we already do for the hyp BSS. Cc: Mike Rapoport <[email protected]> Fixes: a7259df76702 ("memblock: make memblock_find_in_range method private") Signed-off-by: Quentin Perret <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-06-17ALSA: x86: intel_hdmi_audio: use pm_runtime_resume_and_get()Pierre-Louis Bossart1-2/+8
The current code does not check for errors and does not release the reference on errors. Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Ranjani Sridharan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-06-17ALSA: x86: intel_hdmi_audio: enable pm_runtime and set autosuspend delayPierre-Louis Bossart1-0/+5
The existing code uses pm_runtime_get_sync/put_autosuspend, but pm_runtime was not explicitly enabled. The autosuspend delay was not set either, the value is set to 5s since HDMI is rather painful to resume. Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Reviewed-by: Ranjani Sridharan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-06-17ALSA: hda: intel-nhlt: remove use of __func__ in dev_dbgPierre-Louis Bossart1-9/+8
The module and function information can be added with 'modprobe foo dyndbg=+pmf' Suggested-by: Greg KH <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Ranjani Sridharan <[email protected]> Reviewed-by: Péter Ujfalusi <[email protected]> Reviewed-by: Bard Liao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-06-17ALSA: hda: intel-dspcfg: use SOF for UpExtreme and UpExtreme11 boardsPierre-Louis Bossart1-0/+12
The UpExtreme BIOS reports microphones that are not physically present, so this module ends-up selecting SOF, while the UpExtreme11 BIOS does not report microphones so the snd-hda-intel driver is selected. For consistency use SOF unconditionally in autodetection mode. The use of the snd-hda-intel driver can still be enabled with 'options snd-intel-dspcfg dsp_driver=1' Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Bard Liao <[email protected]> Reviewed-by: Péter Ujfalusi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-06-17firewire: convert sysfs sprintf/snprintf family to sysfs_emitJiapeng Chong1-4/+2
Fix the following coccicheck warning: ./drivers/firewire/core-device.c:375:8-16: WARNING: use scnprintf or sprintf. Reported-by: Abaci Robot<[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Signed-off-by: Takashi Sakamoto <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-06-17firewire: cdev: fix potential leak of kernel stack due to uninitialized valueTakashi Sakamoto1-1/+1
Recent change brings potential leak of value on kernel stack to userspace due to uninitialized value. This commit fixes the bug. Reported-by: Dan Carpenter <[email protected]> Fixes: baa914cd81f5 ("firewire: add kernel API to access CYCLE_TIME register") Signed-off-by: Takashi Sakamoto <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2022-06-17ata: libata: add qc->flags in ata_qc_complete_template tracepointEdward Wu1-0/+1
Add flags value to check the result of ata completion Fixes: 255c03d15a29 ("libata: Add tracepoints") Cc: [email protected] Signed-off-by: Edward Wu <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
2022-06-16Merge tag 'drm-fixes-2022-06-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds17-115/+141
Pull drm fixes from Dave Airlie: "Regular drm fixes for rc3. Nothing too serious, i915, amdgpu and exynos all have a few small driver fixes, and two ttm fixes, and one compiler warning. atomic: - fix spurious compiler warning ttm: - add NULL ptr check in swapout code - fix bulk move handling i915: - Fix page fault on error state read - Fix memory leaks in per-gt sysfs - Fix multiple fence handling - Remove accidental static from a local variable amdgpu: - Fix regression in GTT size reporting - OLED backlight fix exynos: - Check a null pointer instead of IS_ERR() - Rework initialization code of Exynos MIC driver" * tag 'drm-fixes-2022-06-17' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Cap OLED brightness per max frame-average luminance drm/amdgpu: Fix GTT size reporting in amdgpu_ioctl drm/exynos: mic: Rework initialization drm/exynos: fix IS_ERR() vs NULL check in probe drm/ttm: fix bulk move handling v2 drm/i915/uc: remove accidental static from a local variable drm/i915: Individualize fences before adding to dma_resv obj drm/i915/gt: Fix memory leaks in per-gt sysfs drm/i915/reset: Fix error_state_read ptr + offset use drm/ttm: fix missing NULL check in ttm_device_swapout drm/atomic: fix warning of unused variable
2022-06-16phy: aquantia: Fix AN when higher speeds than 1G are not advertisedClaudiu Manoil1-1/+14
Even when the eth port is resticted to work with speeds not higher than 1G, and so the eth driver is requesting the phy (via phylink) to advertise up to 1000BASET support, the aquantia phy device is still advertising for 2.5G and 5G speeds. Clear these advertising defaults when requested. Cc: Ondrej Spacek <[email protected]> Fixes: 09c4c57f7bc41 ("net: phy: aquantia: add support for auto-negotiation configuration") Signed-off-by: Claudiu Manoil <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-06-16Merge branch 'bpf: Fix cookie values for kprobe multi'Alexei Starovoitov5-68/+110
Jiri Olsa says: ==================== hi, there's bug in kprobe_multi link that makes cookies misplaced when using symbols to attach. The reason is that we sort symbols by name but not adjacent cookie values. Current test did not find it because bpf_fentry_test* are already sorted by name. v3 changes: - fixed kprobe_multi bench test to filter out invalid entries from available_filter_functions v2 changes: - rebased on top of bpf/master - checking if cookies are defined later in swap function [Andrii] - added acks thanks, jirka ==================== Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16selftest/bpf: Fix kprobe_multi bench testJiri Olsa1-0/+3
With [1] the available_filter_functions file contains records starting with __ftrace_invalid_address___ and marking disabled entries. We need to filter them out for the bench test to pass only resolvable symbols to kernel. [1] commit b39181f7c690 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function") Fixes: b39181f7c690 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function") Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16bpf: Force cookies array to follow symbols sortingJiri Olsa1-15/+45
When user specifies symbols and cookies for kprobe_multi link interface it's very likely the cookies will be misplaced and returned to wrong functions (via get_attach_cookie helper). The reason is that to resolve the provided functions we sort them before passing them to ftrace_lookup_symbols, but we do not do the same sort on the cookie values. Fixing this by using sort_r function with custom swap callback that swaps cookie values as well. Fixes: 0236fec57a15 ("bpf: Resolve symbols with ftrace_lookup_symbols for kprobe multi link") Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16ftrace: Keep address offset in ftrace_lookup_symbolsJiri Olsa1-2/+11
We want to store the resolved address on the same index as the symbol string, because that's the user (bpf kprobe link) code assumption. Also making sure we don't store duplicates that might be present in kallsyms. Acked-by: Song Liu <[email protected]> Acked-by: Steven Rostedt (Google) <[email protected]> Fixes: bed0d9a50dac ("ftrace: Add ftrace_lookup_symbols function") Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16selftests/bpf: Shuffle cookies symbols in kprobe multi testJiri Olsa2-51/+51
There's a kernel bug that causes cookies to be misplaced and the reason we did not catch this with this test is that we provide bpf_fentry_test* functions already sorted by name. Shuffling function bpf_fentry_test2 deeper in the list and keeping the current cookie values as before will trigger the bug. The kernel fix is coming in following changes. Acked-by: Song Liu <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-16mailmap: add entry for Christian MarangiChristian Marangi1-0/+1
Add entry to map [email protected] to the unique identity of Christian Marangi. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Christian Marangi <[email protected]> Cc: Jens Axboe <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm/memory-failure: disable unpoison once hw error happenszhenwei pi6-4/+18
Currently unpoison_memory(unsigned long pfn) is designed for soft poison(hwpoison-inject) only. Since 17fae1294ad9d, the KPTE gets cleared on a x86 platform once hardware memory corrupts. Unpoisoning a hardware corrupted page puts page back buddy only, the kernel has a chance to access the page with *NOT PRESENT* KPTE. This leads BUG during accessing on the corrupted KPTE. Suggested by David&Naoya, disable unpoison mechanism when a real HW error happens to avoid BUG like this: Unpoison: Software-unpoisoned page 0x61234 BUG: unable to handle page fault for address: ffff888061234000 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 2c01067 P4D 2c01067 PUD 107267063 PMD 10382b063 PTE 800fffff9edcb062 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 26551 Comm: stress Kdump: loaded Tainted: G M OE 5.18.0.bm.1-amd64 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ... RIP: 0010:clear_page_erms+0x7/0x10 Code: ... RSP: 0000:ffffc90001107bc8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000901 RCX: 0000000000001000 RDX: ffffea0001848d00 RSI: ffffea0001848d40 RDI: ffff888061234000 RBP: ffffea0001848d00 R08: 0000000000000901 R09: 0000000000001276 R10: 0000000000000003 R11: 0000000000000000 R12: 0000000000000001 R13: 0000000000000000 R14: 0000000000140dca R15: 0000000000000001 FS: 00007fd8b2333740(0000) GS:ffff88813fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff888061234000 CR3: 00000001023d2005 CR4: 0000000000770ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> prep_new_page+0x151/0x170 get_page_from_freelist+0xca0/0xe20 ? sysvec_apic_timer_interrupt+0xab/0xc0 ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 __alloc_pages+0x17e/0x340 __folio_alloc+0x17/0x40 vma_alloc_folio+0x84/0x280 __handle_mm_fault+0x8d4/0xeb0 handle_mm_fault+0xd5/0x2a0 do_user_addr_fault+0x1d0/0x680 ? kvm_read_and_reset_apf_flags+0x3b/0x50 exc_page_fault+0x78/0x170 asm_exc_page_fault+0x27/0x30 Link: https://lkml.kernel.org/r/[email protected] Fixes: 847ce401df392 ("HWPOISON: Add unpoisoning support") Fixes: 17fae1294ad9d ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned") Signed-off-by: zhenwei pi <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Reviewed-by: Miaohe Lin <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: <[email protected]> [5.8+] Signed-off-by: Andrew Morton <[email protected]>
2022-06-16hugetlbfs: zero partial pages during fallocate hole punchMike Kravetz1-15/+53
hugetlbfs fallocate support was originally added with commit 70c3547e36f5 ("hugetlbfs: add hugetlbfs_fallocate()"). Initial support only operated on whole hugetlb pages. This makes sense for populating files as other interfaces such as mmap and truncate require hugetlb page size alignment. Only operating on whole hugetlb pages for the hole punch case was a simplification and there was no compelling use case to zero partial pages. In a recent discussion[1] it was assumed that hugetlbfs hole punch would zero partial hugetlb pages as that is in line with the man page description saying 'partial filesystem blocks are zeroed'. However, the hugetlbfs hole punch code actually does this: hole_start = round_up(offset, hpage_size); hole_end = round_down(offset + len, hpage_size); Modify code to zero partial hugetlb pages in hole punch range. It is possible that application code could note a change in behavior. However, that would imply the code is passing in an unaligned range and expecting only whole pages be removed. This is unlikely as the fallocate documentation states the opposite. The current hugetlbfs fallocate hole punch behavior is tested with the libhugetlbfs test fallocate_align[2]. This test will be updated to validate partial page zeroing. [1] https://lore.kernel.org/linux-mm/[email protected]/ [2] https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/fallocate_align.c Link: https://lkml.kernel.org/r/YqeiMlZDKI1Kabfe@monkey Signed-off-by: Mike Kravetz <[email protected]> Reviewed-by: Muchun Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.pyYang Yang1-1/+1
There is no slabinfo.py in tools/cgroup, but has memcg_slabinfo.py instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Yang <[email protected]> Reviewed-by: Muchun Song <[email protected]> Acked-by: Roman Gushchin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm: re-allow pinning of zero pfnsAlex Williamson1-1/+1
The commit referenced below subtly and inadvertently changed the logic to disallow pinning of zero pfns. This breaks device assignment with vfio and potentially various other users of gup. Exclude the zero page test from the negation. Link: https://lkml.kernel.org/r/165490039431.944052.12458624139225785964.stgit@omen Fixes: 1c563432588d ("mm: fix is_pinnable_page against a cma page") Signed-off-by: Alex Williamson <[email protected]> Acked-by: Minchan Kim <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reported-by: Yishai Hadas <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: John Hubbard <[email protected]> Cc: John Dias <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Zhangfei Gao <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Joao Martins <[email protected]> Cc: Yi Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm/kfence: select random number before taking raw lockJason A. Donenfeld1-2/+5
The RNG uses vanilla spinlocks, not raw spinlocks, so kfence should pick its random numbers before taking its raw spinlocks. This also has the nice effect of doing less work inside the lock. It should fix a splat that Geert saw with CONFIG_PROVE_RAW_LOCK_NESTING: dump_backtrace.part.0+0x98/0xc0 show_stack+0x14/0x28 dump_stack_lvl+0xac/0xec dump_stack+0x14/0x2c __lock_acquire+0x388/0x10a0 lock_acquire+0x190/0x2c0 _raw_spin_lock_irqsave+0x6c/0x94 crng_make_state+0x148/0x1e4 _get_random_bytes.part.0+0x4c/0xe8 get_random_u32+0x4c/0x140 __kfence_alloc+0x460/0x5c4 kmem_cache_alloc_trace+0x194/0x1dc __kthread_create_on_node+0x5c/0x1a8 kthread_create_on_node+0x58/0x7c printk_start_kthread.part.0+0x34/0xa8 printk_activate_kthreads+0x4c/0x54 do_one_initcall+0xec/0x278 kernel_init_freeable+0x11c/0x214 kernel_init+0x24/0x124 ret_from_fork+0x10/0x20 Link: https://lkml.kernel.org/r/[email protected] Fixes: d4150779e60f ("random32: use real rng for non-deterministic randomness") Signed-off-by: Jason A. Donenfeld <[email protected]> Reported-by: Geert Uytterhoeven <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Marco Elver <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Cc: John Ogness <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Dmitry Vyukov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16MAINTAINERS: add maillist information for LoongArchHuacai Chen1-0/+1
Now there is a dedicated maillist ([email protected]) for LoongArch, add it for better collaboration. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Huacai Chen <[email protected]> Reviewed-by: WANG Xuerui <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Xuefeng Li <[email protected]> Cc: Guo Ren <[email protected]> Cc: Xuerui Wang <[email protected]> Cc: Jiaxun Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16MAINTAINERS: update MM tree referencesAndrew Morton1-3/+2
Describe the new kernel.org location of the MM trees. Suggested-by: David Hildenbrand <[email protected]> Cc: Muchun Song <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16MAINTAINERS: update Abel Vesa's emailAbel Vesa2-1/+3
Use Abel Vesa's kernel.org account in maintainer entry and mailmap. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Abel Vesa <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Dong Aisheng <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewerDavid Hildenbrand1-0/+12
There are certainly a lot more files that partially fall into the memory hot(un)plug category, including parts of mm/sparse.c, mm/page_isolation.c and mm/page_alloc.c. Let's only add what's almost completely memory hot(un)plug related. Add myself as reviewer so it's easier for contributors to figure out whom to CC. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/YqlaE/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Acked-by: Muchun Song <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Cc: Miaohe Lin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16MAINTAINERS: add Miaohe Lin as a memory-failure reviewerMiaohe Lin1-0/+1
I have been focusing on mm for the past two years. e.g. fixing bugs, cleaning up the code and reviewing. I would like to help maintainers and people working on memory-failure by reviewing their work. Let me be Cc'd on patches related to memory-failure. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Acked-by: Naoya Horiguchi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mailmap: add alias for [email protected]Jarkko Sakkinen1-0/+1
Add alias for patches that I contribute on behalf of Profian (my current employer). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jarkko Sakkinen <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is ↵SeongJae Park1-0/+8
initialized Commit 059342d1dd4e ("mm/damon/reclaim: fix the timer always stays active") made DAMON_RECLAIM's 'enabled' parameter store callback, 'enabled_store()', to schedule 'damon_reclaim_timer'. The scheduling uses 'system_wq', which is initialized in 'workqueue_init_early()'. As kernel parameters parsing function ('parse_args()') is called before 'workqueue_init_early()', 'enabled_store()' can be executed before 'workqueue_init_early()' and end up accessing the uninitialized 'system_wq'. As a result, the booting hang[1]. This commit fixes the issue by checking if the initialization is done before scheduling the timer. [1] https://lkml.kernel.org/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 059342d1dd4e ("mm/damon/reclaim: fix the timer always stays active") Signed-off-by: SeongJae Park <[email protected]> Reported-by: Greg White <[email protected]> Cc: Hailong Tu <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16kthread: make it clear that kthread_create_on_node() might be terminated by ↵Petr Mladek1-7/+7
any fatal signal The comments in kernel/kthread.c create a feeling that only SIGKILL is able to terminate the creation of kernel kthreads by kthread_create()/_on_node()/_on_cpu() APIs. In reality, wait_for_completion_killable() might be killed by any fatal signal that does not have a custom handler: (!siginmask(signr, SIG_KERNEL_IGNORE_MASK|SIG_KERNEL_STOP_MASK) && \ (t)->sighand->action[(signr)-1].sa.sa_handler == SIG_DFL) static inline void signal_wake_up(struct task_struct *t, bool resume) { signal_wake_up_state(t, resume ? TASK_WAKEKILL : 0); } static void complete_signal(int sig, struct task_struct *p, enum pid_type type) { [...] /* * Found a killable thread. If the signal will be fatal, * then start taking the whole group down immediately. */ if (sig_fatal(p, sig) ...) { if (!sig_kernel_coredump(sig)) { [...] do { task_clear_jobctl_pending(t, JOBCTL_PENDING_MASK); sigaddset(&t->pending.signal, SIGKILL); signal_wake_up(t, 1); } while_each_thread(p, t); return; } } } Update the comments in kernel/kthread.c to make this more obvious. The motivation for this change was debugging why a module initialization failed. The module was being loaded from initrd. It "magically" failed when systemd was switching to the real root. The clean up operations sent SIGTERM to various pending processed that were started from initrd. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]> Reviewed-by: "Eric W. Biederman" <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Kees Cook <[email protected]> Cc: Marco Elver <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm: lru_cache_disable: use synchronize_rcu_expeditedMarcelo Tosatti1-1/+1
commit ff042f4a9b050 ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu") replaced lru_cache_disable's usage of work queues with synchronize_rcu. Some users reported large performance regressions due to this commit, for example: https://lore.kernel.org/all/20220521234616.GO1790663@paulmck-ThinkPad-P17-Gen-1/T/ Switching to synchronize_rcu_expedited fixes the problem. Link: https://lkml.kernel.org/r/YpToHCmnx/[email protected] Fixes: ff042f4a9b050 ("mm: lru_cache_disable: replace work queue synchronization with synchronize_rcu") Signed-off-by: Marcelo Tosatti <[email protected]> Tested-by: Stefan Wahren <[email protected]> Tested-by: Michael Larabel <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Nicolas Saenz Julienne <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Phil Elwell <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16mm/page_isolation.c: fix one kernel-doc commentYang Li1-0/+2
Remove one warning found by running scripts/kernel-doc, which is caused by using 'make W=1': mm/page_isolation.c:304: warning: Function parameter or member 'skip_isolation' not described in 'isolate_single_pageblock' Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Li <[email protected]> Reported-by: Abaci Robot <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-06-16scsi: ibmvfc: Store vhost pointer during subcrq allocationTyrel Datwyler2-2/+3
Currently the back pointer from a queue to the vhost adapter isn't set until after subcrq interrupt registration. The value is available when a queue is first allocated and can/should be also set for primary and async queues as well as subcrqs. This fixes a crash observed during kexec/kdump on Power 9 with legacy XICS interrupt controller where a pending subcrq interrupt from the previous kernel can be replayed immediately upon IRQ registration resulting in dereference of a garbage backpointer in ibmvfc_interrupt_scsi(). Kernel attempted to read user page (58) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000058 Faulting instruction address: 0xc008000003216a08 Oops: Kernel access of bad area, sig: 11 [#1] ... NIP [c008000003216a08] ibmvfc_interrupt_scsi+0x40/0xb0 [ibmvfc] LR [c0000000082079e8] __handle_irq_event_percpu+0x98/0x270 Call Trace: [c000000047fa3d80] [c0000000123e6180] 0xc0000000123e6180 (unreliable) [c000000047fa3df0] [c0000000082079e8] __handle_irq_event_percpu+0x98/0x270 [c000000047fa3ea0] [c000000008207d18] handle_irq_event+0x98/0x188 [c000000047fa3ef0] [c00000000820f564] handle_fasteoi_irq+0xc4/0x310 [c000000047fa3f40] [c000000008205c60] generic_handle_irq+0x50/0x80 [c000000047fa3f60] [c000000008015c40] __do_irq+0x70/0x1a0 [c000000047fa3f90] [c000000008016d7c] __do_IRQ+0x9c/0x130 [c000000014622f60] [0000000020000000] 0x20000000 [c000000014622ff0] [c000000008016e50] do_IRQ+0x40/0xa0 [c000000014623020] [c000000008017044] replay_soft_interrupts+0x194/0x2f0 [c000000014623210] [c0000000080172a8] arch_local_irq_restore+0x108/0x170 [c000000014623240] [c000000008eb1008] _raw_spin_unlock_irqrestore+0x58/0xb0 [c000000014623270] [c00000000820b12c] __setup_irq+0x49c/0x9f0 [c000000014623310] [c00000000820b7c0] request_threaded_irq+0x140/0x230 [c000000014623380] [c008000003212a50] ibmvfc_register_scsi_channel+0x1e8/0x2f0 [ibmvfc] [c000000014623450] [c008000003213d1c] ibmvfc_init_sub_crqs+0xc4/0x1f0 [ibmvfc] [c0000000146234d0] [c0080000032145a8] ibmvfc_reset_crq+0x150/0x210 [ibmvfc] [c000000014623550] [c0080000032147c8] ibmvfc_init_crq+0x160/0x280 [ibmvfc] [c0000000146235f0] [c00800000321a9cc] ibmvfc_probe+0x2a4/0x530 [ibmvfc] Link: https://lore.kernel.org/r/[email protected] Fixes: 3034ebe26389 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels") Cc: [email protected] Reviewed-by: Brian King <[email protected]> Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2022-06-16scsi: ibmvfc: Allocate/free queue resource only during probe/removeTyrel Datwyler1-17/+62
Currently, the sub-queues and event pool resources are allocated/freed for every CRQ connection event such as reset and LPM. This exposes the driver to a couple issues. First the inefficiency of freeing and reallocating memory that can simply be resued after being sanitized. Further, a system under memory pressue runs the risk of allocation failures that could result in a crippled driver. Finally, there is a race window where command submission/compeletion can try to pull/return elements from/to an event pool that is being deleted or already has been deleted due to the lack of host state around freeing/allocating resources. The following is an example of list corruption following a live partition migration (LPM): Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: vfat fat isofs cdrom ext4 mbcache jbd2 nft_counter nft_compat nf_tables nfnetlink rpadlpar_io rpaphp xsk_diag nfsv3 nfs_acl nfs lockd grace fscache netfs rfkill bonding tls sunrpc pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc scsi_transport_fc ibmveth vmx_crypto dm_multipath dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse CPU: 0 PID: 2108 Comm: ibmvfc_0 Kdump: loaded Not tainted 5.14.0-70.9.1.el9_0.ppc64le #1 NIP: c0000000007c4bb0 LR: c0000000007c4bac CTR: 00000000005b9a10 REGS: c00000025c10b760 TRAP: 0700 Not tainted (5.14.0-70.9.1.el9_0.ppc64le) MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 2800028f XER: 0000000f CFAR: c0000000001f55bc IRQMASK: 0 GPR00: c0000000007c4bac c00000025c10ba00 c000000002a47c00 000000000000004e GPR04: c0000031e3006f88 c0000031e308bd00 c00000025c10b768 0000000000000027 GPR08: 0000000000000000 c0000031e3009dc0 00000031e0eb0000 0000000000000000 GPR12: c0000031e2ffffa8 c000000002dd0000 c000000000187108 c00000020fcee2c0 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000002f81300 GPR24: 5deadbeef0000100 5deadbeef0000122 c000000263ba6910 c00000024cc88000 GPR28: 000000000000003c c0000002430a0000 c0000002430ac300 000000000000c300 NIP [c0000000007c4bb0] __list_del_entry_valid+0x90/0x100 LR [c0000000007c4bac] __list_del_entry_valid+0x8c/0x100 Call Trace: [c00000025c10ba00] [c0000000007c4bac] __list_del_entry_valid+0x8c/0x100 (unreliable) [c00000025c10ba60] [c008000002f42284] ibmvfc_free_queue+0xec/0x210 [ibmvfc] [c00000025c10bb10] [c008000002f4246c] ibmvfc_deregister_scsi_channel+0xc4/0x160 [ibmvfc] [c00000025c10bba0] [c008000002f42580] ibmvfc_release_sub_crqs+0x78/0x130 [ibmvfc] [c00000025c10bc20] [c008000002f4f6cc] ibmvfc_do_work+0x5c4/0xc70 [ibmvfc] [c00000025c10bce0] [c008000002f4fdec] ibmvfc_work+0x74/0x1e8 [ibmvfc] [c00000025c10bda0] [c0000000001872b8] kthread+0x1b8/0x1c0 [c00000025c10be10] [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 40820034 38600001 38210060 4e800020 7c0802a6 7c641b78 3c62fe7a 7d254b78 3863b590 f8010070 4ba309cd 60000000 <0fe00000> 7c0802a6 3c62fe7a 3863b640 ---[ end trace 11a2b65a92f8b66c ]--- ibmvfc 30000003: Send warning. Receive queue closed, will retry. Add registration/deregistration helpers that are called instead during connection resets to sanitize and reconfigure the queues. Link: https://lore.kernel.org/r/[email protected] Fixes: 3034ebe26389 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels") Cc: [email protected] Reviewed-by: Brian King <[email protected]> Signed-off-by: Tyrel Datwyler <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2022-06-16scsi: storvsc: Correct reporting of Hyper-V I/O size limitsSaurabh Sengar1-5/+22
Current code is based on the idea that the max number of SGL entries also determines the max size of an I/O request. While this idea was true in older versions of the storvsc driver when SGL entry length was limited to 4 Kbytes, commit 3d9c3dcc58e9 ("scsi: storvsc: Enable scatterlist entry lengths > 4Kbytes") removed that limitation. It's now theoretically possible for the block layer to send requests that exceed the maximum size supported by Hyper-V. This problem doesn't currently happen in practice because the block layer defaults to a 512 Kbyte maximum, while Hyper-V in Azure supports 2 Mbyte I/O sizes. But some future configuration of Hyper-V could have a smaller max I/O size, and the block layer could exceed that max. Fix this by correctly setting max_sectors as well as sg_tablesize to reflect the maximum I/O size that Hyper-V reports. While allowing I/O sizes larger than the block layer default of 512 Kbytes doesn’t provide any noticeable performance benefit in the tests we ran, it's still appropriate to report the correct underlying Hyper-V capabilities to the Linux block layer. Also tweak the virt_boundary_mask to reflect that the required alignment derives from Hyper-V communication using a 4 Kbyte page size, and not on the guest page size, which might be bigger (eg. ARM64). Link: https://lore.kernel.org/r/[email protected] Fixes: 3d9c3dcc58e9 ("scsi: storvsc: Enable scatter list entry lengths > 4Kbytes") Reviewed-by: Michael Kelley <[email protected]> Signed-off-by: Saurabh Sengar <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2022-06-17Merge tag 'exynos-drm-fixes-v5.19-rc3' of ↵Dave Airlie2-33/+15
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes two regression fixups - Check a null pointer instead of IS_ERR(). - Rework initialization code of Exynos MIC driver. Signed-off-by: Dave Airlie <[email protected]> From: Inki Dae <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-06-16scsi: ufs: Fix a race between the interrupt handler and the reset handlerBart Van Assche1-9/+19
Prevent that both the interrupt handler and the reset handler try to complete a request at the same time. This patch is the result of an analysis of the following crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE 5.10.107-android13-4-00051-g1e48e8970cca-ab8664745 #1 pc : ufshcd_release_scsi_cmd+0x30/0x46c lr : __ufshcd_transfer_req_compl+0x4fc/0x9c0 Call trace: ufshcd_release_scsi_cmd+0x30/0x46c __ufshcd_transfer_req_compl+0x4fc/0x9c0 ufshcd_poll+0xf0/0x208 ufshcd_sl_intr+0xb8/0xf0 ufshcd_intr+0x168/0x2f4 __handle_irq_event_percpu+0xa0/0x30c handle_irq_event+0x84/0x178 handle_fasteoi_irq+0x150/0x2e8 __handle_domain_irq+0x114/0x1e4 gic_handle_irq.31846+0x58/0x300 el1_irq+0xe4/0x1c0 cpuidle_enter_state+0x3ac/0x8c4 do_idle+0x2fc/0x55c cpu_startup_entry+0x84/0x90 kernel_init+0x0/0x310 start_kernel+0x0/0x608 start_kernel+0x4ec/0x608 Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Stanley Chu <[email protected]> Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>