aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-02-22Merge tag 'vfs-6.8-rc6.fixes' of ↵Linus Torvalds11-19/+32
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix a memory leak in cachefiles - Restrict aio cancellations to I/O submitted through the aio interfaces as this is otherwise causing issues for I/O submitted via io_uring - Increase buffer for afs volume status to avoid overflow - Fix a missing zero-length check in unbuffered writes in the netfs library. If generic_write_checks() returns zero make netfs_unbuffered_write_iter() return right away - Prevent a leak in i_dio_count caused by netfs_begin_read() operating past i_size. It will return early and leave i_dio_count incremented - Account for ipv4 addresses as well as ipv6 addresses when processing incoming callbacks in afs * tag 'vfs-6.8-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio afs: Increase buffer size in afs_update_volume_status() afs: Fix ignored callbacks over ipv4 cachefiles: fix memory leak in cachefiles_add_cache() netfs: Fix missing zero-length check in unbuffered write netfs: Fix i_dio_count leak on DIO read past i_size
2024-02-22Merge tag 'net-6.8.0-rc6' of ↵Linus Torvalds75-378/+870
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - af_unix: fix another unix GC hangup Previous releases - regressions: - core: fix a possible AF_UNIX deadlock - bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready() - netfilter: nft_flow_offload: release dst in case direct xmit path is used - bridge: switchdev: ensure MDB events are delivered exactly once - l2tp: pass correct message length to ip6_append_data - dccp/tcp: unhash sk from ehash for tb2 alloc failure after check_estalblished() - tls: fixes for record type handling with PEEK - devlink: fix possible use-after-free and memory leaks in devlink_init() Previous releases - always broken: - bpf: fix an oops when attempting to read the vsyscall page through bpf_probe_read_kernel - sched: act_mirred: use the backlog for mirred ingress - netfilter: nft_flow_offload: fix dst refcount underflow - ipv6: sr: fix possible use-after-free and null-ptr-deref - mptcp: fix several data races - phonet: take correct lock to peek at the RX queue Misc: - handful of fixes and reliability improvements for selftests" * tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) l2tp: pass correct message length to ip6_append_data net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY selftests: ioam: refactoring to align with the fix Fix write to cloned skb in ipv6_hop_ioam() phonet/pep: fix racy skb_queue_empty() use phonet: take correct lock to peek at the RX queue net: sparx5: Add spinlock for frame transmission from CPU net/sched: flower: Add lock protection when remove filter handle devlink: fix port dump cmd type net: stmmac: Fix EST offset for dwmac 5.10 tools: ynl: don't leak mcast_groups on init error tools: ynl: make sure we always pass yarg to mnl_cb_run net: mctp: put sock on tag allocation failure netfilter: nf_tables: use kzalloc for hook allocation netfilter: nf_tables: register hooks last when adding new chain/flowtable netfilter: nft_flow_offload: release dst in case direct xmit path is used netfilter: nft_flow_offload: reset dst in route object after setting up flow netfilter: nf_tables: set dormant flag on hook register failure selftests: tls: add test for peeking past a record of a different type selftests: tls: add test for merging of same-type control messages ...
2024-02-22gpu: host1x: Skip reset assert on Tegra186Mikko Perttunen2-6/+15
On Tegra186, secure world applications may need to access host1x during suspend/resume, and rely on the kernel to keep Host1x out of reset during the suspend cycle. As such, as a quirk, skip asserting Host1x's reset on Tegra186. We don't need to keep the clocks enabled, as BPMP ensures the clock stays on while Host1x is being used. On newer SoC's, the reset line is inaccessible, so there is no need for the quirk. Fixes: b7c00cdf6df5 ("gpu: host1x: Enable system suspend callbacks") Signed-off-by: Mikko Perttunen <[email protected]> Reviewed-by: Jon Hunter <[email protected]> Tested-by: Jon Hunter <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-22drm/amdgpu: Fix the runtime resume failure issueMa Jun1-0/+3
Don't set power state flag when system enter runtime suspend, or it may cause runtime resume failure issue. Fixes: 3a9626c816db ("drm/amd: Stop evicting resources on APUs in suspend") Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-02-22drm/amd/display: fix null-pointer dereference on edid readingMelissa Wen1-4/+15
Use i2c adapter when there isn't aux_mode in dc_link to fix a null-pointer derefence that happens when running igt@kms_force_connector_basic in a system with DCN2.1 and HDMI connector detected as below: [ +0.178146] BUG: kernel NULL pointer dereference, address: 00000000000004c0 [ +0.000010] #PF: supervisor read access in kernel mode [ +0.000005] #PF: error_code(0x0000) - not-present page [ +0.000004] PGD 0 P4D 0 [ +0.000006] Oops: 0000 [#1] PREEMPT SMP NOPTI [ +0.000006] CPU: 15 PID: 2368 Comm: kms_force_conne Not tainted 6.5.0-asdn+ #152 [ +0.000005] Hardware name: HP HP ENVY x360 Convertible 13-ay1xxx/8929, BIOS F.01 07/14/2021 [ +0.000004] RIP: 0010:i2c_transfer+0xd/0x100 [ +0.000011] Code: ea fc ff ff 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 41 54 55 53 <48> 8b 47 10 48 89 fb 48 83 38 00 0f 84 b3 00 00 00 83 3d 2f 80 16 [ +0.000004] RSP: 0018:ffff9c4f89c0fad0 EFLAGS: 00010246 [ +0.000005] RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000080 [ +0.000003] RDX: 0000000000000002 RSI: ffff9c4f89c0fb20 RDI: 00000000000004b0 [ +0.000003] RBP: ffff9c4f89c0fb80 R08: 0000000000000080 R09: ffff8d8e0b15b980 [ +0.000003] R10: 00000000000380e0 R11: 0000000000000000 R12: 0000000000000080 [ +0.000002] R13: 0000000000000002 R14: ffff9c4f89c0fb0e R15: ffff9c4f89c0fb0f [ +0.000004] FS: 00007f9ad2176c40(0000) GS:ffff8d90fe9c0000(0000) knlGS:0000000000000000 [ +0.000003] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000004] CR2: 00000000000004c0 CR3: 0000000121bc4000 CR4: 0000000000750ee0 [ +0.000003] PKRU: 55555554 [ +0.000003] Call Trace: [ +0.000006] <TASK> [ +0.000006] ? __die+0x23/0x70 [ +0.000011] ? page_fault_oops+0x17d/0x4c0 [ +0.000008] ? preempt_count_add+0x6e/0xa0 [ +0.000008] ? srso_alias_return_thunk+0x5/0x7f [ +0.000011] ? exc_page_fault+0x7f/0x180 [ +0.000009] ? asm_exc_page_fault+0x26/0x30 [ +0.000013] ? i2c_transfer+0xd/0x100 [ +0.000010] drm_do_probe_ddc_edid+0xc2/0x140 [drm] [ +0.000067] ? srso_alias_return_thunk+0x5/0x7f [ +0.000006] ? _drm_do_get_edid+0x97/0x3c0 [drm] [ +0.000043] ? __pfx_drm_do_probe_ddc_edid+0x10/0x10 [drm] [ +0.000042] edid_block_read+0x3b/0xd0 [drm] [ +0.000043] _drm_do_get_edid+0xb6/0x3c0 [drm] [ +0.000041] ? __pfx_drm_do_probe_ddc_edid+0x10/0x10 [drm] [ +0.000043] drm_edid_read_custom+0x37/0xd0 [drm] [ +0.000044] amdgpu_dm_connector_mode_valid+0x129/0x1d0 [amdgpu] [ +0.000153] drm_connector_mode_valid+0x3b/0x60 [drm_kms_helper] [ +0.000000] __drm_helper_update_and_validate+0xfe/0x3c0 [drm_kms_helper] [ +0.000000] ? amdgpu_dm_connector_get_modes+0xb6/0x520 [amdgpu] [ +0.000000] ? srso_alias_return_thunk+0x5/0x7f [ +0.000000] drm_helper_probe_single_connector_modes+0x2ab/0x540 [drm_kms_helper] [ +0.000000] status_store+0xb2/0x1f0 [drm] [ +0.000000] kernfs_fop_write_iter+0x136/0x1d0 [ +0.000000] vfs_write+0x24d/0x440 [ +0.000000] ksys_write+0x6f/0xf0 [ +0.000000] do_syscall_64+0x60/0xc0 [ +0.000000] ? srso_alias_return_thunk+0x5/0x7f [ +0.000000] ? syscall_exit_to_user_mode+0x2b/0x40 [ +0.000000] ? srso_alias_return_thunk+0x5/0x7f [ +0.000000] ? do_syscall_64+0x6c/0xc0 [ +0.000000] ? do_syscall_64+0x6c/0xc0 [ +0.000000] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ +0.000000] RIP: 0033:0x7f9ad46b4b00 [ +0.000000] Code: 40 00 48 8b 15 19 b3 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d e1 3a 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 [ +0.000000] RSP: 002b:00007ffcbd3bd6d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 [ +0.000000] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f9ad46b4b00 [ +0.000000] RDX: 0000000000000002 RSI: 00007f9ad48a7417 RDI: 0000000000000009 [ +0.000000] RBP: 0000000000000002 R08: 0000000000000064 R09: 0000000000000000 [ +0.000000] R10: 0000000000000000 R11: 0000000000000202 R12: 00007f9ad48a7417 [ +0.000000] R13: 0000000000000009 R14: 00007ffcbd3bd760 R15: 0000000000000001 [ +0.000000] </TASK> [ +0.000000] Modules linked in: ctr ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device cmac algif_hash algif_skcipher af_alg bnep btusb btrtl btbcm btintel btmtk bluetooth uvcvideo videobuf2_vmalloc sha3_generic videobuf2_memops uvc jitterentropy_rng videobuf2_v4l2 videodev drbg videobuf2_common ansi_cprng mc ecdh_generic ecc qrtr binfmt_misc hid_sensor_accel_3d hid_sensor_magn_3d hid_sensor_gyro_3d hid_sensor_trigger industrialio_triggered_buffer kfifo_buf industrialio snd_ctl_led joydev hid_sensor_iio_common rtw89_8852ae rtw89_8852a rtw89_pci snd_hda_codec_realtek rtw89_core snd_hda_codec_generic intel_rapl_msr ledtrig_audio intel_rapl_common snd_hda_codec_hdmi mac80211 snd_hda_intel snd_intel_dspcfg kvm_amd snd_hda_codec snd_soc_dmic snd_acp3x_rn snd_acp3x_pdm_dma libarc4 snd_hwdep snd_soc_core kvm snd_hda_core cfg80211 snd_pci_acp6x snd_pcm nls_ascii snd_timer hp_wmi snd_pci_acp5x nls_cp437 snd_rn_pci_acp3x ucsi_acpi sparse_keymap ccp snd platform_profile snd_acp_config typec_ucsi irqbypass vfat sp5100_tco [ +0.000000] snd_soc_acpi fat rapl pcspkr wmi_bmof roles rfkill rng_core snd_pci_acp3x soundcore k10temp watchdog typec battery ac amd_pmc acpi_tad button hid_sensor_hub hid_multitouch evdev serio_raw msr parport_pc ppdev lp parport fuse loop efi_pstore configfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic dm_crypt dm_mod efivarfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx libcrc32c crc32c_generic xor raid6_pq raid1 raid0 multipath linear md_mod amdgpu amdxcp i2c_algo_bit drm_ttm_helper ttm crc32_pclmul crc32c_intel drm_exec gpu_sched drm_suballoc_helper nvme ghash_clmulni_intel drm_buddy drm_display_helper sha512_ssse3 nvme_core ahci xhci_pci sha512_generic hid_generic xhci_hcd libahci rtsx_pci_sdmmc t10_pi i2c_hid_acpi drm_kms_helper i2c_hid mmc_core libata aesni_intel crc64_rocksoft_generic crypto_simd amd_sfh crc64_rocksoft scsi_mod usbcore cryptd crc_t10dif cec drm crct10dif_generic hid rtsx_pci crct10dif_pclmul scsi_common rc_core crc64 i2c_piix4 [ +0.000000] usb_common crct10dif_common video wmi [ +0.000000] CR2: 00000000000004c0 [ +0.000000] ---[ end trace 0000000000000000 ]--- Fixes: 0e859faf8670 ("drm/amd/display: Remove unwanted drm edid references") Signed-off-by: Melissa Wen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-02-22drm/amd/display: Fix memory leak in dm_sw_fini()Armin Wolf1-0/+1
After destroying dmub_srv, the memory associated with it is not freed, causing a memory leak: unreferenced object 0xffff896302b45800 (size 1024): comm "(udev-worker)", pid 222, jiffies 4294894636 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 6265fd77): [<ffffffff993495ed>] kmalloc_trace+0x29d/0x340 [<ffffffffc0ea4a94>] dm_dmub_sw_init+0xb4/0x450 [amdgpu] [<ffffffffc0ea4e55>] dm_sw_init+0x15/0x2b0 [amdgpu] [<ffffffffc0ba8557>] amdgpu_device_init+0x1417/0x24e0 [amdgpu] [<ffffffffc0bab285>] amdgpu_driver_load_kms+0x15/0x190 [amdgpu] [<ffffffffc0ba09c7>] amdgpu_pci_probe+0x187/0x4e0 [amdgpu] [<ffffffff9968fd1e>] local_pci_probe+0x3e/0x90 [<ffffffff996918a3>] pci_device_probe+0xc3/0x230 [<ffffffff99805872>] really_probe+0xe2/0x480 [<ffffffff99805c98>] __driver_probe_device+0x78/0x160 [<ffffffff99805daf>] driver_probe_device+0x1f/0x90 [<ffffffff9980601e>] __driver_attach+0xce/0x1c0 [<ffffffff99803170>] bus_for_each_dev+0x70/0xc0 [<ffffffff99804822>] bus_add_driver+0x112/0x210 [<ffffffff99807245>] driver_register+0x55/0x100 [<ffffffff990012d1>] do_one_initcall+0x41/0x300 Fix this by freeing dmub_srv after destroying it. Fixes: 743b9786b14a ("drm/amd/display: Hook up the DMUB service in DM") Signed-off-by: Armin Wolf <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-02-22drm/amd/display: fix input states translation error for dcn35 & dcn351Swapnil Patel1-1/+8
[Why] Currently there is an error while translating input clock sates into output clock states. The highest fclk setting from output sates is being dropped because of this error. [How] For dcn35 and dcn351, make output_states equal to input states. Reviewed-by: Charlene Liu <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Swapnil Patel <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-02-22drm/amd/display: Fix potential null pointer dereference in dc_dmub_srvSrinivasan Shanmugam1-2/+5
Fixes potential null pointer dereference warnings in the dc_dmub_srv_cmd_list_queue_execute() and dc_dmub_srv_is_hw_pwr_up() functions. In both functions, the 'dc_dmub_srv' variable was being dereferenced before it was checked for null. This could lead to a null pointer dereference if 'dc_dmub_srv' is null. The fix is to check if 'dc_dmub_srv' is null before dereferencing it. Thus moving the null checks for 'dc_dmub_srv' to the beginning of the functions to ensure that 'dc_dmub_srv' is not null when it is dereferenced. Found by smatch & thus fixing the below: drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:133 dc_dmub_srv_cmd_list_queue_execute() warn: variable dereferenced before check 'dc_dmub_srv' (see line 128) drivers/gpu/drm/amd/amdgpu/../display/dc/dc_dmub_srv.c:1167 dc_dmub_srv_is_hw_pwr_up() warn: variable dereferenced before check 'dc_dmub_srv' (see line 1164) Fixes: 028bac583449 ("drm/amd/display: decouple dmcub execution to reduce lock granularity") Fixes: 65138eb72e1f ("drm/amd/display: Add DCN35 DMUB") Cc: JinZe.Xu <[email protected]> Cc: Hersen Wu <[email protected]> Cc: Josip Pavic <[email protected]> Cc: Roman Li <[email protected]> Cc: Qingqing Zhuo <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Tom Chung <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Tom Chung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-02-22Merge tag 'trace-v6.8-rc5' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: - While working on the ring buffer I noticed that the counter used for knowing where the end of the data is on a sub-buffer was not a full "int" but just 20 bits. It was masked out to 0xfffff. With the new code that allows the user to change the size of the sub-buffer, it is theoretically possible to ask for a size bigger than 2^20. If that happens, unexpected results may occur as there's no code checking if the counter overflowed the 20 bits of the write mask. There are other checks to make sure events fit in the sub-buffer, but if the sub-buffer itself is too big, that is not checked. Add a check in the resize of the sub-buffer to make sure that it never goes beyond the size of the counter that holds how much data is on it. * tag 'trace-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ring-buffer: Do not let subbuf be bigger than write mask
2024-02-22drm/amd/display: Only allow dig mapping to pwrseq in new asicLewis Huang5-27/+21
[Why] The old asic only have 1 pwrseq hw. We don't need to map the diginst to pwrseq inst in old asic. [How] 1. Only mapping dig to pwrseq for new asic. 2. Move mapping function into dcn specific panel control component Cc: Stable <[email protected]> # v6.6+ Cc: Mario Limonciello <[email protected]> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3122 Reviewed-by: Anthony Koo <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Lewis Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-02-22drm/amd/display: adjust few initialization order in dmWayne Lin1-19/+18
[Why] Observe error message "Can't retrieve aconnector in hpd_rx_irq_offload_work" when boot up with a mst tbt4 dock connected. After analyzing, there are few parts needed to be adjusted: 1. hpd_rx_offload_wq[].aconnector is not initialzed before the dmub outbox hpd_irq handler get registered which causes the error message. 2. registeration of hpd and hpd_rx_irq event for usb4 dp tunneling is not aligned with legacy interface sequence [How] Put DMUB_NOTIFICATION_HPD and DMUB_NOTIFICATION_HPD_IRQ handler registration into register_hpd_handlers() to align other interfaces and get hpd_rx_offload_wq[].aconnector initialized earlier than that. Leave DMUB_NOTIFICATION_AUX_REPLY registered as it was since we need that while calling dc_link_detect(). USB4 connection status will be proactively detected by dc_link_detect_connection_type() in amdgpu_dm_initialize_drm_device() Cc: Stable <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Acked-by: Rodrigo Siqueira <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Wayne Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-02-22s390/cio: fix invalid -EBUSY on ccw_device_startPeter Oberparleiter1-3/+3
The s390 common I/O layer (CIO) returns an unexpected -EBUSY return code when drivers try to start I/O while a path-verification (PV) process is pending. This can lead to failed device initialization attempts with symptoms like broken network connectivity after boot. Fix this by replacing the -EBUSY return code with a deferred condition code 1 reply to make path-verification handling consistent from a driver's point of view. The problem can be reproduced semi-regularly using the following process, while repeating steps 2-3 as necessary (example assumes an OSA device with bus-IDs 0.0.a000-0.0.a002 on CHPID 0.02): 1. echo 0.0.a000,0.0.a001,0.0.a002 >/sys/bus/ccwgroup/drivers/qeth/group 2. echo 0 > /sys/bus/ccwgroup/devices/0.0.a000/online 3. echo 1 > /sys/bus/ccwgroup/devices/0.0.a000/online ; \ echo on > /sys/devices/css0/chp0.02/status Background information: The common I/O layer starts path-verification I/Os when it receives indications about changes in a device path's availability. This occurs for example when hardware events indicate a change in channel-path status, or when a manual operation such as a CHPID vary or configure operation is performed. If a driver attempts to start I/O while a PV is running, CIO reports a successful I/O start (ccw_device_start() return code 0). Then, after completion of PV, CIO synthesizes an interrupt response that indicates an asynchronous status condition that prevented the start of the I/O (deferred condition code 1). If a PV indication arrives while a device is busy with driver-owned I/O, PV is delayed until after I/O completion was reported to the driver's interrupt handler. To ensure that PV can be started eventually, CIO reports a device busy condition (ccw_device_start() return code -EBUSY) if a driver tries to start another I/O while PV is pending. In some cases this -EBUSY return code causes device drivers to consider a device not operational, resulting in failed device initialization. Note: The code that introduced the problem was added in 2003. Symptoms started appearing with the following CIO commit that causes a PV indication when a device is removed from the cio_ignore list after the associated parent subchannel device was probed, but before online processing of the CCW device has started: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") During boot, the cio_ignore list is modified by the cio_ignore dracut module [1] as well as Linux vendor-specific systemd service scripts[2]. When combined, this commit and boot scripts cause a frequent occurrence of the problem during boot. [1] https://github.com/dracutdevs/dracut/tree/master/modules.d/81cio_ignore [2] https://github.com/SUSE/s390-tools/blob/master/cio_ignore.service Cc: [email protected] # v5.15+ Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Tested-By: Thorsten Winkler <[email protected]> Reviewed-by: Thorsten Winkler <[email protected]> Signed-off-by: Peter Oberparleiter <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2024-02-22dmaengine: idxd: Ensure safe user copy of completion recordFenghua Yu1-3/+12
If CONFIG_HARDENED_USERCOPY is enabled, copying completion record from event log cache to user triggers a kernel bug. [ 1987.159822] usercopy: Kernel memory exposure attempt detected from SLUB object 'dsa0' (offset 74, size 31)! [ 1987.170845] ------------[ cut here ]------------ [ 1987.176086] kernel BUG at mm/usercopy.c:102! [ 1987.180946] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 1987.186866] CPU: 17 PID: 528 Comm: kworker/17:1 Not tainted 6.8.0-rc2+ #5 [ 1987.194537] Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023 [ 1987.206405] Workqueue: wq0.0 idxd_evl_fault_work [idxd] [ 1987.212338] RIP: 0010:usercopy_abort+0x72/0x90 [ 1987.217381] Code: 58 65 9c 50 48 c7 c2 17 85 61 9c 57 48 c7 c7 98 fd 6b 9c 48 0f 44 d6 48 c7 c6 b3 08 62 9c 4c 89 d1 49 0f 44 f3 e8 1e 2e d5 ff <0f> 0b 49 c7 c1 9e 42 61 9c 4c 89 cf 4d 89 c8 eb a9 66 66 2e 0f 1f [ 1987.238505] RSP: 0018:ff62f5cf20607d60 EFLAGS: 00010246 [ 1987.244423] RAX: 000000000000005f RBX: 000000000000001f RCX: 0000000000000000 [ 1987.252480] RDX: 0000000000000000 RSI: ffffffff9c61429e RDI: 00000000ffffffff [ 1987.260538] RBP: ff62f5cf20607d78 R08: ff2a6a89ef3fffe8 R09: 00000000fffeffff [ 1987.268595] R10: ff2a6a89eed00000 R11: 0000000000000003 R12: ff2a66934849c89a [ 1987.276652] R13: 0000000000000001 R14: ff2a66934849c8b9 R15: ff2a66934849c899 [ 1987.284710] FS: 0000000000000000(0000) GS:ff2a66b22fe40000(0000) knlGS:0000000000000000 [ 1987.293850] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1987.300355] CR2: 00007fe291a37000 CR3: 000000010fbd4005 CR4: 0000000000f71ef0 [ 1987.308413] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1987.316470] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 1987.324527] PKRU: 55555554 [ 1987.327622] Call Trace: [ 1987.330424] <TASK> [ 1987.332826] ? show_regs+0x6e/0x80 [ 1987.336703] ? die+0x3c/0xa0 [ 1987.339988] ? do_trap+0xd4/0xf0 [ 1987.343662] ? do_error_trap+0x75/0xa0 [ 1987.347922] ? usercopy_abort+0x72/0x90 [ 1987.352277] ? exc_invalid_op+0x57/0x80 [ 1987.356634] ? usercopy_abort+0x72/0x90 [ 1987.360988] ? asm_exc_invalid_op+0x1f/0x30 [ 1987.365734] ? usercopy_abort+0x72/0x90 [ 1987.370088] __check_heap_object+0xb7/0xd0 [ 1987.374739] __check_object_size+0x175/0x2d0 [ 1987.379588] idxd_copy_cr+0xa9/0x130 [idxd] [ 1987.384341] idxd_evl_fault_work+0x127/0x390 [idxd] [ 1987.389878] process_one_work+0x13e/0x300 [ 1987.394435] ? __pfx_worker_thread+0x10/0x10 [ 1987.399284] worker_thread+0x2f7/0x420 [ 1987.403544] ? _raw_spin_unlock_irqrestore+0x2b/0x50 [ 1987.409171] ? __pfx_worker_thread+0x10/0x10 [ 1987.414019] kthread+0x107/0x140 [ 1987.417693] ? __pfx_kthread+0x10/0x10 [ 1987.421954] ret_from_fork+0x3d/0x60 [ 1987.426019] ? __pfx_kthread+0x10/0x10 [ 1987.430281] ret_from_fork_asm+0x1b/0x30 [ 1987.434744] </TASK> The issue arises because event log cache is created using kmem_cache_create() which is not suitable for user copy. Fix the issue by creating event log cache with kmem_cache_create_usercopy(), ensuring safe user copy. Fixes: c2f156bf168f ("dmaengine: idxd: create kmem cache for event log fault items") Reported-by: Tony Zhu <[email protected]> Tested-by: Tony Zhu <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Reviewed-by: Lijun Pan <[email protected]> Reviewed-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2024-02-22selftests/iommu: fix the config fragmentMuhammad Usama Anjum1-2/+3
The config fragment doesn't follow the correct format to enable those config options which make the config options getting missed while merging with other configs. ➜ merge_config.sh -m .config tools/testing/selftests/iommu/config Using .config as base Merging tools/testing/selftests/iommu/config ➜ make olddefconfig .config:5295:warning: unexpected data: CONFIG_IOMMUFD .config:5296:warning: unexpected data: CONFIG_IOMMUFD_TEST While at it, add CONFIG_FAULT_INJECTION as well which is needed for CONFIG_IOMMUFD_TEST. If CONFIG_FAULT_INJECTION isn't present in base config (such as x86 defconfig), CONFIG_IOMMUFD_TEST doesn't get enabled. Fixes: 57f0988706fe ("iommufd: Add a selftest") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2024-02-22drm/syncobj: handle NULL fence in syncobj_eventfd_entry_funcErik Kurzinger1-1/+12
During syncobj_eventfd_entry_func, dma_fence_chain_find_seqno may set the fence to NULL if the given seqno is signaled and a later seqno has already been submitted. In that case, the eventfd should be signaled immediately which currently does not happen. This is a similar issue to the one addressed by commit b19926d4f3a6 ("drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence."). As a fix, if the return value of dma_fence_chain_find_seqno indicates success but it sets the fence to NULL, we will assign a stub fence to ensure the following code still signals the eventfd. v1 -> v2: assign a stub fence instead of signaling the eventfd Signed-off-by: Erik Kurzinger <[email protected]> Fixes: c7a472297169 ("drm/syncobj: add IOCTL to register an eventfd") Signed-off-by: Simon Ser <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-22iommu/arm-smmu-v3: Do not use GFP_KERNEL under as spinlockJason Gunthorpe1-26/+12
If the SMMU is configured to use a two level CD table then arm_smmu_write_ctx_desc() allocates a CD table leaf internally using GFP_KERNEL. Due to recent changes this is being done under a spinlock to iterate over the device list - thus it will trigger a sleeping while atomic warning: arm_smmu_sva_set_dev_pasid() mutex_lock(&sva_lock); __arm_smmu_sva_bind() arm_smmu_mmu_notifier_get() spin_lock_irqsave() arm_smmu_write_ctx_desc() arm_smmu_get_cd_ptr() arm_smmu_alloc_cd_leaf_table() dmam_alloc_coherent(GFP_KERNEL) This is a 64K high order allocation and really should not be done atomically. At the moment the rework of the SVA to follow the new API is half finished. Recently the CD table memory was moved from the domain to the master, however we have the confusing situation where the SVA code is wrongly using the RID domains device's list to track which CD tables the SVA is installed in. Remove the logic to replicate the CD across all the domain's masters during attach. We know which master and which CD table the PASID should be installed in. Right now SVA only works when dma-iommu.c is in control of the RID translation, which means we have a single iommu_domain shared across the entire group and that iommu_domain is not shared outside the group. Critically this means that the iommu_group->devices list and RID's smmu_domain->devices list describe the same set of masters. For PCI cases the core code also insists on singleton groups so there is only one entry in the smmu_domain->devices list that is equal to the master being passed in to arm_smmu_sva_set_dev_pasid(). Only non-PCI cases may have multi-device groups. However, the core code will repeat the calls to arm_smmu_sva_set_dev_pasid() across the entire iommu_group->devices list. Instead of having arm_smmu_mmu_notifier_get() indirectly loop over all the devices in the group via the RID's smmu_domain, rely on __arm_smmu_sva_bind() to be called for each device in the group and install the repeated CD entry that way. This avoids taking the spinlock to access the devices list and permits the arm_smmu_write_ctx_desc() to use a sleeping allocation. Leave the arm_smmu_mm_release() as a confusing situation, this requires tracking attached masters inside the SVA domain. Removing the loop allows arm_smmu_write_ctx_desc() to be called outside the spinlock and thus is safe to use GFP_KERNEL. Move the clearing of the CD into arm_smmu_sva_remove_dev_pasid() so that arm_smmu_mmu_notifier_get/put() remain paired functions. Fixes: 24503148c545 ("iommu/arm-smmu-v3: Refactor write_ctx_desc") Reported-by: Dan Carpenter <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Michael Shavit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-02-22spi: cadence-qspi: add system-wide suspend and resume callbacksThéo Lebrun1-2/+18
Each SPI controller is expected to call the spi_controller_suspend() and spi_controller_resume() callbacks at system-wide suspend and resume. It (1) handles the kthread worker for queued controllers and (2) marks the controller as suspended to have spi_sync() fail while the controller is unavailable. Those two operations do not require the controller to be active, we do not need to increment the runtime PM usage counter. Signed-off-by: Théo Lebrun <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-02-22spi: cadence-qspi: put runtime in runtime PM hooks namesThéo Lebrun1-4/+4
Follow kernel naming convention with regards to power-management callback function names. The convention in the kernel is: - prefix_suspend means the system-wide suspend callback; - prefix_runtime_suspend means the runtime PM suspend callback. The same applies to resume callbacks. Signed-off-by: Théo Lebrun <[email protected]> Reviewed-by: Dhruva Gole <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-02-22spi: cadence-qspi: remove system-wide suspend helper calls from runtime PM hooksThéo Lebrun1-7/+2
The ->runtime_suspend() and ->runtime_resume() callbacks are not expected to call spi_controller_suspend() and spi_controller_resume(). Remove calls to those in the cadence-qspi driver. Those helpers have two roles currently: - They stop/start the queue, including dealing with the kworker. - They toggle the SPI controller SPI_CONTROLLER_SUSPENDED flag. It requires acquiring ctlr->bus_lock_mutex. Step one is irrelevant because cadence-qspi is not queued. Step two however has two implications: - A deadlock occurs, because ->runtime_resume() is called in a context where the lock is already taken (in the ->exec_op() callback, where the usage count is incremented). - It would disallow all operations once the device is auto-suspended. Here is a brief call tree highlighting the mutex deadlock: spi_mem_exec_op() ... spi_mem_access_start() mutex_lock(&ctlr->bus_lock_mutex) cqspi_exec_mem_op() pm_runtime_resume_and_get() cqspi_resume() spi_controller_resume() mutex_lock(&ctlr->bus_lock_mutex) ... spi_mem_access_end() mutex_unlock(&ctlr->bus_lock_mutex) ... Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support") Signed-off-by: Théo Lebrun <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-02-22spi: cadence-qspi: fix pointer reference in runtime PM hooksThéo Lebrun1-4/+2
dev_get_drvdata() gets used to acquire the pointer to cqspi and the SPI controller. Neither embed the other; this lead to memory corruption. On a given platform (Mobileye EyeQ5) the memory corruption is hidden inside cqspi->f_pdata. Also, this uninitialised memory is used as a mutex (ctlr->bus_lock_mutex) by spi_controller_suspend(). Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations") Reviewed-by: Dhruva Gole <[email protected]> Signed-off-by: Théo Lebrun <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-02-22drm/syncobj: call drm_syncobj_fence_add_wait when WAIT_AVAILABLE flag is setErik Kurzinger1-2/+4
When waiting for a syncobj timeline point whose fence has not yet been submitted with the WAIT_FOR_SUBMIT flag, a callback is registered using drm_syncobj_fence_add_wait and the thread is put to sleep until the timeout expires. If the fence is submitted before then, drm_syncobj_add_point will wake up the sleeping thread immediately which will proceed to wait for the fence to be signaled. However, if the WAIT_AVAILABLE flag is used instead, drm_syncobj_fence_add_wait won't get called, meaning the waiting thread will always sleep for the full timeout duration, even if the fence gets submitted earlier. If it turns out that the fence *has* been submitted by the time it eventually wakes up, it will still indicate to userspace that the wait completed successfully (it won't return -ETIME), but it will have taken much longer than it should have. To fix this, we must call drm_syncobj_fence_add_wait if *either* the WAIT_FOR_SUBMIT flag or the WAIT_AVAILABLE flag is set. The only difference being that with WAIT_FOR_SUBMIT we will also wait for the fence to be signaled after it has been submitted while with WAIT_AVAILABLE we will return immediately. IGT test patch: https://lists.freedesktop.org/archives/igt-dev/2024-January/067537.html v1 -> v2: adjust lockdep_assert_none_held_once condition (cherry picked from commit 8c44ea81634a4a337df70a32621a5f3791be23df) Fixes: 01d6c3578379 ("drm/syncobj: add support for timeline point wait v8") Signed-off-by: Erik Kurzinger <[email protected]> Signed-off-by: Simon Ser <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Simon Ser <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-22btrfs: fix data race at btrfs_use_block_rsv() when accessing block reserveFilipe Manana2-1/+17
At btrfs_use_block_rsv() we read the size of a block reserve without locking its spinlock, which makes KCSAN complain because the size of a block reserve is always updated while holding its spinlock. The report from KCSAN is the following: [653.313148] BUG: KCSAN: data-race in btrfs_update_delayed_refs_rsv [btrfs] / btrfs_use_block_rsv [btrfs] [653.314755] read to 0x000000017f5871b8 of 8 bytes by task 7519 on cpu 0: [653.314779] btrfs_use_block_rsv+0xe4/0x2f8 [btrfs] [653.315606] btrfs_alloc_tree_block+0xdc/0x998 [btrfs] [653.316421] btrfs_force_cow_block+0x220/0xe38 [btrfs] [653.317242] btrfs_cow_block+0x1ac/0x568 [btrfs] [653.318060] btrfs_search_slot+0xda2/0x19b8 [btrfs] [653.318879] btrfs_del_csums+0x1dc/0x798 [btrfs] [653.319702] __btrfs_free_extent.isra.0+0xc24/0x2028 [btrfs] [653.320538] __btrfs_run_delayed_refs+0xd3c/0x2390 [btrfs] [653.321340] btrfs_run_delayed_refs+0xae/0x290 [btrfs] [653.322140] flush_space+0x5e4/0x718 [btrfs] [653.322958] btrfs_preempt_reclaim_metadata_space+0x102/0x2f8 [btrfs] [653.323781] process_one_work+0x3b6/0x838 [653.323800] worker_thread+0x75e/0xb10 [653.323817] kthread+0x21a/0x230 [653.323836] __ret_from_fork+0x6c/0xb8 [653.323855] ret_from_fork+0xa/0x30 [653.323887] write to 0x000000017f5871b8 of 8 bytes by task 576 on cpu 3: [653.323906] btrfs_update_delayed_refs_rsv+0x1a4/0x250 [btrfs] [653.324699] btrfs_add_delayed_data_ref+0x468/0x6d8 [btrfs] [653.325494] btrfs_free_extent+0x76/0x120 [btrfs] [653.326280] __btrfs_mod_ref+0x6a8/0x6b8 [btrfs] [653.327064] btrfs_dec_ref+0x50/0x70 [btrfs] [653.327849] walk_up_proc+0x236/0xa50 [btrfs] [653.328633] walk_up_tree+0x21c/0x448 [btrfs] [653.329418] btrfs_drop_snapshot+0x802/0x1328 [btrfs] [653.330205] btrfs_clean_one_deleted_snapshot+0x184/0x238 [btrfs] [653.330995] cleaner_kthread+0x2b0/0x2f0 [btrfs] [653.331781] kthread+0x21a/0x230 [653.331800] __ret_from_fork+0x6c/0xb8 [653.331818] ret_from_fork+0xa/0x30 So add a helper to get the size of a block reserve while holding the lock. Reading the field while holding the lock instead of using the data_race() annotation is used in order to prevent load tearing. Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-02-22btrfs: fix data races when accessing the reserved amount of block reservesFilipe Manana2-13/+29
At space_info.c we have several places where we access the ->reserved field of a block reserve without taking the block reserve's spinlock first, which makes KCSAN warn about a data race since that field is always updated while holding the spinlock. The reports from KCSAN are like the following: [117.193526] BUG: KCSAN: data-race in btrfs_block_rsv_release [btrfs] / need_preemptive_reclaim [btrfs] [117.195148] read to 0x000000017f587190 of 8 bytes by task 6303 on cpu 3: [117.195172] need_preemptive_reclaim+0x222/0x2f0 [btrfs] [117.195992] __reserve_bytes+0xbb0/0xdc8 [btrfs] [117.196807] btrfs_reserve_metadata_bytes+0x4c/0x120 [btrfs] [117.197620] btrfs_block_rsv_add+0x78/0xa8 [btrfs] [117.198434] btrfs_delayed_update_inode+0x154/0x368 [btrfs] [117.199300] btrfs_update_inode+0x108/0x1c8 [btrfs] [117.200122] btrfs_dirty_inode+0xb4/0x140 [btrfs] [117.200937] btrfs_update_time+0x8c/0xb0 [btrfs] [117.201754] touch_atime+0x16c/0x1e0 [117.201789] filemap_read+0x674/0x728 [117.201823] btrfs_file_read_iter+0xf8/0x410 [btrfs] [117.202653] vfs_read+0x2b6/0x498 [117.203454] ksys_read+0xa2/0x150 [117.203473] __s390x_sys_read+0x68/0x88 [117.203495] do_syscall+0x1c6/0x210 [117.203517] __do_syscall+0xc8/0xf0 [117.203539] system_call+0x70/0x98 [117.203579] write to 0x000000017f587190 of 8 bytes by task 11 on cpu 0: [117.203604] btrfs_block_rsv_release+0x2e8/0x578 [btrfs] [117.204432] btrfs_delayed_inode_release_metadata+0x7c/0x1d0 [btrfs] [117.205259] __btrfs_update_delayed_inode+0x37c/0x5e0 [btrfs] [117.206093] btrfs_async_run_delayed_root+0x356/0x498 [btrfs] [117.206917] btrfs_work_helper+0x160/0x7a0 [btrfs] [117.207738] process_one_work+0x3b6/0x838 [117.207768] worker_thread+0x75e/0xb10 [117.207797] kthread+0x21a/0x230 [117.207830] __ret_from_fork+0x6c/0xb8 [117.207861] ret_from_fork+0xa/0x30 So add a helper to get the reserved amount of a block reserve while holding the lock. The value may be not be up to date anymore when used by need_preemptive_reclaim() and btrfs_preempt_reclaim_metadata_space(), but that's ok since the worst it can do is cause more reclaim work do be done sooner rather than later. Reading the field while holding the lock instead of using the data_race() annotation is used in order to prevent load tearing. Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-02-22btrfs: send: don't issue unnecessary zero writes for trailing holeFilipe Manana1-4/+13
If we have a sparse file with a trailing hole (from the last extent's end to i_size) and then create an extent in the file that ends before the file's i_size, then when doing an incremental send we will issue a write full of zeroes for the range that starts immediately after the new extent ends up to i_size. While this isn't incorrect because the file ends up with exactly the same data, it unnecessarily results in using extra space at the destination with one or more extents full of zeroes instead of having a hole. In same cases this results in using megabytes or even gigabytes of unnecessary space. Example, reproducer: $ cat test.sh #!/bin/bash DEV=/dev/sdh MNT=/mnt/sdh mkfs.btrfs -f $DEV mount $DEV $MNT # Create 1G sparse file. xfs_io -f -c "truncate 1G" $MNT/foobar # Create base snapshot. btrfs subvolume snapshot -r $MNT $MNT/mysnap1 # Create send stream (full send) for the base snapshot. btrfs send -f /tmp/1.snap $MNT/mysnap1 # Now write one extent at the beginning of the file and one somewhere # in the middle, leaving a gap between the end of this second extent # and the file's size. xfs_io -c "pwrite -S 0xab 0 128K" \ -c "pwrite -S 0xcd 512M 128K" \ $MNT/foobar # Now create a second snapshot which is going to be used for an # incremental send operation. btrfs subvolume snapshot -r $MNT $MNT/mysnap2 # Create send stream (incremental send) for the second snapshot. btrfs send -p $MNT/mysnap1 -f /tmp/2.snap $MNT/mysnap2 # Now recreate the filesystem by receiving both send streams and # verify we get the same content that the original filesystem had # and file foobar has only two extents with a size of 128K each. umount $MNT mkfs.btrfs -f $DEV mount $DEV $MNT btrfs receive -f /tmp/1.snap $MNT btrfs receive -f /tmp/2.snap $MNT echo -e "\nFile fiemap in the second snapshot:" # Should have: # # 128K extent at file range [0, 128K[ # hole at file range [128K, 512M[ # 128K extent file range [512M, 512M + 128K[ # hole at file range [512M + 128K, 1G[ xfs_io -r -c "fiemap -v" $MNT/mysnap2/foobar # File should be using 256K of data (two 128K extents). echo -e "\nSpace used by the file: $(du -h $MNT/mysnap2/foobar | cut -f 1)" umount $MNT Running the test, we can see with fiemap that we get an extent for the range [512M, 1G[, while in the source filesystem we have an extent for the range [512M, 512M + 128K[ and a hole for the rest of the file (the range [512M + 128K, 1G[): $ ./test.sh (...) File fiemap in the second snapshot: /mnt/sdh/mysnap2/foobar: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..255]: 26624..26879 256 0x0 1: [256..1048575]: hole 1048320 2: [1048576..2097151]: 2156544..3205119 1048576 0x1 Space used by the file: 513M This happens because once we finish processing an inode, at finish_inode_if_needed(), we always issue a hole (write operations full of zeros) if there's a gap between the end of the last processed extent and the file's size, even if that range is already a hole in the parent snapshot. Fix this by issuing the hole only if the range is not already a hole. After this change, running the test above, we get the expected layout: $ ./test.sh (...) File fiemap in the second snapshot: /mnt/sdh/mysnap2/foobar: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..255]: 26624..26879 256 0x0 1: [256..1048575]: hole 1048320 2: [1048576..1048831]: 26880..27135 256 0x1 3: [1048832..2097151]: hole 1048320 Space used by the file: 256K A test case for fstests will follow soon. CC: [email protected] # 6.1+ Reported-by: Dorai Ashok S A <[email protected]> Link: https://lore.kernel.org/linux-btrfs/[email protected]/ Reviewed-by: Sweet Tea Dorminy <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-02-22btrfs: dev-replace: properly validate device namesDavid Sterba1-4/+20
There's a syzbot report that device name buffers passed to device replace are not properly checked for string termination which could lead to a read out of bounds in getname_kernel(). Add a helper that validates both source and target device name buffers. For devid as the source initialize the buffer to empty string in case something tries to read it later. This was originally analyzed and fixed in a different way by Edward Adam Davis (see links). Link: https://lore.kernel.org/linux-btrfs/[email protected]/ Link: https://lore.kernel.org/linux-btrfs/[email protected]/ CC: [email protected] # 4.19+ CC: Edward Adam Davis <[email protected]> Reported-and-tested-by: [email protected] Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-02-22btrfs: zoned: don't skip block group profile checks on conventional zonesJohannes Thumshirn1-0/+9
On a zoned filesystem with conventional zones, we're skipping the block group profile checks for the conventional zones. This allows converting a zoned filesystem's data block groups to RAID when all of the zones backing the chunk are on conventional zones. But this will lead to problems, once we're trying to allocate chunks backed by sequential zones. So also check for conventional zones when loading a block group's profile on them. Reported-by: HAN Yuwei <[email protected]> Link: https://lore.kernel.org/all/[email protected]/#t Reviewed-by: Boris Burkov <[email protected]> Reviewed-by: Naohiro Aota <[email protected]> Signed-off-by: Johannes Thumshirn <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-02-22drm/ttm: Fix an invalid freeing on already freed page in error pathThomas Hellström1-1/+1
If caching mode change fails due to, for example, OOM we free the allocated pages in a two-step process. First the pages for which the caching change has already succeeded. Secondly the pages for which a caching change did not succeed. However the second step was incorrectly freeing the pages already freed in the first step. Fix. Signed-off-by: Thomas Hellström <[email protected]> Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") Cc: Christian König <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Christian Koenig <[email protected]> Cc: Huang Rui <[email protected]> Cc: [email protected] Cc: <[email protected]> # v6.4+ Reviewed-by: Matthew Auld <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-22ARM: dts: renesas: rcar-gen2: Add missing #interrupt-cells to DA9063 nodesGeert Uytterhoeven8-0/+8
make dtbs_check W=2: arch/arm/boot/dts/renesas/r8a7790-lager.dts:444.11-458.5: Warning (interrupt_provider): /i2c-mux4/pmic@58: Missing '#interrupt-cells' in interrupt provider ... Fix this by adding the missing #interrupt-cells properties. Reported-by: Rob Herring <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/a351e503ea97fb1af68395843f513925ff1bdf26.1707922460.git.geert+renesas@glider.be
2024-02-22l2tp: pass correct message length to ip6_append_dataTom Parkin1-1/+1
l2tp_ip6_sendmsg needs to avoid accounting for the transport header twice when splicing more data into an already partially-occupied skbuff. To manage this, we check whether the skbuff contains data using skb_queue_empty when deciding how much data to append using ip6_append_data. However, the code which performed the calculation was incorrect: ulen = len + skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0; ...due to C operator precedence, this ends up setting ulen to transhdrlen for messages with a non-zero length, which results in corrupted packets on the wire. Add parentheses to correct the calculation in line with the original intent. Fixes: 9d4c75800f61 ("ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()") Cc: David Howells <[email protected]> Cc: [email protected] Signed-off-by: Tom Parkin <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22Revert "ACPI: EC: Use a spin lock without disabing interrupts"Rafael J. Wysocki1-46/+66
Commit eb9299beadbd ("ACPI: EC: Use a spin lock without disabing interrupts") introduced an unexpected user-visible change in behavior, which is a significant CPU load increase when the EC is in use. This most likely happens due to increased spinlock contention and so reducing this effect would require a major rework of the EC driver locking. There is no time for this in the current cycle, so revert commit eb9299beadbd. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218511 Reported-by: Dieter Mummenschanz <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-02-22Merge tag 'nf-24-02-22' of ↵Paolo Abeni3-43/+57
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) If user requests to wake up a table and hook fails, restore the dormant flag from the error path, from Florian Westphal. 2) Reset dst after transferring it to the flow object, otherwise dst gets released twice from the error path. 3) Release dst in case the flowtable selects a direct xmit path, eg. transmission to bridge port. Otherwise, dst is memleaked. 4) Register basechain and flowtable hooks at the end of the command. Error path releases these datastructure without waiting for the rcu grace period. 5) Use kzalloc() to initialize struct nft_hook to fix a KMSAN report on access to hook type, also from Florian Westphal. netfilter pull request 24-02-22 * tag 'nf-24-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: use kzalloc for hook allocation netfilter: nf_tables: register hooks last when adding new chain/flowtable netfilter: nft_flow_offload: release dst in case direct xmit path is used netfilter: nft_flow_offload: reset dst in route object after setting up flow netfilter: nf_tables: set dormant flag on hook register failure ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22Merge tag 'for-netdev' of ↵Paolo Abeni15-17/+217
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-02-22 The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 24 day(s) which contain a total of 15 files changed, 217 insertions(+), 17 deletions(-). The main changes are: 1) Fix a syzkaller-triggered oops when attempting to read the vsyscall page through bpf_probe_read_kernel and friends, from Hou Tao. 2) Fix a kernel panic due to uninitialized iter position pointer in bpf_iter_task, from Yafang Shao. 3) Fix a race between bpf_timer_cancel_and_free and bpf_timer_cancel, from Martin KaFai Lau. 4) Fix a xsk warning in skb_add_rx_frag() (under CONFIG_DEBUG_NET) due to incorrect truesize accounting, from Sebastian Andrzej Siewior. 5) Fix a NULL pointer dereference in sk_psock_verdict_data_ready, from Shigeru Yoshida. 6) Fix a resolve_btfids warning when bpf_cpumask symbol cannot be resolved, from Hari Bathini. bpf-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready() selftests/bpf: Add negtive test cases for task iter bpf: Fix an issue due to uninitialized bpf_iter_task selftests/bpf: Test racing between bpf_timer_cancel_and_free and bpf_timer_cancel bpf: Fix racing between bpf_timer_cancel_and_free and bpf_timer_cancel selftest/bpf: Test the read of vsyscall page under x86-64 x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h bpf, scripts: Correct GPL license name xsk: Add truesize to skb_add_rx_frag(). bpf: Fix warning for bpf_cpumask in verifier ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHYSiddharth Vadapalli1-1/+3
Commit bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG") extended support of the driver from the existing support for RTL8211F(D)(I)-CG PHY to the newer RTL8211F(D)(I)-VD-CG PHY. While that commit indicated that the RTL8211F_PHYCR2 register is not supported by the "VD-CG" PHY model and therefore updated the corresponding section in rtl8211f_config_init() to be invoked conditionally, the call to "genphy_soft_reset()" was left as-is, when it should have also been invoked conditionally. This is because the call to "genphy_soft_reset()" was first introduced by the commit 0a4355c2b7f8 ("net: phy: realtek: add dt property to disable CLKOUT clock") since the RTL8211F guide indicates that a PHY reset should be issued after setting bits in the PHYCR2 register. As the PHYCR2 register is not applicable to the "VD-CG" PHY model, fix the rtl8211f_config_init() function by invoking "genphy_soft_reset()" conditionally based on the presence of the "PHYCR2" register. Fixes: bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG") Signed-off-by: Siddharth Vadapalli <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22Merge branch 'ioam6-fix-write-to-cloned-skb-s'Paolo Abeni3-67/+76
Justin Iurman says: ==================== ioam6: fix write to cloned skb's Make sure the IOAM data insertion is not applied on cloned skb's. As a consequence, ioam selftests needed a refactoring. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22selftests: ioam: refactoring to align with the fixJustin Iurman2-67/+66
ioam6_parser uses a packet socket. After the fix to prevent writing to cloned skb's, the receiver does not see its IOAM data anymore, which makes input/forward ioam-selftests to fail. As a workaround, ioam6_parser now uses an IPv6 raw socket and leverages ancillary data to get hop-by-hop options. As a consequence, the hook is "after" the IOAM data insertion by the receiver and all tests are working again. Signed-off-by: Justin Iurman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22Fix write to cloned skb in ipv6_hop_ioam()Justin Iurman1-0/+10
ioam6_fill_trace_data() writes inside the skb payload without ensuring it's writeable (e.g., not cloned). This function is called both from the input and output path. The output path (ioam6_iptunnel) already does the check. This commit provides a fix for the input path, inside ipv6_hop_ioam(). It also updates ip6_parse_tlv() to refresh the network header pointer ("nh") when returning from ipv6_hop_ioam(). Fixes: 9ee11f0fff20 ("ipv6: ioam: Data plane support for Pre-allocated Trace") Reported-by: Paolo Abeni <[email protected]> Signed-off-by: Justin Iurman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22phonet/pep: fix racy skb_queue_empty() useRémi Denis-Courmont1-9/+32
The receive queues are protected by their respective spin-lock, not the socket lock. This could lead to skb_peek() unexpectedly returning NULL or a pointer to an already dequeued socket buffer. Fixes: 9641458d3ec4 ("Phonet: Pipe End Point for Phonet Pipes protocol") Signed-off-by: Rémi Denis-Courmont <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22phonet: take correct lock to peek at the RX queueRémi Denis-Courmont1-2/+2
The receive queue is protected by its embedded spin-lock, not the socket lock, so we need the former lock here (and only that one). Fixes: 107d0d9b8d9a ("Phonet: Phonet datagram transport protocol") Reported-by: Luosili <[email protected]> Signed-off-by: Rémi Denis-Courmont <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-02-22erofs: fix refcount on the metabuf used for inode lookupSandeep Dhavale1-14/+14
In erofs_find_target_block() when erofs_dirnamecmp() returns 0, we do not assign the target metabuf. This causes the caller erofs_namei()'s erofs_put_metabuf() at the end to be not effective leaving the refcount on the page. As the page from metabuf (buf->page) is never put, such page cannot be migrated or reclaimed. Fix it now by putting the metabuf from previous loop and assigning the current metabuf to target before returning so caller erofs_namei() can do the final put as it was intended. Fixes: 500edd095648 ("erofs: use meta buffers for inode lookup") Cc: <[email protected]> # 5.18+ Signed-off-by: Sandeep Dhavale <[email protected]> Reviewed-by: Gao Xiang <[email protected]> Reviewed-by: Jingbo Xu <[email protected]> Reviewed-by: Chao Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Gao Xiang <[email protected]>
2024-02-21net: sparx5: Add spinlock for frame transmission from CPUHoratiu Vultur3-0/+4
Both registers used when doing manual injection or fdma injection are shared between all the net devices of the switch. It was noticed that when having two process which each of them trying to inject frames on different ethernet ports, that the HW started to behave strange, by sending out more frames then expected. When doing fdma injection it is required to set the frame in the DCB and then make sure that the next pointer of the last DCB is invalid. But because there is no locks for this, then easily this pointer between the DCB can be broken and then it would create a loop of DCBs. And that means that the HW will continuously transmit these frames in a loop. Until the SW will break this loop. Therefore to fix this issue, add a spin lock for when accessing the registers for manual or fdma injection. Signed-off-by: Horatiu Vultur <[email protected]> Reviewed-by: Daniel Machon <[email protected]> Fixes: f3cad2611a77 ("net: sparx5: add hostmode with phylink support") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21net/sched: flower: Add lock protection when remove filter handleJianbo Liu1-1/+4
As IDR can't protect itself from the concurrent modification, place idr_remove() under the protection of tp->lock. Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier") Signed-off-by: Jianbo Liu <[email protected]> Reviewed-by: Cosmin Ratiu <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21devlink: fix port dump cmd typeJiri Pirko1-1/+1
Unlike other commands, due to a c&p error, port dump fills-up cmd with wrong value, different from port-get request cmd, port-get doit reply and port notification. Fix it by filling cmd with value DEVLINK_CMD_PORT_NEW. Skimmed through devlink userspace implementations, none of them cares about this cmd value. Only ynl, for which, this is actually a fix, as it expects doit and dumpit ops rsp_value to be the same. Omit the fixes tag, even thought this is fix, better to target this for next release. Fixes: bfcd3a466172 ("Introduce devlink infrastructure") Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21net: stmmac: Fix EST offset for dwmac 5.10Kurt Kanzenbach1-1/+1
Fix EST offset for dwmac 5.10. Currently configuring Qbv doesn't work as expected. The schedule is configured, but never confirmed: |[ 128.250219] imx-dwmac 428a0000.ethernet eth1: configured EST The reason seems to be the refactoring of the EST code which set the wrong EST offset for the dwmac 5.10. After fixing this it works as before: |[ 106.359577] imx-dwmac 428a0000.ethernet eth1: configured EST |[ 128.430715] imx-dwmac 428a0000.ethernet eth1: EST: SWOL has been switched Tested on imx93. Fixes: c3f3b97238f6 ("net: stmmac: Refactor EST implementation") Signed-off-by: Kurt Kanzenbach <[email protected]> Reviewed-by: Serge Semin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21Merge branch 'tools-ynl-fix-impossible-errors'Jakub Kicinski1-4/+15
Jakub Kicinski says: ==================== tools: ynl: fix impossible errors Fix bugs discovered while I was hacking in low level stuff in YNL and kept breaking the socket, exercising the "impossible" error paths. v1: https://lore.kernel.org/all/[email protected]/ ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21tools: ynl: don't leak mcast_groups on init errorJakub Kicinski1-1/+7
Make sure to free the already-parsed mcast_groups if we don't get an ack from the kernel when reading family info. This is part of the ynl_sock_create() error path, so we won't get a call to ynl_sock_destroy() to free them later. Fixes: 86878f14d71a ("tools: ynl: user space helpers") Acked-by: Nicolas Dichtel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21tools: ynl: make sure we always pass yarg to mnl_cb_runJakub Kicinski1-3/+8
There is one common error handler in ynl - ynl_cb_error(). It expects priv to be a pointer to struct ynl_parse_arg AKA yarg. To avoid potential crashes if we encounter a stray NLMSG_ERROR always pass yarg as priv (or a struct which has it as the first member). ynl_cb_null() has a similar problem directly - it expects yarg but priv passed by the caller is ys. Found by code inspection. Fixes: 86878f14d71a ("tools: ynl: user space helpers") Acked-by: Nicolas Dichtel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-21net: mctp: put sock on tag allocation failureJeremy Kerr1-1/+1
We may hold an extra reference on a socket if a tag allocation fails: we optimistically allocate the sk_key, and take a ref there, but do not drop if we end up not using the allocated key. Ensure we're dropping the sock on this failure by doing a proper unref rather than directly kfree()ing. Fixes: de8a6b15d965 ("net: mctp: add an explicit reference from a mctp_sk_key to sock") Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/ce9b61e44d1cdae7797be0c5e3141baf582d23a0.1707983487.git.jk@codeconstruct.com.au Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-22netfilter: nf_tables: use kzalloc for hook allocationFlorian Westphal1-1/+1
KMSAN reports unitialized variable when registering the hook, reg->hook_ops_type == NF_HOOK_OP_BPF) ~~~~~~~~~~~ undefined This is a small structure, just use kzalloc to make sure this won't happen again when new fields get added to nf_hook_ops. Fixes: 7b4b2fa37587 ("netfilter: annotate nf_tables base hook ops") Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-02-22netfilter: nf_tables: register hooks last when adding new chain/flowtablePablo Neira Ayuso1-38/+40
Register hooks last when adding chain/flowtable to ensure that packets do not walk over datastructure that is being released in the error path without waiting for the rcu grace period. Fixes: 91c7b38dc9f0 ("netfilter: nf_tables: use new transaction infrastructure to handle chain") Fixes: 3b49e2e94e6e ("netfilter: nf_tables: add flow table netlink frontend") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-02-22netfilter: nft_flow_offload: release dst in case direct xmit path is usedPablo Neira Ayuso1-0/+1
Direct xmit does not use it since it calls dev_queue_xmit() to send packets, hence it calls dst_release(). kmemleak reports: unreferenced object 0xffff88814f440900 (size 184): comm "softirq", pid 0, jiffies 4294951896 hex dump (first 32 bytes): 00 60 5b 04 81 88 ff ff 00 e6 e8 82 ff ff ff ff .`[............. 21 0b 50 82 ff ff ff ff 00 00 00 00 00 00 00 00 !.P............. backtrace (crc cb2bf5d6): [<000000003ee17107>] kmem_cache_alloc+0x286/0x340 [<0000000021a5de2c>] dst_alloc+0x43/0xb0 [<00000000f0671159>] rt_dst_alloc+0x2e/0x190 [<00000000fe5092c9>] __mkroute_output+0x244/0x980 [<000000005fb96fb0>] ip_route_output_flow+0xc0/0x160 [<0000000045367433>] nf_ip_route+0xf/0x30 [<0000000085da1d8e>] nf_route+0x2d/0x60 [<00000000d1ecd1cb>] nft_flow_route+0x171/0x6a0 [nft_flow_offload] [<00000000d9b2fb60>] nft_flow_offload_eval+0x4e8/0x700 [nft_flow_offload] [<000000009f447dbb>] expr_call_ops_eval+0x53/0x330 [nf_tables] [<00000000072e1be6>] nft_do_chain+0x17c/0x840 [nf_tables] [<00000000d0551029>] nft_do_chain_inet+0xa1/0x210 [nf_tables] [<0000000097c9d5c6>] nf_hook_slow+0x5b/0x160 [<0000000005eccab1>] ip_forward+0x8b6/0x9b0 [<00000000553a269b>] ip_rcv+0x221/0x230 [<00000000412872e5>] __netif_receive_skb_one_core+0xfe/0x110 Fixes: fa502c865666 ("netfilter: flowtable: simplify route logic") Reported-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>