aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-05-27bpf: Allow delete from sockmap/sockhash only if update is allowedJakub Sitnicki1-3/+7
We have seen an influx of syzkaller reports where a BPF program attached to a tracepoint triggers a locking rule violation by performing a map_delete on a sockmap/sockhash. We don't intend to support this artificial use scenario. Extend the existing verifier allowed-program-type check for updating sockmap/sockhash to also cover deleting from a map. From now on only BPF programs which were previously allowed to update sockmap/sockhash can delete from these map types. Fixes: ff9105993240 ("bpf, sockmap: Prevent lock inversion deadlock in map delete elem") Reported-by: Tetsuo Handa <[email protected]> Reported-by: [email protected] Signed-off-by: Jakub Sitnicki <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: [email protected] Acked-by: John Fastabend <[email protected]> Closes: https://syzkaller.appspot.com/bug?extid=ec941d6e24f633a59172 Link: https://lore.kernel.org/bpf/[email protected]
2024-05-27xfs: Add cond_resched to block unmap range and reflink remap pathRitesh Harjani (IBM)2-0/+2
An async dio write to a sparse file can generate a lot of extents and when we unlink this file (using rm), the kernel can be busy in umapping and freeing those extents as part of transaction processing. Similarly xfs reflink remapping path can also iterate over a million extent entries in xfs_reflink_remap_blocks(). Since we can busy loop in these two functions, so let's add cond_resched() to avoid softlockup messages like these. watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:82435] CPU: 1 PID: 82435 Comm: kworker/1:0 Tainted: G S L 6.9.0-rc5-0-default #1 Workqueue: xfs-inodegc/sda2 xfs_inodegc_worker NIP [c000000000beea10] xfs_extent_busy_trim+0x100/0x290 LR [c000000000bee958] xfs_extent_busy_trim+0x48/0x290 Call Trace: xfs_alloc_get_rec+0x54/0x1b0 (unreliable) xfs_alloc_compute_aligned+0x5c/0x144 xfs_alloc_ag_vextent_size+0x238/0x8d4 xfs_alloc_fix_freelist+0x540/0x694 xfs_free_extent_fix_freelist+0x84/0xe0 __xfs_free_extent+0x74/0x1ec xfs_extent_free_finish_item+0xcc/0x214 xfs_defer_finish_one+0x194/0x388 xfs_defer_finish_noroll+0x1b4/0x5c8 xfs_defer_finish+0x2c/0xc4 xfs_bunmapi_range+0xa4/0x100 xfs_itruncate_extents_flags+0x1b8/0x2f4 xfs_inactive_truncate+0xe0/0x124 xfs_inactive+0x30c/0x3e0 xfs_inodegc_worker+0x140/0x234 process_scheduled_works+0x240/0x57c worker_thread+0x198/0x468 kthread+0x138/0x140 start_kernel_thread+0x14/0x18 run fstests generic/175 at 2024-02-02 04:40:21 [ C17] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679] watchdog: BUG: soft lockup - CPU#17 stuck for 23s! [xfs_io:7679] CPU: 17 PID: 7679 Comm: xfs_io Kdump: loaded Tainted: G X 6.4.0 NIP [c008000005e3ec94] xfs_rmapbt_diff_two_keys+0x54/0xe0 [xfs] LR [c008000005e08798] xfs_btree_get_leaf_keys+0x110/0x1e0 [xfs] Call Trace: 0xc000000014107c00 (unreliable) __xfs_btree_updkeys+0x8c/0x2c0 [xfs] xfs_btree_update_keys+0x150/0x170 [xfs] xfs_btree_lshift+0x534/0x660 [xfs] xfs_btree_make_block_unfull+0x19c/0x240 [xfs] xfs_btree_insrec+0x4e4/0x630 [xfs] xfs_btree_insert+0x104/0x2d0 [xfs] xfs_rmap_insert+0xc4/0x260 [xfs] xfs_rmap_map_shared+0x228/0x630 [xfs] xfs_rmap_finish_one+0x2d4/0x350 [xfs] xfs_rmap_update_finish_item+0x44/0xc0 [xfs] xfs_defer_finish_noroll+0x2e4/0x740 [xfs] __xfs_trans_commit+0x1f4/0x400 [xfs] xfs_reflink_remap_extent+0x2d8/0x650 [xfs] xfs_reflink_remap_blocks+0x154/0x320 [xfs] xfs_file_remap_range+0x138/0x3a0 [xfs] do_clone_file_range+0x11c/0x2f0 vfs_clone_file_range+0x60/0x1c0 ioctl_file_clone+0x78/0x140 sys_ioctl+0x934/0x1270 system_call_exception+0x158/0x320 system_call_vectored_common+0x15c/0x2ec Cc: Ojaswin Mujoo <[email protected]> Signed-off-by: Ritesh Harjani (IBM) <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Tested-by: Disha Goel<[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27Merge tag 'pmdomain-v6.10-rc1' of ↵Linus Torvalds1-0/+11
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fix from Ulf Hansson: - Fix regression in gpcv2 PM domain for i.MX8 * tag 'pmdomain-v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: imx: gpcv2: Add delay after power up handshake
2024-05-27dm: make dm_set_zones_restrictions work on the queue limitsChristoph Hellwig3-12/+14
Don't stuff the values directly into the queue without any synchronization, but instead delay applying the queue limits in the caller and let dm_set_zones_restrictions work on the limit structure. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-27dm: remove dm_check_zonedChristoph Hellwig1-36/+23
Fold it into the only caller in preparation to changes in the queue limits setup. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Mike Snitzer <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-27dm: move setting zoned_enabled to dm_table_set_restrictionsChristoph Hellwig2-4/+7
Keep it together with the rest of the zoned code. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-27block: remove blk_queue_max_integrity_segmentsChristoph Hellwig1-10/+0
This is unused now that all the atomic queue limit conversions are merged. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2024-05-27Merge tag 'vfs-6.10-rc2.fixes' of ↵Linus Torvalds14-27/+68
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix io_uring based write-through after converting cifs to use the netfs library - Fix aio error handling when doing write-through via netfs library - Fix performance regression in iomap when used with non-large folio mappings - Fix signalfd error code - Remove obsolete comment in signalfd code - Fix async request indication in netfs_perform_write() by raising BDP_ASYNC when IOCB_NOWAIT is set - Yield swap device immediately to prevent spurious EBUSY errors - Don't cross a .backup mountpoint from backup volumes in afs to avoid infinite loops - Fix a race between umount and async request completion in 9p after 9p was converted to use the netfs library * tag 'vfs-6.10-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs, 9p: Fix race between umount and async request completion afs: Don't cross .backup mountpoint from backup volume swap: yield device immediately netfs: Fix setting of BDP_ASYNC from iocb flags signalfd: drop an obsolete comment signalfd: fix error return code iomap: fault in smaller chunks for non-large folio mappings filemap: add helper mapping_max_folio_size() netfs: Fix AIO error handling when doing write-through netfs: Fix io_uring based write-through
2024-05-27Documentation/core-api: correct reference to SWIOTLB_DYNAMICLukas Bulwahn1-1/+1
Commit c93f261dfc39 ("Documentation/core-api: add swiotlb documentation") accidentally refers to CONFIG_DYNAMIC_SWIOTLB in one place, while the config is actually called CONFIG_SWIOTLB_DYNAMIC. Correct the reference to the intended config option. Signed-off-by: Lukas Bulwahn <[email protected]> Reviewed-by: Petr Tesarik <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2024-05-27dma-buf: handle testing kthreads creation failureFedor Pchelkin1-0/+6
kthread creation may possibly fail inside race_signal_callback(). In such a case stop the already started threads, put the already taken references to them and return with error code. Found by Linux Verification Center (linuxtesting.org). Fixes: 2989f6451084 ("dma-buf: Add selftests for dma-fence") Cc: [email protected] Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: T.J. Mercier <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Christian König <[email protected]>
2024-05-27MAINTAINERS: Remove James Schulman from Cirrus audio maintainersCharles Keepax1-1/+0
James no longer works for Cirrus Logic, remove him from the list of maintainers for the Cirrus audio CODEC drivers. Signed-off-by: Charles Keepax <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-05-27ASoC: wm_adsp: Add missing MODULE_DESCRIPTION()Charles Keepax1-0/+1
wm_adsp is built as a separate module and as such should include a MODULE_DESCRIPTION() macro. Signed-off-by: Charles Keepax <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-05-27ASoC: cs42l43: Only restrict 44.1kHz for the ASPCharles Keepax1-2/+3
The SoundWire interface can always support 44.1kHz using flow controlled mode, and whether the ASP is in master mode should obviously only affect the ASP. Update cs42l43_startup() to only restrict the rates for the ASP DAI. Fixes: fc918cbe874e ("ASoC: cs42l43: Add support for the cs42l43") Signed-off-by: Charles Keepax <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-05-27tracing/probes: fix error check in parse_btf_field()Carlos López1-0/+4
btf_find_struct_member() might return NULL or an error via the ERR_PTR() macro. However, its caller in parse_btf_field() only checks for the NULL condition. Fix this by using IS_ERR() and returning the error up the stack. Link: https://lore.kernel.org/all/[email protected]/ Fixes: c440adfbe3025 ("tracing/probes: Support BTF based data structure field access") Signed-off-by: Carlos López <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
2024-05-27netfs, 9p: Fix race between umount and async request completionDavid Howells5-0/+26
There's a problem in 9p's interaction with netfslib whereby a crash occurs because the 9p_fid structs get forcibly destroyed during client teardown (without paying attention to their refcounts) before netfslib has finished with them. However, it's not a simple case of deferring the clunking that p9_fid_put() does as that requires the p9_client record to still be present. The problem is that netfslib has to unlock pages and clear the IN_PROGRESS flag before destroying the objects involved - including the fid - and, in any case, nothing checks to see if writeback completed barring looking at the page flags. Fix this by keeping a count of outstanding I/O requests (of any type) and waiting for it to quiesce during inode eviction. Reported-by: [email protected] Link: https://lore.kernel.org/all/[email protected]/ Reported-by: [email protected] Link: https://lore.kernel.org/all/[email protected]/ Reported-by: [email protected] Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: David Howells <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: [email protected] Reviewed-by: Dominique Martinet <[email protected]> cc: Eric Van Hensbergen <[email protected]> cc: Latchesar Ionkov <[email protected]> cc: Christian Schoenebeck <[email protected]> cc: Jeff Layton <[email protected]> cc: Steve French <[email protected]> cc: Hillf Danton <[email protected]> cc: [email protected] cc: [email protected] cc: [email protected] cc: [email protected] cc: [email protected] Reported-and-tested-by: [email protected] Signed-off-by: Christian Brauner <[email protected]>
2024-05-27drm/nouveau/nvif: Avoid build error due to potential integer overflowsGuenter Roeck1-6/+18
Trying to build parisc:allmodconfig with gcc 12.x or later results in the following build error. drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd': drivers/gpu/drm/nouveau/nvif/object.c:161:9: error: 'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict] 161 | memcpy(data, args->mthd.data, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor': drivers/gpu/drm/nouveau/nvif/object.c:298:17: error: 'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict] 298 | memcpy(data, args->new.data, size); gcc assumes that 'sizeof(*args) + size' can overflow, which would result in the problem. The problem is not new, only it is now no longer a warning but an error since W=1 has been enabled for the drm subsystem and since Werror is enabled for test builds. Rearrange arithmetic and use check_add_overflow() for validating the allocation size to avoid the overflow. While at it, split assignments out of if conditions. Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the subsystem") Cc: Javier Martinez Canillas <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Danilo Krummrich <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Kees Cook <[email protected]> Cc: Christophe JAILLET <[email protected]> Cc: Joe Perches <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Danilo Krummrich <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-05-27net: usb: smsc95xx: fix changing LED_SEL bit value updated from EEPROMParthiban Veerasooran1-4/+7
LED Select (LED_SEL) bit in the LED General Purpose IO Configuration register is used to determine the functionality of external LED pins (Speed Indicator, Link and Activity Indicator, Full Duplex Link Indicator). The default value for this bit is 0 when no EEPROM is present. If a EEPROM is present, the default value is the value of the LED Select bit in the Configuration Flags of the EEPROM. A USB Reset or Lite Reset (LRST) will cause this bit to be restored to the image value last loaded from EEPROM, or to be set to 0 if no EEPROM is present. While configuring the dual purpose GPIO/LED pins to LED outputs in the LED General Purpose IO Configuration register, the LED_SEL bit is changed as 0 and resulting the configured value from the EEPROM is cleared. The issue is fixed by using read-modify-write approach. Fixes: f293501c61c5 ("smsc95xx: configure LED outputs") Signed-off-by: Parthiban Veerasooran <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Woojung Huh <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-27xfs: don't open-code u64_to_user_ptrDarrick J. Wong2-7/+2
Don't open-code what the kernel already provides. Signed-off-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27xfs: allow symlinks with short remote targetsDarrick J. Wong1-4/+24
An internal user complained about log recovery failing on a symlink ("Bad dinode after recovery") with the following (excerpted) format: core.magic = 0x494e core.mode = 0120777 core.version = 3 core.format = 2 (extents) core.nlinkv2 = 1 core.nextents = 1 core.size = 297 core.nblocks = 1 core.naextents = 0 core.forkoff = 0 core.aformat = 2 (extents) u3.bmx[0] = [startoff,startblock,blockcount,extentflag] 0:[0,12,1,0] This is a symbolic link with a 297-byte target stored in a disk block, which is to say this is a symlink with a remote target. The forkoff is 0, which is to say that there's 512 - 176 == 336 bytes in the inode core to store the data fork. Eventually, testing of generic/388 failed with the same inode corruption message during inode recovery. In writing a debugging patch to call xfs_dinode_verify on dirty inode log items when we're committing transactions, I observed that xfs/298 can reproduce the problem quite quickly. xfs/298 creates a symbolic link, adds some extended attributes, then deletes them all. The test failure occurs when the final removexattr also deletes the attr fork because that does not convert the remote symlink back into a shortform symlink. That is how we trip this test. The only reason why xfs/298 only triggers with the debug patch added is that it deletes the symlink, so the final iflush shows the inode as free. I wrote a quick fstest to emulate the behavior of xfs/298, except that it leaves the symlinks on the filesystem after inducing the "corrupt" state. Kernels going back at least as far as 4.18 have written out symlink inodes in this manner and prior to 1eb70f54c445f they did not object to reading them back in. Because we've been writing out inodes this way for quite some time, the only way to fix this is to relax the check for symbolic links. Directories don't have this problem because di_size is bumped to blocksize during the sf->data conversion. Fixes: 1eb70f54c445f ("xfs: validate inode fork size against fork format") Signed-off-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27xfs: fix xfs_init_attr_trans not handling explicit operation codesDarrick J. Wong3-25/+35
When we were converting the attr code to use an explicit operation code instead of keying off of attr->value being null, we forgot to change the code that initializes the transaction reservation. Split the function into two helpers that handle the !remove and remove cases, then fix both callsites to handle this correctly. Fixes: c27411d4c640 ("xfs: make attr removal an explicit operation") Signed-off-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27xfs: drop xfarray sortinfo folio on errorDarrick J. Wong1-3/+6
Chandan Babu reports the following livelock in xfs/708: run fstests xfs/708 at 2024-05-04 15:35:29 XFS (loop16): EXPERIMENTAL online scrub feature in use. Use at your own risk! XFS (loop5): Mounting V5 Filesystem e96086f0-a2f9-4424-a1d5-c75d53d823be XFS (loop5): Ending clean mount XFS (loop5): Quotacheck needed: Please wait. XFS (loop5): Quotacheck: Done. XFS (loop5): EXPERIMENTAL online scrub feature in use. Use at your own risk! INFO: task xfs_io:143725 blocked for more than 122 seconds. Not tainted 6.9.0-rc4+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:xfs_io state:D stack:0 pid:143725 tgid:143725 ppid:117661 flags:0x00004006 Call Trace: <TASK> __schedule+0x69c/0x17a0 schedule+0x74/0x1b0 io_schedule+0xc4/0x140 folio_wait_bit_common+0x254/0x650 shmem_undo_range+0x9d5/0xb40 shmem_evict_inode+0x322/0x8f0 evict+0x24e/0x560 __dentry_kill+0x17d/0x4d0 dput+0x263/0x430 __fput+0x2fc/0xaa0 task_work_run+0x132/0x210 get_signal+0x1a8/0x1910 arch_do_signal_or_restart+0x7b/0x2f0 syscall_exit_to_user_mode+0x1c2/0x200 do_syscall_64+0x72/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e The shmem code is trying to drop all the folios attached to a shmem file and gets stuck on a locked folio after a bnobt repair. It looks like the process has a signal pending, so I started looking for places where we lock an xfile folio and then deal with a fatal signal. I found a bug in xfarray_sort_scan via code inspection. This function is called to set up the scanning phase of a quicksort operation, which may involve grabbing a locked xfile folio. If we exit the function with an error code, the caller does not call xfarray_sort_scan_done to put the xfile folio. If _sort_scan returns an error code while si->folio is set, we leak the reference and never unlock the folio. Therefore, change xfarray_sort to call _scan_done on exit. This is safe to call multiple times because it sets si->folio to NULL and ignores a NULL si->folio. Also change _sort_scan to use an intermediate variable so that we never pollute si->folio with an errptr. Fixes: 232ea052775f9 ("xfs: enable sorting of xfile-backed arrays") Reported-by: Chandan Babu R <[email protected]> Signed-off-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27xfs: Stop using __maybe_unused in xfs_alloc.cJohn Garry1-4/+2
In both xfs_alloc_cur_finish() and xfs_alloc_ag_vextent_exact(), local variable @afg is tagged as __maybe_unused. Otherwise an unused variable warning would be generated for when building with W=1 and CONFIG_XFS_DEBUG unset. In both cases, the variable is unused as it is only referenced in an ASSERT() call, which is compiled out (in this config). It is generally a poor programming style to use __maybe_unused for variables. The ASSERT() call is to verify that agbno of the end of the extent is within bounds for both functions. @afg is used as an intermediate variable to find the AG length. However xfs_verify_agbext() already exists to verify a valid extent range. The arguments for calling xfs_verify_agbext() are already available, so use that instead. An advantage of using xfs_verify_agbext() is that it verifies that both the start and the end of the extent are within the bounds of the AG and catches overflows. Suggested-by: Dave Chinner <[email protected]> Signed-off-by: John Garry <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27xfs: Clear W=1 warning in xfs_iwalk_run_callbacks()John Garry1-3/+2
For CONFIG_XFS_DEBUG unset, xfs_iwalk_run_callbacks() generates the following warning for when building with W=1: fs/xfs/xfs_iwalk.c: In function ‘xfs_iwalk_run_callbacks’: fs/xfs/xfs_iwalk.c:354:42: error: variable ‘irec’ set but not used [-Werror=unused-but-set-variable] 354 | struct xfs_inobt_rec_incore *irec; | ^~~~ cc1: all warnings being treated as errors Drop @irec, as it is only an intermediate variable. Suggested-by: Christoph Hellwig <[email protected]> Signed-off-by: John Garry <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-05-27Octeontx2-pf: Free send queue buffers incase of leaf to innerHariprasad Kelam1-0/+4
There are two type of classes. "Leaf classes" that are the bottom of the class hierarchy. "Inner classes" that are neither the root class nor leaf classes. QoS rules can only specify leaf classes as targets for traffic. Root / \ / \ 1 2 /\ / \ 4 5 classes 1,4 and 5 are leaf classes. class 2 is a inner class. When a leaf class made as inner, or vice versa, resources associated with send queue (send queue buffers and transmit schedulers) are not getting freed. Fixes: 5e6808b4c68d ("octeontx2-pf: Add support for HTB offload") Signed-off-by: Hariprasad Kelam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-27af_unix: Read sk->sk_hash under bindlock during bind().Kuniyuki Iwashima1-3/+6
syzkaller reported data-race of sk->sk_hash in unix_autobind() [0], and the same ones exist in unix_bind_bsd() and unix_bind_abstract(). The three bind() functions prefetch sk->sk_hash locklessly and use it later after validating that unix_sk(sk)->addr is NULL under unix_sk(sk)->bindlock. The prefetched sk->sk_hash is the hash value of unbound socket set in unix_create1() and does not change until bind() completes. There could be a chance that sk->sk_hash changes after the lockless read. However, in such a case, non-NULL unix_sk(sk)->addr is visible under unix_sk(sk)->bindlock, and bind() returns -EINVAL without using the prefetched value. The KCSAN splat is false-positive, but let's silence it by reading sk->sk_hash under unix_sk(sk)->bindlock. [0]: BUG: KCSAN: data-race in unix_autobind / unix_autobind write to 0xffff888034a9fb88 of 4 bytes by task 4468 on cpu 0: __unix_set_addr_hash net/unix/af_unix.c:331 [inline] unix_autobind+0x47a/0x7d0 net/unix/af_unix.c:1185 unix_dgram_connect+0x7e3/0x890 net/unix/af_unix.c:1373 __sys_connect_file+0xd7/0xe0 net/socket.c:2048 __sys_connect+0x114/0x140 net/socket.c:2065 __do_sys_connect net/socket.c:2075 [inline] __se_sys_connect net/socket.c:2072 [inline] __x64_sys_connect+0x40/0x50 net/socket.c:2072 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e read to 0xffff888034a9fb88 of 4 bytes by task 4465 on cpu 1: unix_autobind+0x28/0x7d0 net/unix/af_unix.c:1134 unix_dgram_connect+0x7e3/0x890 net/unix/af_unix.c:1373 __sys_connect_file+0xd7/0xe0 net/socket.c:2048 __sys_connect+0x114/0x140 net/socket.c:2065 __do_sys_connect net/socket.c:2075 [inline] __se_sys_connect net/socket.c:2072 [inline] __x64_sys_connect+0x40/0x50 net/socket.c:2072 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e value changed: 0x000000e4 -> 0x000001e3 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 4465 Comm: syz-executor.0 Not tainted 6.8.0-12822-gcd51db110a7e #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: afd20b9290e1 ("af_unix: Replace the big lock with small locks.") Reported-by: syzkaller <[email protected]> Signed-off-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-27af_unix: Annotate data-race around unix_sk(sk)->addr.Kuniyuki Iwashima1-4/+6
Once unix_sk(sk)->addr is assigned under net->unx.table.locks and unix_sk(sk)->bindlock, *(unix_sk(sk)->addr) and unix_sk(sk)->path are fully set up, and unix_sk(sk)->addr is never changed. unix_getname() and unix_copy_addr() access the two fields locklessly, and commit ae3b564179bf ("missing barriers in some of unix_sock ->addr and ->path accesses") added smp_store_release() and smp_load_acquire() pairs. In other functions, we still read unix_sk(sk)->addr locklessly to check if the socket is bound, and KCSAN complains about it. [0] Given these functions have no dependency for *(unix_sk(sk)->addr) and unix_sk(sk)->path, READ_ONCE() is enough to annotate the data-race. Note that it is safe to access unix_sk(sk)->addr locklessly if the socket is found in the hash table. For example, the lockless read of otheru->addr in unix_stream_connect() is safe. Note also that newu->addr there is of the child socket that is still not accessible from userspace, and smp_store_release() publishes the address in case the socket is accept()ed and unix_getname() / unix_copy_addr() is called. [0]: BUG: KCSAN: data-race in unix_bind / unix_listen write (marked) to 0xffff88805f8d1840 of 8 bytes by task 13723 on cpu 0: __unix_set_addr_hash net/unix/af_unix.c:329 [inline] unix_bind_bsd net/unix/af_unix.c:1241 [inline] unix_bind+0x881/0x1000 net/unix/af_unix.c:1319 __sys_bind+0x194/0x1e0 net/socket.c:1847 __do_sys_bind net/socket.c:1858 [inline] __se_sys_bind net/socket.c:1856 [inline] __x64_sys_bind+0x40/0x50 net/socket.c:1856 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e read to 0xffff88805f8d1840 of 8 bytes by task 13724 on cpu 1: unix_listen+0x72/0x180 net/unix/af_unix.c:734 __sys_listen+0xdc/0x160 net/socket.c:1881 __do_sys_listen net/socket.c:1890 [inline] __se_sys_listen net/socket.c:1888 [inline] __x64_sys_listen+0x2e/0x40 net/socket.c:1888 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4f/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x46/0x4e value changed: 0x0000000000000000 -> 0xffff88807b5b1b40 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 13724 Comm: syz-executor.4 Not tainted 6.8.0-12822-gcd51db110a7e #12 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzkaller <[email protected]> Signed-off-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-27selftests: hsr: Fix "File exists" errors for hsr_pingGeliang Tang1-0/+2
The hsr_ping test reports the following errors: INFO: preparing interfaces for HSRv0. INFO: Initial validation ping. INFO: Longer ping test. INFO: Cutting one link. INFO: Delay the link and drop a few packages. INFO: All good. INFO: preparing interfaces for HSRv1. RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists RTNETLINK answers: File exists Error: ipv4: Address already assigned. Error: ipv6: address already assigned. Error: ipv4: Address already assigned. Error: ipv6: address already assigned. Error: ipv4: Address already assigned. Error: ipv6: address already assigned. INFO: Initial validation ping. That is because the cleanup code for the 2nd round test before "setup_hsr_interfaces 1" is removed incorrectly in commit 680fda4f6714 ("test: hsr: Remove script code already implemented in lib.sh"). This patch fixes it by re-setup the namespaces using setup_ns ns1 ns2 ns3 command before "setup_hsr_interfaces 1". It deletes previous namespaces and create new ones. Fixes: 680fda4f6714 ("test: hsr: Remove script code already implemented in lib.sh") Reviewed-by: Hangbin Liu <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/6485d3005f467758d49f0f313c8c009759ba6b05.1716374462.git.tanggeliang@kylinos.cn Signed-off-by: Paolo Abeni <[email protected]>
2024-05-27platform/x86: touchscreen_dmi: Add info for the EZpad 6s Prohmtheboy1541-0/+11
The "EZpad 6s Pro" uses the same touchscreen as the "EZpad 6 Pro B", unlike the "Ezpad 6 Pro" which has its own touchscreen. Signed-off-by: hmtheboy154 <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-27platform/x86: touchscreen_dmi: Add info for GlobalSpace SolT IVW 11.6" tablethmtheboy1541-0/+25
This is a tablet created by GlobalSpace Technologies Limited which uses an Intel Atom x5-Z8300, 4GB of RAM & 64GB of storage. Link: https://web.archive.org/web/20171102141952/http://globalspace.in/11.6-device.html Signed-off-by: hmtheboy154 <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-27platform/x86: touchscreen_dmi: Add support for setting touchscreen ↵Hans de Goede2-3/+98
properties from cmdline On x86/ACPI platforms touchscreens mostly just work without needing any device/model specific configuration. But in some cases (mostly with Silead and Goodix touchscreens) it is still necessary to manually specify various touchscreen-properties on a per model basis. touchscreen_dmi is a special place for DMI quirks for this, but it can be challenging for users to figure out the right property values, especially for Silead touchscreens where non of these can be read back from the touchscreen-controller. ATM users can only test touchscreen properties by editing touchscreen_dmi.c and then building a completely new kernel which makes it unnecessary difficult for users to test and submit properties when necessary for their laptop / tablet model. Add support for specifying properties on the kernel commandline to allow users to easily figure out the right settings. See the added documentation in kernel-parameters.txt for the commandline syntax. Cc: Gregor Riepl <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-27platform/x86: thinkpad_acpi: Select INPUT_SPARSEKMAP in KconfigSteven Rostedt (Google)1-0/+1
Now that drivers/platform/x86/thinkpad_acpi.c uses sparse_keymap_report_event(), it must select INPUT_SPARSEKMAP in its Kconfig option otherwise the build fails with: ld: vmlinux.o: in function `tpacpi_input_send_key': thinkpad_acpi.c:(.text+0xd4d27f): undefined reference to `sparse_keymap_report_event' ld: vmlinux.o: in function `hotkey_init': thinkpad_acpi.c:(.init.text+0x66cb6): undefined reference to `sparse_keymap_setup' Fixes: 42f7b965de9d ("platform/x86: thinkpad_acpi: Switch to using sparse-keymap helpers") Signed-off-by: Steven Rostedt (Google) <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2024-05-27platform/x86: x86-android-tablets: Add "select LEDS_CLASS"Hans de Goede1-0/+2
Since the x86-android-tablets now calls devm_led_classdev_register_ext() it needs to select LEDS_CLASS as well as LEDS_CLASS' NEW_LEDS dependency. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-27platform/x86: ISST: fix use-after-free in tpmi_sst_dev_remove()Harshit Mogalapalli1-1/+1
In tpmi_sst_dev_remove(), tpmi_sst is dereferenced after being freed. Fix this by reordering the kfree() post the dereference. Fixes: 9d1d36268f3d ("platform/x86: ISST: Support partitioned systems") Signed-off-by: Harshit Mogalapalli <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Acked-by: Srinivas Pandruvada <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2024-05-27enic: Validate length of nl attributes in enic_set_vf_portRoded Zats1-0/+12
enic_set_vf_port assumes that the nl attribute IFLA_PORT_PROFILE is of length PORT_PROFILE_MAX and that the nl attributes IFLA_PORT_INSTANCE_UUID, IFLA_PORT_HOST_UUID are of length PORT_UUID_MAX. These attributes are validated (in the function do_setlink in rtnetlink.c) using the nla_policy ifla_port_policy. The policy defines IFLA_PORT_PROFILE as NLA_STRING, IFLA_PORT_INSTANCE_UUID as NLA_BINARY and IFLA_PORT_HOST_UUID as NLA_STRING. That means that the length validation using the policy is for the max size of the attributes and not on exact size so the length of these attributes might be less than the sizes that enic_set_vf_port expects. This might cause an out of bands read access in the memcpys of the data of these attributes in enic_set_vf_port. Fixes: f8bd909183ac ("net: Add ndo_{set|get}_vf_port support for enic dynamic vnics") Signed-off-by: Roded Zats <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-27ata: ahci: Do not apply Intel PCS quirk on Intel Alder LakeJason Nader1-1/+0
Commit b8b8b4e0c052 ("ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list") added Intel Alder Lake to the ahci_pci_tbl. Because of the way that the Intel PCS quirk was implemented, having an explicit entry in the ahci_pci_tbl caused the Intel PCS quirk to be applied. (The quirk was not being applied if there was no explict entry.) Thus, entries that were added to the ahci_pci_tbl also got the Intel PCS quirk applied. The quirk was cleaned up in commit 7edbb6059274 ("ahci: clean up intel_pcs_quirk"), such that it is clear which entries that actually applies the Intel PCS quirk. Newer Intel AHCI controllers do not need the Intel PCS quirk, and applying it when not needed actually breaks some platforms. Do not apply the Intel PCS quirk for Intel Alder Lake. This is in line with how things worked before commit b8b8b4e0c052 ("ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list"), such that certain platforms using Intel Alder Lake will work once again. Cc: [email protected] # 6.7 Fixes: b8b8b4e0c052 ("ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list") Signed-off-by: Jason Nader <[email protected]> Signed-off-by: Niklas Cassel <[email protected]>
2024-05-27ALSA: hda/realtek: Adjust G814JZR to use SPI init for ampLuke D. Jones1-1/+1
The 2024 ASUS ROG G814J model is much the same as the 2023 model and the 2023 16" version. We can use the same Cirrus Amp quirk. Fixes: 811dd426a9b1 ("ALSA: hda/realtek: Add quirks for Asus ROG 2024 laptops using CS35L41") Signed-off-by: Luke D. Jones <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-05-27ALSA: core: Remove debugfs at disconnectionTakashi Iwai2-11/+19
The card-specific debugfs entries are removed at the last stage of card free phase, and it's performed after synchronization of the closes of all opened fds. This works fine for most cases, but it can be potentially problematic for a hotplug device like USB-audio. Due to the nature of snd_card_free_when_closed(), the card free isn't called immediately after the driver removal for a hotplug device, but it's left until the last fd is closed. It implies that the card debugfs entries also remain. Meanwhile, when a new device is inserted before the last close and the very same card slot is assigned, the driver tries to create the card debugfs root again on the very same path. This conflicts with the remaining entry, and results in the kernel warning such as: debugfs: Directory 'card0' with parent 'sound' already present! with the missing debugfs entry afterwards. For avoiding such conflicts, remove debugfs entries at the device disconnection phase instead. The jack kctl debugfs entries get removed in snd_jack_dev_disconnect() instead of each kctl private_free. Fixes: 2d670ea2bd53 ("ALSA: jack: implement software jack injection via debugfs") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-05-27fs: smb: common: add missing MODULE_DESCRIPTION() macrosJeff Johnson2-0/+2
Fix the 'make W=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/smb/common/cifs_arc4.o WARNING: modpost: missing MODULE_DESCRIPTION() in fs/smb/common/cifs_md4.o Signed-off-by: Jeff Johnson <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-05-27Merge tag 'drm-misc-fixes-2024-05-23' of ↵Dave Airlie4-25/+28
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: buddy: - stop using PAGE_SIZE shmem-helper: - avoid kernel panic in mmap() tests: - buddy: fix PAGE_SIZE dependency Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-05-26bcachefs: Use copy_folio_from_iter_atomic()Matthew Wilcox (Oracle)1-3/+3
copy_page_from_iter_atomic() will be removed at some point. Also fixup a comment for folios. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2024-05-27regmap-i2c: Subtract reg size from max_writeJim Wylder1-1/+2
Currently, when an adapter defines a max_write_len quirk, the data will be chunked into data sizes equal to the max_write_len quirk value. But the payload will be increased by the size of the register address before transmission. The resulting value always ends up larger than the limit set by the quirk. Avoid this error by setting regmap's max_write to the quirk's max_write_len minus the number of bytes for the register and padding. This allows the chunking to work correctly for this limited case without impacting other use-cases. Signed-off-by: Jim Wylder <[email protected]> Link: https://msgid.link/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-05-27firewire: add missing MODULE_DESCRIPTION() to test modulesJeff Johnson2-0/+2
Fix the 'make W=1' warnings: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/firewire/uapi-test.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/firewire/packet-serdes-test.o Signed-off-by: Jeff Johnson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Sakamoto <[email protected]>
2024-05-26Linux 6.10-rc1Linus Torvalds1-3/+3
2024-05-26mm: percpu: Include smp.h in alloc_tag.hKent Overstreet1-0/+1
percpu.h depends on smp.h, but doesn't include it directly because of circular header dependency issues; percpu.h is needed in a bunch of low level headers. This fixes a randconfig build error on mips: include/linux/alloc_tag.h: In function '__alloc_tag_ref_set': include/asm-generic/percpu.h:31:40: error: implicit declaration of function 'raw_smp_processor_id' [-Werror=implicit-function-declaration] Reported-by: kernel test robot <[email protected]> Fixes: 24e44cc22aa3 ("mm: percpu: enable per-cpu allocation tagging") Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Kent Overstreet <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2024-05-26Merge tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of ↵Linus Torvalds4-103/+68
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tool fix from Arnaldo Carvalho de Melo: "Revert a patch causing a regression. This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels". * tag 'perf-tools-fixes-for-v6.10-1-2024-05-26' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"
2024-05-26bcachefs: Fix sb-downgrade validationKent Overstreet1-3/+10
Superblock downgrade entries are only two byte aligned, but section sizes are 8 byte aligned, which means we have to be careful about overrun checks; an entry that crosses the end of the section is allowed (and ignored) as long as it has zero errors. Signed-off-by: Kent Overstreet <[email protected]>
2024-05-26bcachefs: Fix debug assertKent Overstreet1-1/+1
Reported-by: [email protected] Signed-off-by: Kent Overstreet <[email protected]>
2024-05-26Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"Arnaldo Carvalho de Melo4-103/+68
This reverts commit 617824a7f0f73e4de325cf8add58e55b28c12493. This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels, as discussed at length in the threads in the Link tags below. The fix provided by Ian wasn't acceptable and work to fix this will take time we don't have at this point, so lets revert this and work on it on the next devel cycle. Reported-by: Linus Torvalds <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Bhaskar Chowdhury <[email protected]> Cc: Ethan Adams <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tycho Andersen <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/lkml/CAHk-=wi5Ri=yR2jBVk-4HzTzpoAWOgstr1LEvg_-OXtJvXXJOA@mail.gmail.com Link: https://lore.kernel.org/lkml/CAHk-=wiWvtFyedDNpoV7a8Fq_FpbB+F5KmWK2xPY3QoYseOf_A@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-05-26hwrng: core - Remove add_early_randomnessHerbert Xu1-43/+4
A potential deadlock was reported with the config file at https://web.archive.org/web/20240522052129/https://0x0.st/XPN_.txt In this particular configuration, the deadlock doesn't exist because the warning triggered at a point before modules were even available. However, the deadlock can be real because any module loaded would invoke async_synchronize_full. The issue is spurious for software crypto algorithms which aren't themselves involved in async probing. However, it would be hard to avoid for a PCI crypto driver using async probing. In this particular call trace, the problem is easily avoided because the only reason the module is being requested during probing is the add_early_randomness call in the hwrng core. This feature is vestigial since there is now a kernel thread dedicated to doing exactly this. So remove add_early_randomness as it is no longer needed. Reported-by: Nícolas F. R. A. Prado <[email protected]> Reported-by: Eric Biggers <[email protected]> Fixes: 1b6d7f9eb150 ("tpm: add session encryption protection to tpm2_get_random()") Link: https://lore.kernel.org/r/119dc5ed-f159-41be-9dda-1a056f29888d@notapiano/ Signed-off-by: Herbert Xu <[email protected]> Reviewed-by: Jarkko Sakkinen <[email protected]> Tested-by: Nícolas F. R. A. Prado <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2024-05-25Merge tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds6-6/+34
Pull smb client fixes from Steve French: - two important netfs integration fixes - including for a data corruption and also fixes for multiple xfstests - reenable swap support over SMB3 * tag '6.10-rc-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix missing set of remote_i_size cifs: Fix smb3_insert_range() to move the zero_point cifs: update internal version number smb3: reenable swapfiles over SMB3 mounts