aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-04-28fs/9p: Fix bit operation logic errorEric Van Hensbergen2-2/+2
This re-introduces a fix that somehow got dropped during rebase of the current series in for-next. When writeback is enabled, opens are forced to support both read and write operations but with the logic error other flags may be dropped unintentionaly. Reported-by: Christophe Jaillet <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2023-04-28ext4: clean up error handling in __ext4_fill_super()Theodore Ts'o1-22/+29
There were two ways to return an error code; one was via setting the 'err' variable, and the second, if err was zero, was via the 'ret' variable. This was both confusing and fragile, and when code was factored out of __ext4_fill_super(), some of the error codes returned by the original code was replaced by -EINVAL, and in one case, the error code was placed by 0, triggering a kernel null pointer dereference. Clean this up by removing the 'ret' variable, leaving only one way to set the error code to be returned, and restore the errno codes that were returned via the the mount system call as they were before we started refactoring __ext4_fill_super(). Signed-off-by: Theodore Ts'o <[email protected]> Reviewed-by: Jason Yan <[email protected]>
2023-04-28ext4: reflect error codes from ext4_multi_mount_protect() to its callersTheodore Ts'o2-8/+17
This will allow more fine-grained errno codes to be returned by the mount system call. Cc: Andreas Dilger <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2023-04-28ext4: fix lost error code reporting in __ext4_fill_super()Theodore Ts'o1-1/+2
When code was factored out of __ext4_fill_super() into ext4_percpu_param_init() the error return was discarded. This meant that it was possible for __ext4_fill_super() to return zero, indicating success, without the struct super getting completely filled in, leading to a potential NULL pointer dereference. Reported-by: [email protected] Fixes: 1f79467c8a6b ("ext4: factor out ext4_percpu_param_init() ...") Link: https://syzkaller.appspot.com/bug?id=6dac47d5e58af770c0055f680369586ec32e144c Signed-off-by: Theodore Ts'o <[email protected]> Reviewed-by: Jason Yan <[email protected]>
2023-04-28ext4: fix unused iterator variable warningsNathan Chancellor1-4/+3
When CONFIG_QUOTA is disabled, there are warnings around unused iterator variables: fs/ext4/super.c: In function 'ext4_put_super': fs/ext4/super.c:1262:13: error: unused variable 'i' [-Werror=unused-variable] 1262 | int i, err; | ^ fs/ext4/super.c: In function '__ext4_fill_super': fs/ext4/super.c:5200:22: error: unused variable 'i' [-Werror=unused-variable] 5200 | unsigned int i; | ^ cc1: all warnings being treated as errors The kernel has updated to GNU11, allowing the variables to be declared within the for loop. Do so to clear up the warnings. Fixes: dcbf87589d90 ("ext4: factor out ext4_flex_groups_free()") Signed-off-by: Nathan Chancellor <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Jason Yan <[email protected]> Link: https://lore.kernel.org/r/20230420-ext4-unused-variables-super-c-v1-1-138b6db6c21c@kernel.org Signed-off-by: Theodore Ts'o <[email protected]>
2023-04-28ext4: fix use-after-free read in ext4_find_extent for bigalloc + inlineYe Bin1-1/+2
Syzbot found the following issue: loop0: detected capacity change from 0 to 2048 EXT4-fs (loop0): mounted filesystem 00000000-0000-0000-0000-000000000000 without journal. Quota mode: none. ================================================================== BUG: KASAN: use-after-free in ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline] BUG: KASAN: use-after-free in ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931 Read of size 4 at addr ffff888073644750 by task syz-executor420/5067 CPU: 0 PID: 5067 Comm: syz-executor420 Not tainted 6.2.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x290 lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:306 print_report+0x107/0x1f0 mm/kasan/report.c:417 kasan_report+0xcd/0x100 mm/kasan/report.c:517 ext4_ext_binsearch_idx fs/ext4/extents.c:768 [inline] ext4_find_extent+0x76e/0xd90 fs/ext4/extents.c:931 ext4_clu_mapped+0x117/0x970 fs/ext4/extents.c:5809 ext4_insert_delayed_block fs/ext4/inode.c:1696 [inline] ext4_da_map_blocks fs/ext4/inode.c:1806 [inline] ext4_da_get_block_prep+0x9e8/0x13c0 fs/ext4/inode.c:1870 ext4_block_write_begin+0x6a8/0x2290 fs/ext4/inode.c:1098 ext4_da_write_begin+0x539/0x760 fs/ext4/inode.c:3082 generic_perform_write+0x2e4/0x5e0 mm/filemap.c:3772 ext4_buffered_write_iter+0x122/0x3a0 fs/ext4/file.c:285 ext4_file_write_iter+0x1d0/0x18f0 call_write_iter include/linux/fs.h:2186 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x7dc/0xc50 fs/read_write.c:584 ksys_write+0x177/0x2a0 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f4b7a9737b9 RSP: 002b:00007ffc5cac3668 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4b7a9737b9 RDX: 00000000175d9003 RSI: 0000000020000200 RDI: 0000000000000004 RBP: 00007f4b7a933050 R08: 0000000000000000 R09: 0000000000000000 R10: 000000000000079f R11: 0000000000000246 R12: 00007f4b7a9330e0 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Above issue is happens when enable bigalloc and inline data feature. As commit 131294c35ed6 fixed delayed allocation bug in ext4_clu_mapped for bigalloc + inline. But it only resolved issue when has inline data, if inline data has been converted to extent(ext4_da_convert_inline_data_to_extent) before writepages, there is no EXT4_STATE_MAY_INLINE_DATA flag. However i_data is still store inline data in this scene. Then will trigger UAF when find extent. To resolve above issue, there is need to add judge "ext4_has_inline_data(inode)" in ext4_clu_mapped(). Fixes: 131294c35ed6 ("ext4: fix delayed allocation bug in ext4_clu_mapped for bigalloc + inline") Reported-by: [email protected] Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Ye Bin <[email protected]> Reviewed-by: Tudor Ambarus <[email protected]> Tested-by: Tudor Ambarus <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2023-04-28Merge tag 'x86_mm_for_6.4' of ↵Linus Torvalds35-149/+1701
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 LAM (Linear Address Masking) support from Dave Hansen: "Add support for the new Linear Address Masking CPU feature. This is similar to ARM's Top Byte Ignore and allows userspace to store metadata in some bits of pointers without masking it out before use" * tag 'x86_mm_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm/iommu/sva: Do not allow to set FORCE_TAGGED_SVA bit from outside x86/mm/iommu/sva: Fix error code for LAM enabling failure due to SVA selftests/x86/lam: Add test cases for LAM vs thread creation selftests/x86/lam: Add ARCH_FORCE_TAGGED_SVA test cases for linear-address masking selftests/x86/lam: Add inherit test cases for linear-address masking selftests/x86/lam: Add io_uring test cases for linear-address masking selftests/x86/lam: Add mmap and SYSCALL test cases for linear-address masking selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking x86/mm/iommu/sva: Make LAM and SVA mutually exclusive iommu/sva: Replace pasid_valid() helper with mm_valid_pasid() mm: Expose untagging mask in /proc/$PID/status x86/mm: Provide arch_prctl() interface for LAM x86/mm: Reduce untagged_addr() overhead for systems without LAM x86/uaccess: Provide untagged_addr() and remove tags before address check mm: Introduce untagged_addr_remote() x86/mm: Handle LAM on context switch x86: CPUID and CR3/CR4 flags for Linear Address Masking x86: Allow atomic MM_CONTEXT flags setting x86/mm: Rework address range check in get_user() and put_user()
2023-04-28writeback: fix call of incorrect macroMaxim Korotkov1-1/+1
the variable 'history' is of type u16, it may be an error that the hweight32 macro was used for it I guess macro hweight16 should be used Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 2a81490811d0 ("writeback: implement foreign cgroup inode detection") Signed-off-by: Maxim Korotkov <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-04-28Merge tag 'md-next-2023-04-28' of ↵Jens Axboe2-4/+47
https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-6.4/block Pull MD fixes from Song: "1. Improve raid5 sequential IO performance on spinning disks, which fixes a regression since v6.0, by Jan Kara. 2. Fix bitmap offset types, which fixes an issue introduced in this merge window, by Jonathan Derrick." * tag 'md-next-2023-04-28' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md: md: Fix bitmap offset type in sb writer md/raid5: Improve performance for sequential IO
2023-04-28Merge tag 'x86_tdx_for_6.4' of ↵Linus Torvalds4-42/+51
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tdx update from Dave Hansen: "The original tdx hypercall assembly code took two flags in %RSI to tweak its behavior at runtime. PeterZ recently axed one flag in commit e80a48bade61 ("x86/tdx: Remove TDX_HCALL_ISSUE_STI"). Kill the other flag too and tweak the 'output' mode with an assembly macro instead. This results in elimination of one push/pop pair and overall easier to read assembly. - Do conditional __tdx_hypercall() 'output' processing via an assembly macro argument rather than a runtime register" * tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tdx: Drop flags from __tdx_hypercall()
2023-04-28Merge tag 'x86_fpu_for_6.4' of ↵Linus Torvalds2-0/+103
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Dave Hansen: "There's no _actual_ kernel functionality here. This expands the documentation around AMX support including some code examples. The example code also exposed the fact that hardware architecture constants as part of the ABI, but there's no easy place that they get defined for apps. Adding them to a uabi header will eventually make life easier for consumers of the ABI. Summary: - Improve AMX documentation along with example code - Explicitly make some hardware constants part of the uabi" * tag 'x86_fpu_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/x86: Explain the state component permission for guests Documentation/x86: Add the AMX enabling example x86/arch_prctl: Add AMX feature numbers as ABI constants Documentation/x86: Explain the purpose for dynamic features
2023-04-28Merge tag 'x86_cache_for_6.4' of ↵Linus Torvalds1-24/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resctrl update from Dave Hansen: "Reduce redundant counter reads with resctrl refactoring" * tag 'x86_cache_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Avoid redundant counter read in __mon_event_count()
2023-04-28Merge tag 'x86_cleanups_for_v6.4_rc1' of ↵Linus Torvalds9-63/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - Unify duplicated __pa() and __va() definitions - Simplify sysctl tables registration - Remove unused symbols - Correct function name in comment * tag 'x86_cleanups_for_v6.4_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Centralize __pa()/__va() definitions x86: Simplify one-level sysctl registration for itmt_kern_table x86: Simplify one-level sysctl registration for abi_table2 x86/platform/intel-mid: Remove unused definitions from intel-mid.h x86/uaccess: Remove memcpy_page_flushcache() x86/entry: Change stale function name in comment to error_return()
2023-04-28md: Fix bitmap offset type in sb writerJonathan Derrick1-3/+3
Bitmap offset is allowed to be negative, indicating that bitmap precedes metadata. Change the type back from sector_t to loff_t to satisfy conditionals and calculations. Reported-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/linux-raid/CAPhsuW6HuaUJ5WcyPajVgUfkQFYp2D_cy1g6qxN4CU_gP2=z7g@mail.gmail.com/ Fixes: 10172f200b67 ("md: Fix types in sb writer") Signed-off-by: Jonathan Derrick <[email protected]> Suggested-by: Yu Kuai <[email protected]> Reviewed-by: Yu Kuai <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-04-28md/raid5: Improve performance for sequential IOJan Kara1-1/+44
Commit 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") changed the order in which requests for underlying disks are created. Since for large sequential IO adding of requests frequently races with md_raid5 thread submitting bios to underlying disks, this results in a change in IO pattern because intermediate states of new order of request creation result in more smaller discontiguous requests. For RAID5 on top of three rotational disks our performance testing revealed this results in regression in write throughput: iozone -a -s 131072000 -y 4 -q 8 -i 0 -i 1 -R before 7e55c60acfbb: KB reclen write rewrite read reread 131072000 4 493670 525964 524575 513384 131072000 8 540467 532880 512028 513703 after 7e55c60acfbb: KB reclen write rewrite read reread 131072000 4 421785 456184 531278 509248 131072000 8 459283 456354 528449 543834 To reduce the amount of discontiguous requests we can start generating requests with the stripe with the lowest chunk offset as that has the best chance of being adjacent to IO queued previously. This improves the performance to: KB reclen write rewrite read reread 131072000 4 497682 506317 518043 514559 131072000 8 514048 501886 506453 504319 restoring big part of the regression. Fixes: 7e55c60acfbb ("md/raid5: Pivot raid5_make_request()") Cc: [email protected] # v6.0+ Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]> Signed-off-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-04-28lsm: move hook comments docs to security/security.cRandy Dunlap3-5/+5
Fix one kernel-doc warning, but invesigating that led to other kernel-doc movement (lsm_hooks.h to security.c) that needs to be fixed also. include/linux/lsm_hooks.h:1: warning: no structured comments found Fixes: e261301c851a ("lsm: move the remaining LSM hook comments to security/security.c") Fixes: 1cd2aca64a5d ("lsm: move the io_uring hook comments to security/security.c") Fixes: 452b670c7222 ("lsm: move the perf hook comments to security/security.c") Fixes: 55e853201a9e ("lsm: move the bpf hook comments to security/security.c") Fixes: b14faf9c94a6 ("lsm: move the audit hook comments to security/security.c") Fixes: 1427ddbe5cc1 ("lsm: move the binder hook comments to security/security.c") Fixes: 43fad2821876 ("lsm: move the sysv hook comments to security/security.c") Fixes: ecc419a44535 ("lsm: move the key hook comments to security/security.c") Fixes: 742b99456e86 ("lsm: move the xfrm hook comments to security/security.c") Fixes: ac318aed5498 ("lsm: move the Infiniband hook comments to security/security.c") Fixes: 4a49f592e931 ("lsm: move the SCTP hook comments to security/security.c") Fixes: 6b6bbe8c02a1 ("lsm: move the socket hook comments to security/security.c") Fixes: 2c2442fd46cd ("lsm: move the AF_UNIX hook comments to security/security.c") Fixes: 2bcf51bf2f03 ("lsm: move the netlink hook comments to security/security.c") Fixes: 130c53bfee4b ("lsm: move the task hook comments to security/security.c") Fixes: a0fd6480de48 ("lsm: move the file hook comments to security/security.c") Fixes: 9348944b775d ("lsm: move the kernfs hook comments to security/security.c") Fixes: 916e32584dfa ("lsm: move the inode hook comments to security/security.c") Fixes: 08526a902cc4 ("lsm: move the filesystem hook comments to security/security.c") Fixes: 36819f185590 ("lsm: move the fs_context hook comments to security/security.c") Fixes: 1661372c912d ("lsm: move the program execution hook comments to security/security.c") Signed-off-by: Randy Dunlap <[email protected]> Cc: Paul Moore <[email protected]> Cc: James Morris <[email protected]> Cc: "Serge E. Hallyn" <[email protected]> Cc: [email protected] Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Cc: KP Singh <[email protected]> Cc: [email protected] Signed-off-by: Paul Moore <[email protected]>
2023-04-28ext4: fix i_disksize exceeding i_size problem in paritally written caseZhihao Cheng1-0/+3
It is possible for i_disksize can exceed i_size, triggering a warning. generic_perform_write copied = iov_iter_copy_from_user_atomic(len) // copied < len ext4_da_write_end | ext4_update_i_disksize | new_i_size = pos + copied; | WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize | generic_write_end | copied = block_write_end(copied, len) // copied = 0 | if (unlikely(copied < len)) | if (!PageUptodate(page)) | copied = 0; | if (pos + copied > inode->i_size) // return false if (unlikely(copied == 0)) goto again; if (unlikely(iov_iter_fault_in_readable(i, bytes))) { status = -EFAULT; break; } We get i_disksize greater than i_size here, which could trigger WARNING check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio: ext4_dio_write_iter iomap_dio_rw __iomap_dio_rw // return err, length is not aligned to 512 ext4_handle_inode_extension WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319 CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2 RIP: 0010:ext4_file_write_iter+0xbc7 Call Trace: vfs_write+0x3b1 ksys_write+0x77 do_syscall_64+0x39 Fix it by updating 'copied' value before updating i_disksize just like ext4_write_inline_data_end() does. A reproducer can be found in the buganizer link below. Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209 Fixes: 64769240bd07 ("ext4: Add delayed allocation support in data=writeback mode") Signed-off-by: Zhihao Cheng <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2023-04-28tpm: Re-enable TPM chip boostrapping non-tpm_tis TPM driversJarkko Sakkinen4-11/+28
TPM chip bootstrapping was removed from tpm_chip_register(), and it was relocated to tpm_tis_core. This breaks all drivers which are not based on tpm_tis because the chip will not get properly initialized. Take the corrective steps: 1. Rename tpm_chip_startup() as tpm_chip_bootstrap() and make it one-shot. 2. Call tpm_chip_bootstrap() in tpm_chip_register(), which reverts the things as tehy used to be. Cc: Lino Sanfilippo <[email protected]> Fixes: 548eb516ec0f ("tpm, tpm_tis: startup chip before testing for interrupts") Reported-by: Pengfei Xu <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Tested-by: Pengfei Xu <[email protected]> Signed-off-by: Jarkko Sakkinen <[email protected]>
2023-04-28crypto: engine - fix crypto_queue backlog handlingOlivier Bacon2-3/+6
CRYPTO_TFM_REQ_MAY_BACKLOG tells the crypto driver that it should internally backlog requests until the crypto hw's queue becomes full. At that point, crypto_engine backlogs the request and returns -EBUSY. Calling driver such as dm-crypt then waits until the complete() function is called with a status of -EINPROGRESS before sending a new request. The problem lies in the call to complete() with a value of -EINPROGRESS that is made when a backlog item is present on the queue. The call is done before the successful execution of the crypto request. In the case that do_one_request() returns < 0 and the retry support is available, the request is put back in the queue. This leads upper drivers to send a new request even if the queue is still full. The problem can be reproduced by doing a large dd into a crypto dm-crypt device. This is pretty easy to see when using Freescale CAAM crypto driver and SWIOTLB dma. Since the actual amount of requests that can be hold in the queue is unlimited we get IOs error and dma allocation. The fix is to call complete with a value of -EINPROGRESS only if the request is not enqueued back in crypto_queue. This is done by calling complete() later in the code. In order to delay the decision, crypto_queue is modified to correctly set the backlog pointer when a request is enqueued back. Fixes: 6a89f492f8e5 ("crypto: engine - support for parallel requests based on retry mechanism") Co-developed-by: Sylvain Ouellet <[email protected]> Signed-off-by: Sylvain Ouellet <[email protected]> Signed-off-by: Olivier Bacon <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-04-28crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs()Christophe JAILLET1-1/+1
SS_ENCRYPTION is (0 << 7 = 0), so the test can never be true. Use a direct comparison to SS_ENCRYPTION instead. The same king of test is already done the same way in sun8i_ss_run_task(). Fixes: 359e893e8af4 ("crypto: sun8i-ss - rework handling of IV") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-04-28ALSA: emu10k1: use more existing defines instead of open-coded numbersOswald Buddenhagen7-65/+69
Using the *_MASK defines for "maximal value" is debatable. I got the idea from FreeBSD, and it sorta makes sense to me. Some hunks look a bit incomplete, because code that is going to be subsequently removed is not touched here. Signed-off-by: Oswald Buddenhagen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-04-28net: dsa: mv88e6xxx: add mv88e6321 rsvd2cpuAngelo Dureghello1-0/+1
Add rsvd2cpu capability for mv88e6321 model, to allow proper bpdu processing. Signed-off-by: Angelo Dureghello <[email protected]> Fixes: 51c901a775621 ("net: dsa: mv88e6xxx: distinguish Global 2 Rsvd2CPU") Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28net: ipv6: fix skb hash for some RST packetsAntoine Tenart1-1/+1
The skb hash comes from sk->sk_txhash when using TCP, except for some IPv6 RST packets. This is because in tcp_v6_send_reset when not in TIME_WAIT the hash is taken from sk->sk_hash, while it should come from sk->sk_txhash as those two hashes are not computed the same way. Packetdrill script to test the above, 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress) +0 > (flowlabel 0x1) S 0:0(0) <...> // Wrong ack seq, trigger a rst. +0 < S. 0:0(0) ack 0 win 4000 // Check the flowlabel matches prior one from SYN. +0 > (flowlabel 0x1) R 0:0(0) <...> Fixes: 9258b8b1be2e ("ipv6: tcp: send consistent autoflowlabel in RST packets") Signed-off-by: Antoine Tenart <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28selftests: srv6: make srv6_end_dt46_l3vpn_test more robustAndrea Mayer1-5/+5
On some distributions, the rp_filter is automatically set (=1) by default on a netdev basis (also on VRFs). In an SRv6 End.DT46 behavior, decapsulated IPv4 packets are routed using the table associated with the VRF bound to that tunnel. During lookup operations, the rp_filter can lead to packet loss when activated on the VRF. Therefore, we chose to make this selftest more robust by explicitly disabling the rp_filter during tests (as it is automatically set by some Linux distributions). Fixes: 03a0b567a03d ("selftests: seg6: add selftest for SRv6 End.DT46 Behavior") Reported-by: Hangbin Liu <[email protected]> Signed-off-by: Andrea Mayer <[email protected]> Tested-by: Hangbin Liu <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28atlantic:hw_atl2:hw_atl2_utils_fw: Remove unnecessary (void*) conversionswuych1-2/+2
Pointer variables of void * type do not require type cast. Signed-off-by: wuych <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28sit: update dev->needed_headroom in ipip6_tunnel_bind_dev()Cong Wang1-3/+5
When a tunnel device is bound with the underlying device, its dev->needed_headroom needs to be updated properly. IPv4 tunnels already do the same in ip_tunnel_bind_dev(). Otherwise we may not have enough header room for skb, especially after commit b17f709a2401 ("gue: TX support for using remote checksum offload option"). Fixes: 32b8a8e59c9c ("sit: add IPv4 over IPv4 support") Reported-by: Palash Oswal <[email protected]> Link: https://lore.kernel.org/netdev/CAGyP=7fDcSPKu6nttbGwt7RXzE3uyYxLjCSE97J64pRxJP8jPA@mail.gmail.com/ Cc: Kuniyuki Iwashima <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Cong Wang <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28net/sched: cls_api: remove block_cb from driver_list before freeingVlad Buslov1-0/+1
Error handler of tcf_block_bind() frees the whole bo->cb_list on error. However, by that time the flow_block_cb instances are already in the driver list because driver ndo_setup_tc() callback is called before that up the call chain in tcf_block_offload_cmd(). This leaves dangling pointers to freed objects in the list and causes use-after-free[0]. Fix it by also removing flow_block_cb instances from driver_list before deallocating them. [0]: [ 279.868433] ================================================================== [ 279.869964] BUG: KASAN: slab-use-after-free in flow_block_cb_setup_simple+0x631/0x7c0 [ 279.871527] Read of size 8 at addr ffff888147e2bf20 by task tc/2963 [ 279.873151] CPU: 6 PID: 2963 Comm: tc Not tainted 6.3.0-rc6+ #4 [ 279.874273] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 279.876295] Call Trace: [ 279.876882] <TASK> [ 279.877413] dump_stack_lvl+0x33/0x50 [ 279.878198] print_report+0xc2/0x610 [ 279.878987] ? flow_block_cb_setup_simple+0x631/0x7c0 [ 279.879994] kasan_report+0xae/0xe0 [ 279.880750] ? flow_block_cb_setup_simple+0x631/0x7c0 [ 279.881744] ? mlx5e_tc_reoffload_flows_work+0x240/0x240 [mlx5_core] [ 279.883047] flow_block_cb_setup_simple+0x631/0x7c0 [ 279.884027] tcf_block_offload_cmd.isra.0+0x189/0x2d0 [ 279.885037] ? tcf_block_setup+0x6b0/0x6b0 [ 279.885901] ? mutex_lock+0x7d/0xd0 [ 279.886669] ? __mutex_unlock_slowpath.constprop.0+0x2d0/0x2d0 [ 279.887844] ? ingress_init+0x1c0/0x1c0 [sch_ingress] [ 279.888846] tcf_block_get_ext+0x61c/0x1200 [ 279.889711] ingress_init+0x112/0x1c0 [sch_ingress] [ 279.890682] ? clsact_init+0x2b0/0x2b0 [sch_ingress] [ 279.891701] qdisc_create+0x401/0xea0 [ 279.892485] ? qdisc_tree_reduce_backlog+0x470/0x470 [ 279.893473] tc_modify_qdisc+0x6f7/0x16d0 [ 279.894344] ? tc_get_qdisc+0xac0/0xac0 [ 279.895213] ? mutex_lock+0x7d/0xd0 [ 279.896005] ? __mutex_lock_slowpath+0x10/0x10 [ 279.896910] rtnetlink_rcv_msg+0x5fe/0x9d0 [ 279.897770] ? rtnl_calcit.isra.0+0x2b0/0x2b0 [ 279.898672] ? __sys_sendmsg+0xb5/0x140 [ 279.899494] ? do_syscall_64+0x3d/0x90 [ 279.900302] ? entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.901337] ? kasan_save_stack+0x2e/0x40 [ 279.902177] ? kasan_save_stack+0x1e/0x40 [ 279.903058] ? kasan_set_track+0x21/0x30 [ 279.903913] ? kasan_save_free_info+0x2a/0x40 [ 279.904836] ? ____kasan_slab_free+0x11a/0x1b0 [ 279.905741] ? kmem_cache_free+0x179/0x400 [ 279.906599] netlink_rcv_skb+0x12c/0x360 [ 279.907450] ? rtnl_calcit.isra.0+0x2b0/0x2b0 [ 279.908360] ? netlink_ack+0x1550/0x1550 [ 279.909192] ? rhashtable_walk_peek+0x170/0x170 [ 279.910135] ? kmem_cache_alloc_node+0x1af/0x390 [ 279.911086] ? _copy_from_iter+0x3d6/0xc70 [ 279.912031] netlink_unicast+0x553/0x790 [ 279.912864] ? netlink_attachskb+0x6a0/0x6a0 [ 279.913763] ? netlink_recvmsg+0x416/0xb50 [ 279.914627] netlink_sendmsg+0x7a1/0xcb0 [ 279.915473] ? netlink_unicast+0x790/0x790 [ 279.916334] ? iovec_from_user.part.0+0x4d/0x220 [ 279.917293] ? netlink_unicast+0x790/0x790 [ 279.918159] sock_sendmsg+0xc5/0x190 [ 279.918938] ____sys_sendmsg+0x535/0x6b0 [ 279.919813] ? import_iovec+0x7/0x10 [ 279.920601] ? kernel_sendmsg+0x30/0x30 [ 279.921423] ? __copy_msghdr+0x3c0/0x3c0 [ 279.922254] ? import_iovec+0x7/0x10 [ 279.923041] ___sys_sendmsg+0xeb/0x170 [ 279.923854] ? copy_msghdr_from_user+0x110/0x110 [ 279.924797] ? ___sys_recvmsg+0xd9/0x130 [ 279.925630] ? __perf_event_task_sched_in+0x183/0x470 [ 279.926656] ? ___sys_sendmsg+0x170/0x170 [ 279.927529] ? ctx_sched_in+0x530/0x530 [ 279.928369] ? update_curr+0x283/0x4f0 [ 279.929185] ? perf_event_update_userpage+0x570/0x570 [ 279.930201] ? __fget_light+0x57/0x520 [ 279.931023] ? __switch_to+0x53d/0xe70 [ 279.931846] ? sockfd_lookup_light+0x1a/0x140 [ 279.932761] __sys_sendmsg+0xb5/0x140 [ 279.933560] ? __sys_sendmsg_sock+0x20/0x20 [ 279.934436] ? fpregs_assert_state_consistent+0x1d/0xa0 [ 279.935490] do_syscall_64+0x3d/0x90 [ 279.936300] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.937311] RIP: 0033:0x7f21c814f887 [ 279.938085] Code: 0a 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10 [ 279.941448] RSP: 002b:00007fff11efd478 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 279.942964] RAX: ffffffffffffffda RBX: 0000000064401979 RCX: 00007f21c814f887 [ 279.944337] RDX: 0000000000000000 RSI: 00007fff11efd4e0 RDI: 0000000000000003 [ 279.945660] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 279.947003] R10: 00007f21c8008708 R11: 0000000000000246 R12: 0000000000000001 [ 279.948345] R13: 0000000000409980 R14: 000000000047e538 R15: 0000000000485400 [ 279.949690] </TASK> [ 279.950706] Allocated by task 2960: [ 279.951471] kasan_save_stack+0x1e/0x40 [ 279.952338] kasan_set_track+0x21/0x30 [ 279.953165] __kasan_kmalloc+0x77/0x90 [ 279.954006] flow_block_cb_setup_simple+0x3dd/0x7c0 [ 279.955001] tcf_block_offload_cmd.isra.0+0x189/0x2d0 [ 279.956020] tcf_block_get_ext+0x61c/0x1200 [ 279.956881] ingress_init+0x112/0x1c0 [sch_ingress] [ 279.957873] qdisc_create+0x401/0xea0 [ 279.958656] tc_modify_qdisc+0x6f7/0x16d0 [ 279.959506] rtnetlink_rcv_msg+0x5fe/0x9d0 [ 279.960392] netlink_rcv_skb+0x12c/0x360 [ 279.961216] netlink_unicast+0x553/0x790 [ 279.962044] netlink_sendmsg+0x7a1/0xcb0 [ 279.962906] sock_sendmsg+0xc5/0x190 [ 279.963702] ____sys_sendmsg+0x535/0x6b0 [ 279.964534] ___sys_sendmsg+0xeb/0x170 [ 279.965343] __sys_sendmsg+0xb5/0x140 [ 279.966132] do_syscall_64+0x3d/0x90 [ 279.966908] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.968407] Freed by task 2960: [ 279.969114] kasan_save_stack+0x1e/0x40 [ 279.969929] kasan_set_track+0x21/0x30 [ 279.970729] kasan_save_free_info+0x2a/0x40 [ 279.971603] ____kasan_slab_free+0x11a/0x1b0 [ 279.972483] __kmem_cache_free+0x14d/0x280 [ 279.973337] tcf_block_setup+0x29d/0x6b0 [ 279.974173] tcf_block_offload_cmd.isra.0+0x226/0x2d0 [ 279.975186] tcf_block_get_ext+0x61c/0x1200 [ 279.976080] ingress_init+0x112/0x1c0 [sch_ingress] [ 279.977065] qdisc_create+0x401/0xea0 [ 279.977857] tc_modify_qdisc+0x6f7/0x16d0 [ 279.978695] rtnetlink_rcv_msg+0x5fe/0x9d0 [ 279.979562] netlink_rcv_skb+0x12c/0x360 [ 279.980388] netlink_unicast+0x553/0x790 [ 279.981214] netlink_sendmsg+0x7a1/0xcb0 [ 279.982043] sock_sendmsg+0xc5/0x190 [ 279.982827] ____sys_sendmsg+0x535/0x6b0 [ 279.983703] ___sys_sendmsg+0xeb/0x170 [ 279.984510] __sys_sendmsg+0xb5/0x140 [ 279.985298] do_syscall_64+0x3d/0x90 [ 279.986076] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 279.987532] The buggy address belongs to the object at ffff888147e2bf00 which belongs to the cache kmalloc-192 of size 192 [ 279.989747] The buggy address is located 32 bytes inside of freed 192-byte region [ffff888147e2bf00, ffff888147e2bfc0) [ 279.992367] The buggy address belongs to the physical page: [ 279.993430] page:00000000550f405c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x147e2a [ 279.995182] head:00000000550f405c order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 279.996713] anon flags: 0x200000000010200(slab|head|node=0|zone=2) [ 279.997878] raw: 0200000000010200 ffff888100042a00 0000000000000000 dead000000000001 [ 279.999384] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 [ 280.000894] page dumped because: kasan: bad access detected [ 280.002386] Memory state around the buggy address: [ 280.003338] ffff888147e2be00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 280.004781] ffff888147e2be80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 280.006224] >ffff888147e2bf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 280.007700] ^ [ 280.008592] ffff888147e2bf80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 280.010035] ffff888147e2c000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 280.011564] ================================================================== Fixes: 59094b1e5094 ("net: sched: use flow block API") Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28mISDN: Use list_count_nodes()Christophe JAILLET1-13/+2
count_list_member() really looks the same as list_count_nodes(), so use the latter instead of hand writing it. The first one return an int and the other a size_t, but that should be fine. It is really unlikely that we get so many parties in a conference. Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28tcp: fix skb_copy_ubufs() vs BIG TCPEric Dumazet1-6/+14
David Ahern reported crashes in skb_copy_ubufs() caused by TCP tx zerocopy using hugepages, and skb length bigger than ~68 KB. skb_copy_ubufs() assumed it could copy all payload using up to MAX_SKB_FRAGS order-0 pages. This assumption broke when BIG TCP was able to put up to 512 KB per skb. We did not hit this bug at Google because we use CONFIG_MAX_SKB_FRAGS=45 and limit gso_max_size to 180000. A solution is to use higher order pages if needed. v2: add missing __GFP_COMP, or we leak memory. Fixes: 7c4e983c4f3c ("net: allow gso_max_size to exceed 65536") Reported-by: David Ahern <[email protected]> Link: https://lore.kernel.org/netdev/[email protected]/T/ Signed-off-by: Eric Dumazet <[email protected]> Cc: Xin Long <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Coco Li <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28net/ncsi: clear Tx enable mode when handling a Config required AENCosmo Chou1-0/+1
ncsi_channel_is_tx() determines whether a given channel should be used for Tx or not. However, when reconfiguring the channel by handling a Configuration Required AEN, there is a misjudgment that the channel Tx has already been enabled, which results in the Enable Channel Network Tx command not being sent. Clear the channel Tx enable flag before reconfiguring the channel to avoid the misjudgment. Fixes: 8d951a75d022 ("net/ncsi: Configure multi-package, multi-channel modes with failover") Signed-off-by: Cosmo Chou <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-28i3c: ast2600: fix register setting for 545 ohm pullupsJeremy Kerr1-2/+2
The 2k register setting is zero, OR-ing it in doesn't parallel the 2k and 750 ohm pullups. We need a separate value for the 545 ohm setting. Reported-by: Lukwinski Zbigniew <[email protected]> Signed-off-by: Jeremy Kerr <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: ast2600: enable IBI supportJeremy Kerr1-0/+21
The ast2600 i3c hardware is capable of IBIs, but we need a workaround for a hardware issue with the I3C state machine handling IBI payloads of specific lengths when PEC is not enabled. To avoid this, we need to unconditionally enable PECs, at the consquence of losing a byte of data when the device does not send a PEC. Enable IBIs on the ast2600 platform, including an implementation of the PEC workaround, which prints a warning when triggered. Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/ba923b96d6d129024c975e8a0472c5b2fcb3af32.1680161823.git.jk@codeconstruct.com.au Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: dw: Add a platform facility for IBI PEC workaroundsJeremy Kerr2-0/+29
On the AST2600 i3c controller, we'll need to apply a workaround for a hardware issue with IBI payloads. Introduce a platform hook to allow dw i3c platform implementations to modify the DAT entry in IBI enable/disable to allow this workaround in a future change. Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/d5d76a8d2336d2a71886537f42e71d51db184df6.1680161823.git.jk@codeconstruct.com.au Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: dw: Add support for in-band interruptsJeremy Kerr2-3/+289
This change adds support for receiving and dequeueing i3c IBIs. By setting struct dw_i3c_master->ibi_capable before probe, a platform implementation can select the IBI-enabled version of the i3c_master_ops, enabling the global IBI infrastrcture for that controller. Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/79daeefd7ccb7c935d0c159149df21a6c9a73ffa.1680161823.git.jk@codeconstruct.com.au Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: dw: Turn DAT array entry into a structJeremy Kerr2-12/+21
In an upcoming change, we will want to store additional data about the devices we have in the data address table. Change the type of the DAT entries into a struct, which currently just has the address data. Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/9dc0d9e2857e851a0cf04819df48e5d31921f83e.1680161823.git.jk@codeconstruct.com.au Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: dw: Create a generic fifo read functionJeremy Kerr1-4/+10
In a future change we'll want to read from the IBI FIFO too, so turn dw_i3c_read_rx_fifo() into a generic read with the FIFO register as a parameter. Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/827204789583dd86addffb47ecaeab9d67cf95d5.1680161823.git.jk@codeconstruct.com.au Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: Allow OF-alias-based persistent bus numberingJeremy Kerr1-5/+25
Parse the /aliases node to assign any fixed bus numbers, as is done with the i2c subsystem. Numbering for non-aliased busses will start after the highest fixed bus number. This allows an alias node such as: aliases { i3c0 = &bus_a, i3c4 = &bus_b, }; to set the numbering for a set of i3c controllers: /* fixed-numbered bus, assigned "i3c-0" */ bus_a: i3c-master { }; /* another fixed-numbered bus, assigned "i3c-4" */ bus_b: i3c-master { }; /* dynamic-numbered bus, likely assigned "i3c-5" */ bus_c: i3c-master { }; If no i3c device aliases are present, the numbering will stay as-is, starting from 0. Signed-off-by: Jeremy Kerr <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: ast2600: Add AST2600 platform-specific driverJeremy Kerr4-0/+189
Now that we have platform-specific infrastructure for the dw i3c driver, add platform support for the ASPEED AST2600 SoC. The AST2600 has a small set of "i3c global" registers, providing platform-level i3c configuration outside of the i3c core. For the ast2600, we need a couple of extra setup operations: - on probe: find the i3c global register set and parse the SDA pullup resistor values - on init: set the pullups accordingly, and set the i3c instance IDs Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28dt-bindings: i3c: Add AST2600 i3c controllerJeremy Kerr1-0/+72
Add a devicetree binding for the ast2600 i3c controller hardware. This is heavily based on the designware i3c core, plus a reset facility and two platform-specific properties: - sda-pullup-ohms: to specify the value of the configurable pullup resistors on the SDA line - aspeed,global-regs: to reference the (ast2600-specific) i3c global register block, and the device index to use within it. Reviewed-by: Krzysztof Kozlowski <[email protected]> (on v1) Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28i3c: dw: Add infrastructure for platform-specific implementationsJeremy Kerr2-34/+97
The dw i3c core can be integrated into various SoC devices. Platforms that use this core may need a little configuration that is specific to that platform. Add some infrastructure to allow platform-specific behaviour: common probe/remove functions, a set of platform hook operations, and a pointer for platform-specific data in struct dw_i3c_master. Move the common api into a new (i3c local) header file. Platforms will provide their own struct platform_driver, which allocates struct dw_i3c_master, does any platform-specific probe behaviour, and calls into the common probe. A future change will add new platform support that uses this infrastructure. Signed-off-by: Jeremy Kerr <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28rtc: armada38x: use devm_platform_ioremap_resource_byname()Ye Xingchen1-5/+2
Convert platform_get_resource_byname(),devm_ioremap_resource() to a single call to devm_platform_ioremap_resource_byname(), as this is exactly what this function does. Signed-off-by: Ye Xingchen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28rtc: sunplus: use devm_platform_ioremap_resource_byname()Ye Xingchen1-2/+1
Convert platform_get_resource_byname(),devm_ioremap_resource() to a single call to devm_platform_ioremap_resource_byname(), as this is exactly what this function does. Signed-off-by: Ye Xingchen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-28rtc: jz4740: Make sure clock provider gets removedLars-Peter Clausen1-1/+2
The jz4740 RTC driver registers a clock provider, but never removes it. This leaves a stale clock provider behind that references freed clocks when the device is unbound. Use the managed `devm_of_clk_add_hw_provider()` instead of `of_clk_add_hw_provider()` to make sure the provider gets automatically removed on unbind. Fixes: 5ddfa148de8c ("rtc: jz4740: Register clock provider for the CLK32K pin") Signed-off-by: Lars-Peter Clausen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2023-04-27Merge tag 'mm-nonmm-stable-2023-04-27-16-01' of ↵Linus Torvalds68-377/+1032
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Mainly singleton patches all over the place. Series of note are: - updates to scripts/gdb from Glenn Washburn - kexec cleanups from Bjorn Helgaas" * tag 'mm-nonmm-stable-2023-04-27-16-01' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (50 commits) mailmap: add entries for Paul Mackerras libgcc: add forward declarations for generic library routines mailmap: add entry for Oleksandr ocfs2: reduce ioctl stack usage fs/proc: add Kthread flag to /proc/$pid/status ia64: fix an addr to taddr in huge_pte_offset() checkpatch: introduce proper bindings license check epoll: rename global epmutex scripts/gdb: add GDB convenience functions $lx_dentry_name() and $lx_i_dentry() scripts/gdb: create linux/vfs.py for VFS related GDB helpers uapi/linux/const.h: prefer ISO-friendly __typeof__ delayacct: track delays from IRQ/SOFTIRQ scripts/gdb: timerlist: convert int chunks to str scripts/gdb: print interrupts scripts/gdb: raise error with reduced debugging information scripts/gdb: add a Radix Tree Parser lib/rbtree: use '+' instead of '|' for setting color. proc/stat: remove arch_idle_time() checkpatch: check for misuse of the link tags checkpatch: allow Closes tags with links ...
2023-04-27Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds306-7984/+11566
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
2023-04-27docs nbd: userspace NBD now favors github over sourceforgeEric Blake1-1/+1
While the sourceforge site for userspace NBD still exists, the code repository moved to github several years ago. Then with a recent patch[1], the github landing page contains just as much information as the sourceforge page, so we might as well point to a single location that also provides the code. [1] https://lists.debian.org/nbd/2023/03/msg00051.html Signed-off-by: Eric Blake <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-04-27block nbd: use req.cookie instead of req.handleEric Blake1-3/+3
The NBD spec was recently changed [1] to refer to the opaque client identifier as a 'cookie' rather than a 'handle', but has for a much longer time listed it as a 64-bit value, and declares that all values in the NBD protocol are sent in network byte order (big-endian). Because the value is opaque to the server, it doesn't usually matter what endianness we send as the client - as long as we are consistent that either we byte-swap on both write and read, or on neither, then we can match server replies back to our requests. That said, our internal use of the cookie is as a 64-bit number (well, as two 32-bit numbers concatenated together), rather than as 8 individual bytes; so prior to this commit, we ARE leaking the native endianness of our internals as a client out to the server. We don't know of any server that will actually inspect the opaque value and behave differently depending on whether a little-endian or big-endian client is sending requests, but since we DO log the cookie value, a wireshark capture of the network traffic is easier to correlate back to the kernel traffic of a big-endian host (where the u64 and char[8] representations are the same) than of a little-endian host (where if wireshark honors the NBD spec and displays a u64 in network byte order, it is byte-swapped from what the kernel logged). The fix in this patch is thus two-part: it now consistently uses network byte order for the opaque value (no difference to a big-endian machine, but an extra byteswap on a little-endian machine; probably in the noise compared to the overhead of network traffic in general), and now uses a 64-bit integer instead of char[8] as its preferred access to the opaque value (direct assignment instead of memcpy()). Signed-off-by: Eric Blake <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-04-27uapi nbd: add cookie alias to handleEric Blake1-3/+9
The uapi <linux/nbd.h> header declares a 'char handle[8]' per request; which is overloaded in English (are you referring to "handle" the verb, such as handling a signal or writing a callback handler, or "handle" the noun, the value used in a lookup table to correlate a response back to the request). Many user-space NBD implementations (both servers and clients) have instead used 'uint64_t cookie' or similar, as it is easier to directly assign an integer than to futz around with memcpy. In fact, upstream documentation is now encouraging this shift in terminology: https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b Accomplish this by use of an anonymous union to provide the alias for anyone getting the definition from the uapi; this does not break existing clients, while exposing the nicer name for those who prefer it. Note that block/nbd.c still uses the term handle (in fact, it actually combines a 32-bit cookie and a 32-bit tag into the 64-bit handle), but that internal usage is not changed by the public uapi, since no compliant NBD server has any reason to inspect or alter the 64 bits sent over the socket. Signed-off-by: Eric Blake <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-04-27uapi nbd: improve doc links to userspace specEric Blake1-3/+12
The uapi <linux/nbd.h> header intentionally documents only the NBD server features that the kernel module will utilize as a client. But while it already had one mention of skipped bits due to userspace extensions, it did not actually direct the reader to the canonical source to learn about those extensions. While touching comments, fix an outdated reference that listed only READ and WRITE as commands. Signed-off-by: Eric Blake <[email protected]> Reviewed-by: Ming Lei <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2023-04-27Merge tag 'mips_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linuxLinus Torvalds77-1356/+657
Pull MIPS updates from Thomas Bogendoerfer: - added support for Huawei B593u-12 - added support for virt board aligned to QEMU MIPS virt board - added support for doing DMA coherence on a per device base - reworked handling of RALINK SoCs - cleanup for Loongon64 barriers - removed deprecated support for MIPS_CMP SMP handling method - removed support Sibyte CARMEL and CHRINE boards - cleanups and fixes * tag 'mips_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (59 commits) MIPS: uprobes: Restore thread.trap_nr MIPS: Don't clear _PAGE_SPECIAL in _PAGE_CHG_MASK MIPS: Sink body of check_bugs_early() into its only call site MIPS: Mark check_bugs() as __init Revert "MIPS: generic: Enable all CPUs supported by virt board in Kconfig" MIPS: octeon_switch: Remove duplicated labels MIPS: loongson2ef: Add missing break in cs5536_isa MIPS: Remove set_swbp() in uprobes.c MIPS: Use def_bool y for ARCH_SUPPORTS_UPROBES MIPS: fw: Allow firmware to pass a empty env MIPS: Remove deprecated CONFIG_MIPS_CMP MIPS: lantiq: remove unused function declaration MIPS: Drop unused positional parameter in local_irq_{dis,en}able MIPS: mm: Remove local_cache_flush_page MIPS: Remove no longer used ide.h MIPS: mm: Remove unused *cache_page_indexed flush functions MIPS: generic: Enable all CPUs supported by virt board in Kconfig MIPS: Add board config for virt board MIPS: Octeon: Disable CVMSEG by default on other platforms MIPS: Loongson: Don't select platform features with CPU ...