aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-05-25ocfs2: dlmfs: don't clear USER_LOCK_ATTACHED when destroying lockJunxiao Bi1-1/+0
The following function is the only place that checks USER_LOCK_ATTACHED. This flag is set when lock request is granted through user_ast() and only the following function will clear it. Checking of this flag here is to make sure ocfs2_dlm_unlock is not issued if this lock is never granted. For example, lock file is created and then get removed, open file never happens. Clearing the flag here is not necessary because this is the only function that checks it, if another flow is executing user_dlm_destroy_lock(), it will bail out at the beginning because of USER_LOCK_IN_TEARDOWN and never check USER_LOCK_ATTACHED. Drop the clear, so we don't need take care of it for the following error handling patch. int user_dlm_destroy_lock(struct user_lock_res *lockres) { ... status = 0; if (!(lockres->l_flags & USER_LOCK_ATTACHED)) { spin_unlock(&lockres->l_lock); goto bail; } lockres->l_flags &= ~USER_LOCK_ATTACHED; lockres->l_flags |= USER_LOCK_BUSY; spin_unlock(&lockres->l_lock); status = ocfs2_dlm_unlock(conn, &lockres->l_lksb, DLM_LKF_VALBLK); if (status) { user_log_dlm_error("ocfs2_dlm_unlock", status, lockres); goto bail; } ... } V1 discussion with Joseph: https://lore.kernel.org/all/[email protected]/T/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Junxiao Bi <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19fs/ntfs: remove redundant variable idxColin Ian King1-2/+2
The variable idx is assigned a value and is never read. The variable is not used and is redundant, remove it. Cleans up clang scan build warning: warning: Although the value stored to 'idx' is used in the enclosing expression, the value is never actually read from 'idx' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Anton Altaparmakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19fat: remove time truncations in vfat_create/vfat_mkdirChung-Chiang Cheng3-27/+0
All the timestamps in vfat_create() and vfat_mkdir() come from fat_time_fat2unix() which ensures time granularity. We don't need to truncate them to fit FAT's format. Moreover, fat_truncate_crtime() and fat_timespec64_trunc_10ms() are also removed because there is no caller anymore. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chung-Chiang Cheng <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19fat: report creation time in statxChung-Chiang Cheng3-5/+20
creation time is no longer mixed with change time. Add an in-memory field for it, and report it in statx if supported. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chung-Chiang Cheng <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19fat: ignore ctime updates, and keep ctime identical to mtime in memoryChung-Chiang Cheng2-10/+10
FAT supports creation time but not change time, and there was no corresponding timestamp for creation time in previous VFS. The original implementation took the compromise of saving the in-memory change time into the on-disk creation time field, but this would lead to compatibility issues with non-linux systems. To address this issue, this patch changes the behavior of ctime. It will no longer be loaded and stored from the creation time on disk. Instead of that, it'll be consistent with the in-memory mtime and share the same on-disk field. All updates to mtime will also be applied to ctime in memory, while all updates to ctime will be ignored. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chung-Chiang Cheng <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19fat: split fat_truncate_time() into separate functionsChung-Chiang Cheng2-27/+53
Separate fat_truncate_time() to each timestamps for later creation time work. This patch does not introduce any functional changes, it's merely refactoring change. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chung-Chiang Cheng <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-19MAINTAINERS: add Muchun as a memcg reviewerMuchun Song1-0/+1
I have been focusing on mm for the past two years. e.g. developing, fixing bugs, reviewing. I have fixed lots of races (including memcg). I would like to help people working on memcg or related by reviewing their work. Let me be Cc'd on patches related to memcg. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Acked-by: Shakeel Butt <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Roman Gushchin <[email protected]> Acked-by: FanJun Kong <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13proc/sysctl: make protected_* world readableJulius Hemanth Pitti1-4/+4
protected_* files have 600 permissions which prevents non-superuser from reading them. Container like "AWS greengrass" refuse to launch unless protected_hardlinks and protected_symlinks are set. When containers like these run with "userns-remap" or "--user" mapping container's root to non-superuser on host, they fail to run due to denied read access to these files. As these protections are hardly a secret, and do not possess any security risk, making them world readable. Though above greengrass usecase needs read access to only protected_hardlinks and protected_symlinks files, setting all other protected_* files to 644 to keep consistency. Link: http://lkml.kernel.org/r/[email protected] Fixes: 800179c9b8a1 ("fs: add link restrictions") Signed-off-by: Julius Hemanth Pitti <[email protected]> Acked-by: Kees Cook <[email protected]> Acked-by: Luis Chamberlain <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12ia64: mca: drop redundant spinlock initializationHaowen Bai1-1/+0
mlogbuf_rlock has declared and initialized by DEFINE_SPINLOCK, so we don't need to spin_lock_init again, drop it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Haowen Bai <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12tty: fix deadlock caused by calling printk() under tty_port->lockQi Zheng1-1/+2
pty_write() invokes kmalloc() which may invoke a normal printk() to print failure message. This can cause a deadlock in the scenario reported by syz-bot below: CPU0 CPU1 CPU2 ---- ---- ---- lock(console_owner); lock(&port_lock_key); lock(&port->lock); lock(&port_lock_key); lock(&port->lock); lock(console_owner); As commit dbdda842fe96 ("printk: Add console owner and waiter logic to load balance console writes") said, such deadlock can be prevented by using printk_deferred() in kmalloc() (which is invoked in the section guarded by the port->lock). But there are too many printk() on the kmalloc() path, and kmalloc() can be called from anywhere, so changing printk() to printk_deferred() is too complicated and inelegant. Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so that printk() will not be called, and this deadlock problem can be avoided. Syzbot reported the following lockdep error: ====================================================== WARNING: possible circular locking dependency detected 5.4.143-00237-g08ccc19a-dirty #10 Not tainted ------------------------------------------------------ syz-executor.4/29420 is trying to acquire lock: ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline] ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023 but task is already holding lock: ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&port->lock){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 tty_port_tty_get drivers/tty/tty_port.c:288 [inline] <-- lock(&port->lock); tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47 serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767 serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] <-- lock(&port_lock_key); serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870 serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156 [...] -> #1 (&port_lock_key){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198 <-- lock(&port_lock_key); call_console_drivers kernel/printk/printk.c:1819 [inline] console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504 vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024 <-- lock(console_owner); vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 register_console+0x8b3/0xc10 kernel/printk/printk.c:2829 univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681 console_init+0x49d/0x6d3 kernel/printk/printk.c:2915 start_kernel+0x5e9/0x879 init/main.c:713 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 -> #0 (console_owner){....}-{0:0}: [...] lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734 console_trylock_spinning kernel/printk/printk.c:1773 [inline] <-- lock(console_owner); vprintk_emit+0x307/0x470 kernel/printk/printk.c:2023 vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 fail_dump lib/fault-inject.c:45 [inline] should_fail+0x67b/0x7c0 lib/fault-inject.c:144 __should_failslab+0x152/0x1c0 mm/failslab.c:33 should_failslab+0x5/0x10 mm/slab_common.c:1224 slab_pre_alloc_hook mm/slab.h:468 [inline] slab_alloc_node mm/slub.c:2723 [inline] slab_alloc mm/slub.c:2807 [inline] __kmalloc+0x72/0x300 mm/slub.c:3871 kmalloc include/linux/slab.h:582 [inline] tty_buffer_alloc+0x23f/0x2a0 drivers/tty/tty_buffer.c:175 __tty_buffer_request_room+0x156/0x2a0 drivers/tty/tty_buffer.c:273 tty_insert_flip_string_fixed_flag+0x93/0x250 drivers/tty/tty_buffer.c:318 tty_insert_flip_string include/linux/tty_flip.h:37 [inline] pty_write+0x126/0x1f0 drivers/tty/pty.c:122 <-- lock(&port->lock); n_tty_write+0xa7a/0xfc0 drivers/tty/n_tty.c:2356 do_tty_write drivers/tty/tty_io.c:961 [inline] tty_write+0x512/0x930 drivers/tty/tty_io.c:1045 __vfs_write+0x76/0x100 fs/read_write.c:494 [...] other info that might help us debug this: Chain exists of: console_owner --> &port_lock_key --> &port->lock Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: b6da31b2c07c ("tty: Fix data race in tty_insert_flip_string_fixed_flag") Signed-off-by: Qi Zheng <[email protected]> Acked-by: Jiri Slaby <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Cc: Akinobu Mita <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12relay: remove redundant assignment to pointer bufColin Ian King1-1/+1
Pointer buf is being assigned a value that is not being read, buf is being re-assigned in the next starement. The assignment is redundant and can be removed. Cleans up clang scan build warning: kernel/relay.c:443:8: warning: Although the value stored to 'buf' is used in the enclosing expression, the value is never actually read from 'buf' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Kalle Valo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12fs/ntfs3: validate BOOT sectors_per_clustersRandy Dunlap1-3/+7
When the NTFS BOOT sectors_per_clusters field is > 0x80, it represents a shift value. Make sure that the shift value is not too large before using it (NTFS max cluster size is 2MB). Return -EVINVAL if it too large. This prevents negative shift values and shift values that are larger than the field size. Prevents this UBSAN error: UBSAN: shift-out-of-bounds in ../fs/ntfs3/super.c:673:16 shift exponent -192 is negative Link: https://lkml.kernel.org/r/[email protected] Fixes: 82cae269cfa9 ("fs/ntfs3: Add initialization of super block") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: [email protected] Reviewed-by: Namjae Jeon <[email protected]> Cc: Konstantin Komarov <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Kari Argillander <[email protected]> Cc: Namjae Jeon <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12lib/string_helpers: fix not adding strarray to device's resource listPuyou Lu1-0/+3
Add allocated strarray to device's resource list. This is a must to automatically release strarray when the device disappears. Without this fix we have a memory leak in the few drivers which use devm_kasprintf_strarray(). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: acdb89b6c87a ("lib/string_helpers: Introduce managed variant of kasprintf_strarray()") Signed-off-by: Puyou Lu <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12kernel/crash_core.c: remove redundant check of ck_cmdlinelizhe1-3/+0
At the end of get_last_crashkernel(), the judgement of ck_cmdline is obviously unnecessary and causes redundance, let's clean it up. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: lizhe <[email protected]> Acked-by: Baoquan He <[email protected]> Acked-by: Philipp Rudo <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Dave Young <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-12ELF, uapi: fixup ELF_ST_TYPE definitionAlexey Dobriyan1-1/+1
This is very theoretical compile failure: ELF_ST_TYPE(st_info = A) Cast will bind first and st_info will stop being lvalue: error: lvalue required as left operand of assignment Given that the only use of this macro is ELF_ST_TYPE(sym->st_info) where st_info is "unsigned char" I've decided to remove cast especially given that companion macro ELF_ST_BIND doesn't use cast. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alexey Dobriyan <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09ipc/mqueue: use get_tree_nodev() in mqueue_get_tree()Waiman Long1-0/+14
When running the stress-ng clone benchmark with multiple testing threads, it was found that there were significant spinlock contention in sget_fc(). The contended spinlock was the sb_lock. It is under heavy contention because the following code in the critcal section of sget_fc(): hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) { if (test(old, fc)) goto share_extant_sb; } After testing with added instrumentation code, it was found that the benchmark could generate thousands of ipc namespaces with the corresponding number of entries in the mqueue's fs_supers list where the namespaces are the key for the search. This leads to excessive time in scanning the list for a match. Looking back at the mqueue calling sequence leading to sget_fc(): mq_init_ns() => mq_create_mount() => fc_mount() => vfs_get_tree() => mqueue_get_tree() => get_tree_keyed() => vfs_get_super() => sget_fc() Currently, mq_init_ns() is the only mqueue function that will indirectly call mqueue_get_tree() with a newly allocated ipc namespace as the key for searching. As a result, there will never be a match with the exising ipc namespaces stored in the mqueue's fs_supers list. So using get_tree_keyed() to do an existing ipc namespace search is just a waste of time. Instead, we could use get_tree_nodev() to eliminate the useless search. By doing so, we can greatly reduce the sb_lock hold time and avoid the spinlock contention problem in case a large number of ipc namespaces are present. Of course, if the code is modified in the future to allow mqueue_get_tree() to be called with an existing ipc namespace instead of a new one, we will have to use get_tree_keyed() in this case. The following stress-ng clone benchmark command was run on a 2-socket 48-core Intel system: ./stress-ng --clone 32 --verbose --oomable --metrics-brief -t 20 The "bogo ops/s" increased from 5948.45 before patch to 9137.06 after patch. This is an increase of 54% in performance. Link: https://lkml.kernel.org/r/[email protected] Fixes: 935c6912b198 ("ipc: Convert mqueue fs to fs_context") Signed-off-by: Waiman Long <[email protected]> Cc: Al Viro <[email protected]> Cc: David Howells <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Davidlohr Bueso <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09ipc: update semtimedop() to use hrtimerPrakash Sangappa1-12/+11
semtimedop() should be converted to use hrtimer like it has been done for most of the system calls with timeouts. This system call already takes a struct timespec as an argument and can therefore provide finer granularity timed wait. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Prakash Sangappa <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: Manfred Spraul <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09ipc/sem: remove redundant assignmentsMichal Orzel1-2/+0
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Orzel <[email protected]> Reviewed-by: Tom Rix <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09initramfs: support cpio extraction with file checksumsDavid Disseldorp1-5/+24
Add support for extraction of checksum-enabled "070702" cpio archives, specified in Documentation/driver-api/early-userspace/buffer-format.rst. Fail extraction if the calculated file data checksum doesn't match the value carried in the header. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Disseldorp <[email protected]> Suggested-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Al Viro <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Martin Wilck <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09gen_init_cpio: support file checksum archivingDavid Disseldorp1-9/+45
Documentation/driver-api/early-userspace/buffer-format.rst includes the specification for checksum-enabled cpio archives. Implement support for this format in gen_init_cpio via a new '-c' parameter. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Disseldorp <[email protected]> Suggested-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Al Viro <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Martin Wilck <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09gen_init_cpio: fix short read file handlingDavid Disseldorp1-19/+25
When processing a "file" entry, gen_init_cpio attempts to allocate a buffer large enough to stage the entire contents of the source file. It then attempts to fill the buffer via a single read() call and subsequently writes out the entire buffer length, without checking that read() returned the full length, potentially writing uninitialized buffer memory. Fix this by breaking up file I/O into 64k chunks and only writing the length returned by the prior read() call. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Disseldorp <[email protected]> Reviewed-by: Martin Wilck <[email protected]> Cc: Al Viro <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09initramfs: add INITRAMFS_PRESERVE_MTIME Kconfig optionDavid Disseldorp2-12/+26
initramfs cpio mtime preservation, as implemented in commit 889d51a10712 ("initramfs: add option to preserve mtime from initramfs cpio images"), uses a linked list to defer directory mtime processing until after all other items in the cpio archive have been processed. This is done to ensure that parent directory mtimes aren't overwritten via subsequent child creation. The lkml link below indicates that the mtime retention use case was for embedded devices with applications running exclusively out of initramfs, where the 32-bit mtime value provided a rough file version identifier. Linux distributions which discard an extracted initramfs immediately after the root filesystem has been mounted may want to avoid the unnecessary overhead. This change adds a new INITRAMFS_PRESERVE_MTIME Kconfig option, which can be used to disable on-by-default mtime retention and in turn speed up initramfs extraction, particularly for cpio archives with large directory counts. Benchmarks with a one million directory cpio archive extracted 20 times demonstrated: mean extraction time (s) std dev INITRAMFS_PRESERVE_MTIME=y 3.808 0.006 INITRAMFS_PRESERVE_MTIME unset 3.056 0.004 The above extraction times were measured using ftrace (initcall_finish - initcall_start) values for populate_rootfs() with initramfs_async disabled. [[email protected]: rebase atop dir_entry.name flexible array member and drop separate initramfs_mtime.h header] Link: https://lkml.org/lkml/2008/9/3/424 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Disseldorp <[email protected]> Reviewed-by: Martin Wilck <[email protected]> Cc: Al Viro <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09initramfs: make dir_entry.name a flexible array memberDavid Disseldorp1-4/+6
dir_entry.name is currently allocated via a separate kstrdup(). Change it to a flexible array member and allocate it along with struct dir_entry. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Disseldorp <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Al Viro <[email protected]> Cc: Martin Wilck <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09initramfs: refactor do_header() cpio magic checksDavid Disseldorp1-5/+4
Patch series "initramfs: "crc" cpio format and INITRAMFS_PRESERVE_MTIME", v7. This patchset does some minor initramfs refactoring and allows cpio entry mtime preservation to be disabled via a new Kconfig INITRAMFS_PRESERVE_MTIME option. Patches 4/6 to 6/6 implement support for creation and extraction of "crc" cpio archives, which carry file data checksums. Basic tests for this functionality can be found at https://github.com/rapido-linux/rapido/pull/163 This patch (of 6): do_header() is called for each cpio entry and fails if the first six bytes don't match "newc" magic. The magic check includes a special case error message if POSIX.1 ASCII (cpio -H odc) magic is detected. This special case POSIX.1 check can be nested under the "newc" mismatch code path to avoid calling memcmp() twice in a non-error case. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Disseldorp <[email protected]> Reviewed-by: Martin Wilck <[email protected]> Acked-by: Christian Brauner <[email protected]> Cc: Al Viro <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09proc: fix dentry/inode overinstantiating under /proc/${pid}/netAlexey Dobriyan2-0/+6
When a process exits, /proc/${pid}, and /proc/${pid}/net dentries are flushed. However some leaf dentries like /proc/${pid}/net/arp_cache aren't. That's because respective PDEs have proc_misc_d_revalidate() hook which returns 1 and leaves dentries/inodes in the LRU. Force revalidation/lookup on everything under /proc/${pid}/net by inheriting proc_net_dentry_ops. [[email protected]: coding-style cleanups] Link: https://lkml.kernel.org/r/[email protected] Fixes: c6c75deda813 ("proc: fix lookup in /proc/net subdirectories after setns(2)") Signed-off-by: Alexey Dobriyan <[email protected]> Reported-by: hui li <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29fs: sysv: check sbi->s_firstdatazone in complete_read_superLiu Shixin1-1/+3
sbi->s_firstinodezone is initialized to 2 and sbi->s_firstdatazone is read from sbd. There's no guarantee that sbi->s_firstdatazone must bigger than sbi->s_firstinodezone. If sbi->s_firstdatazone less than 2, the filesystem can still be mounted unexpetly. At this point, sbi->s_ninodes flip to very large value and this filesystem is broken. We can observe this by executing 'df' command. When we execute, we will get an error message: "sysv_count_free_inodes: unable to read inode table" Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liu Shixin <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29kernel: make taskstats available from all net namespacesxu xin1-0/+1
If getdelays runs in a non-init network namespace, it will fail in getting delayacct stats even if it has privilege of root user, which seems to be not very reasonable. We can simply reproduce this by executing commands: unshare -n getdelays -d -p <pid> I don't think net namespace should be an obstacle to the normal execution of getdelay function. So let's make it available from all net namespaces. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: xu xin <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Yang Yang <[email protected]> Cc: "Dr. Thomas Orgis" <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Ismael Luceno <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29taskstats: version 12 with thread group and exe infoDr. Thomas Orgis7-7/+473
The task exit struct needs some crucial information to be able to provide an enhanced version of process and thread accounting. This change provides: 1. ac_tgid in additon to ac_pid 2. thread group execution walltime in ac_tgetime 3. flag AGROUP in ac_flag to indicate the last task in a thread group / process 4. device ID and inode of task's /proc/self/exe in ac_exe_dev and ac_exe_inode 5. tools/accounting/procacct as demonstrator When a task exits, taskstats are reported to userspace including the task's pid and ppid, but without the id of the thread group this task is part of. Without the tgid, the stats of single tasks cannot be correlated to each other as a thread group (process). The taskstats documentation suggests that on process exit a data set consisting of accumulated stats for the whole group is produced. But such an additional set of stats is only produced for actually multithreaded processes, not groups that had only one thread, and also those stats only contain data about delay accounting and not the more basic information about CPU and memory resource usage. Adding the AGROUP flag to be set when the last task of a group exited enables determination of process end also for single-threaded processes. My applicaton basically does enhanced process accounting with summed cputime, biggest maxrss, tasks per process. The data is not available with the traditional BSD process accounting (which is not designed to be extensible) and the taskstats interface allows more efficient on-the-fly grouping and summing of the stats, anyway, without intermediate disk writes. Furthermore, I do carry statistics on which exact program binary is used how often with associated resources, getting a picture on how important which parts of a collection of installed scientific software in different versions are, and how well they put load on the machine. This is enabled by providing information on /proc/self/exe for each task. I assume the two 64-bit fields for device ID and inode are more appropriate than the possibly large resolved path to keep the data volume down. Add the tgid to the stats to complete task identification, the flag AGROUP to mark the last task of a group, the group wallclock time, and inode-based identification of the associated executable file. Add tools/accounting/procacct.c as a simplified fork of getdelays.c to demonstrate process and thread accounting. [[email protected]: fix version number in comment] Link: https://lkml.kernel.org/r/20220405003601.7a5f6008@plasteblaster Link: https://lkml.kernel.org/r/20220331004106.64e5616b@plasteblaster Signed-off-by: Dr. Thomas Orgis <[email protected]> Reviewed-by: Ismael Luceno <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: xu xin <[email protected]> Cc: Yang Yang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29rapidio: remove unnecessary use of list iteratorJakob Koschel1-2/+2
req->map is set in the valid case and always equals 'map' if the break was hit. It therefore is unnecessary to use the list iterator variable and the use of 'map' can be replaced with req->map. This is done in preparation to limit the scope of a list iterator to the list traversal loop [1]. Link: https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jakob Koschel <[email protected]> Reviewed-by: John Hubbard <[email protected]> Cc: Matt Porter <[email protected]> Cc: Alexandre Bounine <[email protected]> Cc: Kees Cook <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: "Brian Johannesmeyer" <[email protected]> Cc: Cristiano Giuffrida <[email protected]> Cc: "Bos, H.J." <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29kexec: remove redundant assignmentsMichal Orzel1-2/+0
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends. Reported by clang-tidy [deadcode.DeadStores] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Orzel <[email protected]> Acked-by: Baoquan He <[email protected]> Cc: Eric Biederman <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Michal Orzel <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29MAINTAINERS: remove redundant file of PTRACE SUPPORT entryTiezhu Yang1-1/+0
In MAINTAINERS PTRACE SUPPORT entry, the file include/uapi/linux/ptrace.h is redundant, remove it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tiezhu Yang <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29ptrace: fix wrong comment of PT_DTRACETiezhu Yang1-1/+1
PT_DTRACE is only used on um now, fix the wrong comment. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tiezhu Yang <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29ptrace: remove redudant check of #ifdef PTRACE_SINGLESTEPTiezhu Yang1-6/+0
Patch series "ptrace: do some cleanup". This patch (of 3): PTRACE_SINGLESTEP is always defined as 9 in include/uapi/linux/ptrace.h, remove redudant check of #ifdef PTRACE_SINGLESTEP. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tiezhu Yang <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29fat: add ratelimit to fat*_ent_bread()OGAWA Hirofumi1-3/+4
fat*_ent_bread() can be the cause of too many report on I/O error path. So use fat_msg_ratelimit() instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: OGAWA Hirofumi <[email protected]> Reported-by: qianfan <[email protected]> Tested-by: qianfan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29fatfs: add FAT messages to printk indexJonathan Lassoff2-5/+18
In order for end users to quickly react to new issues that come up in production, it is proving useful to leverage the printk indexing system. This printk index enables kernel developers to use calls to printk() with changeable ad-hoc format strings (as they always have; no change of expectations), while enabling end users to examine format strings to detect changes. Since end users are using regular expressions to match messages printed through printk(), being able to detect changes in chosen format strings from release to release provides a useful signal to review printk()-matching regular expressions for any necessary updates. So that detailed FAT messages are captured by this printk index, this patch wraps fat_msg with a macro. [[email protected]: coding-style cleanups] Link: https://lkml.kernel.org/r/8aaa2dd7995e820292bb40d2120ab69756662c65.1648688136.git.jof@thejof.com Signed-off-by: Jonathan Lassoff <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Tested-by: Petr Mladek <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29fatfs: remove redundant judgmentYubo Feng1-4/+2
iput() has already judged the incoming parameter, so there is no need to repeat outside. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yubo Feng <[email protected]> Reported-by: Hulk Robot <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29init/Kconfig: remove USELIB syscall by defaultKees Cook1-2/+2
The uselib syscall has been long deprecated. There's no need to keep this enabled by default under X86_32. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Cc: Masahiro Yamada <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29list: fix a data-race around ep->rdllistKuniyuki Iwashima1-3/+3
ep_poll() first calls ep_events_available() with no lock held and checks if ep->rdllist is empty by list_empty_careful(), which reads rdllist->prev. Thus all accesses to it need some protection to avoid store/load-tearing. Note INIT_LIST_HEAD_RCU() already has the annotation for both prev and next. Commit bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.") added the first lockless ep_events_available(), and commit c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()") made some ep_events_available() calls lockless and added single call under a lock, finally commit e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout") made the last ep_events_available() lockless. BUG: KCSAN: data-race in do_epoll_wait / do_epoll_wait write to 0xffff88810480c7d8 of 8 bytes by task 1802 on cpu 0: INIT_LIST_HEAD include/linux/list.h:38 [inline] list_splice_init include/linux/list.h:492 [inline] ep_start_scan fs/eventpoll.c:622 [inline] ep_send_events fs/eventpoll.c:1656 [inline] ep_poll fs/eventpoll.c:1806 [inline] do_epoll_wait+0x4eb/0xf40 fs/eventpoll.c:2234 do_epoll_pwait fs/eventpoll.c:2268 [inline] __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline] __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275 __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88810480c7d8 of 8 bytes by task 1799 on cpu 1: list_empty_careful include/linux/list.h:329 [inline] ep_events_available fs/eventpoll.c:381 [inline] ep_poll fs/eventpoll.c:1797 [inline] do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234 do_epoll_pwait fs/eventpoll.c:2268 [inline] __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline] __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275 __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffff88810480c7d0 -> 0xffff888103c15098 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 1799 Comm: syz-fuzzer Tainted: G W 5.17.0-rc7-syzkaller-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Link: https://lkml.kernel.org/r/[email protected] Fixes: e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout") Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()") Fixes: bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.") Signed-off-by: Kuniyuki Iwashima <[email protected]> Reported-by: [email protected] Cc: Al Viro <[email protected]>, Andrew Morton <[email protected]> Cc: Kuniyuki Iwashima <[email protected]> Cc: Kuniyuki Iwashima <[email protected]> Cc: "Soheil Hassas Yeganeh" <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: "Sridhar Samudrala" <[email protected]> Cc: Alexander Duyck <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29pipe: make poll_usage boolean and annotate its accessKuniyuki Iwashima2-2/+2
Patch series "Fix data-races around epoll reported by KCSAN." This series suppresses a false positive KCSAN's message and fixes a real data-race. This patch (of 2): pipe_poll() runs locklessly and assigns 1 to poll_usage. Once poll_usage is set to 1, it never changes in other places. However, concurrent writes of a value trigger KCSAN, so let's make KCSAN happy. BUG: KCSAN: data-race in pipe_poll / pipe_poll write to 0xffff8880042f6678 of 4 bytes by task 174 on cpu 3: pipe_poll (fs/pipe.c:656) ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853) do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234) __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) write to 0xffff8880042f6678 of 4 bytes by task 177 on cpu 1: pipe_poll (fs/pipe.c:656) ep_item_poll.isra.0 (./include/linux/poll.h:88 fs/eventpoll.c:853) do_epoll_wait (fs/eventpoll.c:1692 fs/eventpoll.c:1806 fs/eventpoll.c:2234) __x64_sys_epoll_wait (fs/eventpoll.c:2246 fs/eventpoll.c:2241 fs/eventpoll.c:2241) do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 177 Comm: epoll_race Not tainted 5.17.0-58927-gf443e374ae13 #6 Hardware name: Red Hat KVM, BIOS 1.11.0-2.amzn2 04/01/2014 Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads") Signed-off-by: Kuniyuki Iwashima <[email protected]> Cc: Alexander Duyck <[email protected]> Cc: Al Viro <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Kuniyuki Iwashima <[email protected]> Cc: "Soheil Hassas Yeganeh" <[email protected]> Cc: "Sridhar Samudrala" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29lib: remove back_str initializationTom Rix1-1/+1
Clang static analysis reports this false positive glob.c:48:32: warning: Assigned value is garbage or undefined char const *back_pat = NULL, *back_str = back_str; ^~~~~~~~ ~~~~~~~~ back_str is set after back_pat and it's use is protected by the !back_pat check. It is not necessary to initialize back_str, so remove the initialization. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tom Rix <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Cc: Nathan Chancellor <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29lib/string.c: simplify str[c]spnRasmus Villemoes1-19/+6
Use strchr(), which makes them a lot shorter, and more obviously symmetric in their treatment of accept/reject. It also saves a little bit of .text; bloat-o-meter for an arm build says Function old new delta strcspn 92 76 -16 strspn 108 76 -32 While here, also remove a stray empty line before EXPORT_SYMBOL(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rasmus Villemoes <[email protected]> Cc: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29lib/test_string.c: add strspn and strcspn testsRasmus Villemoes1-0/+33
Before refactoring strspn() and strcspn(), add some simple test cases. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rasmus Villemoes <[email protected]> Cc: Andy Shevchenko <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29lib/Kconfig.debug: remove more CONFIG_..._VALUE indirectionsRasmus Villemoes3-24/+3
As in "kernel/panic.c: remove CONFIG_PANIC_ON_OOPS_VALUE indirection", use the IS_ENABLED() helper rather than having a hidden config option. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rasmus Villemoes <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29lib/test_meminit: optimize do_kmem_cache_rcu_persistent() testXiaoke Wang1-2/+10
To make the test more robust, there are the following changes: 1. add a check for the return value of kmem_cache_alloc(). 2. properly release the object `buf` on several error paths. 3. release the objects of `used_objects` if we never hit `saved_ptr`. 4. destroy the created cache by default. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Xiaoke Wang <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Marco Elver <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Xiaoke Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29get_maintainer: Honor mailmap for in file emailsRob Herring1-0/+1
Add support to also use the mailmap for 'in file' email addresses. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]> Reported-by: Marc Zyngier <[email protected]> Acked-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29kernel: pid_namespace: use NULL instead of using plain integer as pointerHaowen Bai1-1/+1
This fixes the following sparse warnings: kernel/pid_namespace.c:55:77: warning: Using plain integer as NULL pointer Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Haowen Bai <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29net: unexport csum_and_copy_{from,to}_userChristoph Hellwig4-7/+0
csum_and_copy_from_user and csum_and_copy_to_user are exported by a few architectures, but not actually used in modular code. Drop the exports. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Acked-by: Michael Ellerman <[email protected]> (powerpc) Cc: David Miller <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29vmcore: convert read_from_oldmem() to take an iov_iterMatthew Wilcox (Oracle)3-32/+25
Remove the read_from_oldmem() wrapper introduced earlier and convert all the remaining callers to pass an iov_iter. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Baoquan He <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Amit Daniel Kachhap <[email protected]> Cc: Al Viro <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29vmcore: convert __read_vmcore to use an iov_iterMatthew Wilcox (Oracle)1-52/+30
This gets rid of copy_to() and let us use proc_read_iter() instead of proc_read(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Baoquan He <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-29vmcore: convert copy_oldmem_page() to take an iov_iterMatthew Wilcox (Oracle)12-260/+91
Patch series "Convert vmcore to use an iov_iter", v5. For some reason several people have been sending bad patches to fix compiler warnings in vmcore recently. Here's how it should be done. Compile-tested only on x86. As noted in the first patch, s390 should take this conversion a bit further, but I'm not inclined to do that work myself. This patch (of 3): Instead of passing in a 'buf' and 'userbuf' argument, pass in an iov_iter. s390 needs more work to pass the iov_iter down further, or refactor, but I'd be more comfortable if someone who can test on s390 did that work. It's more convenient to convert the whole of read_from_oldmem() to take an iov_iter at the same time, so rename it to read_from_oldmem_iter() and add a temporary read_from_oldmem() wrapper that creates an iov_iter. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Baoquan He <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Heiko Carstens <[email protected]> Signed-off-by: Andrew Morton <[email protected]>