aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2021-06-29dax: fix ENOMEM handling in grab_mapping_entry()Jan Kara1-1/+2
grab_mapping_entry() has a bug in handling of ENOMEM condition. Suppose we have a PMD entry at index i which we are downgrading to a PTE entry. grab_mapping_entry() will set pmd_downgrade to true, lock the entry, clear the entry in xarray, and decrement mapping->nrpages. The it will call: entry = dax_make_entry(pfn_to_pfn_t(0), flags); dax_lock_entry(xas, entry); which inserts new PTE entry into xarray. However this may fail allocating the new node. We handle this by: if (xas_nomem(xas, mapping_gfp_mask(mapping) & ~__GFP_HIGHMEM)) goto retry; however pmd_downgrade stays set to true even though 'entry' returned from get_unlocked_entry() will be NULL now. And we will go again through the downgrade branch. This is mostly harmless except that mapping->nrpages is decremented again and we temporarily have an invalid entry stored in xarray. Fix the problem by setting pmd_downgrade to false each time we lookup the entry we work with so that it matches the entry we found. Link: https://lkml.kernel.org/r/[email protected] Fixes: b15cd800682f ("dax: Convert page fault handlers to XArray") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Dan Williams <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: "Aneesh Kumar K.V" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ocfs2: remove redundant initialization of variable retColin Ian King1-1/+1
The variable ret is being initialized with a value that is never read, the assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ocfs2: replace simple_strtoull() with kstrtoull()Chen Huang1-2/+3
simple_strtoull() is deprecated in some situation since it does not check for the range overflow, use kstrtoull() instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chen Huang <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ocfs2: remove repeated uptodate check for bufferWan Jiabing1-2/+1
In commit 60f91826ca62 ("buffer: Avoid setting buffer bits that are already set"), function set_buffer_##name was added a test_bit() to check buffer, which is the same as function buffer_##name. The !buffer_uptodate(bh) here is a repeated check. Remove it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wan Jiabing <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ocfs2: remove redundant assignment to pointer queueColin Ian King1-1/+1
The pointer queue is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ocfs2: fix snprintf() checkingDan Carpenter2-11/+3
The snprintf() function returns the number of bytes which would have been printed if the buffer was large enough. In other words it can return ">= remain" but this code assumes it returns "== remain". The run time impact of this bug is not very severe. The next iteration through the loop would trigger a WARN() when we pass a negative limit to snprintf(). We would then return success instead of -E2BIG. The kernel implementation of snprintf() will never return negatives so there is no need to check and I have deleted that dead code. Link: https://lkml.kernel.org/r/20210511135350.GV1955@kadam Fixes: a860f6eb4c6a ("ocfs2: sysfile interfaces for online file check") Fixes: 74ae4e104dfc ("ocfs2: Create stack glue sysfs files.") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ocfs2: remove unnecessary INIT_LIST_HEAD()Yang Yingliang1-2/+0
The list_head o2hb_node_events is initialized statically. It is unnecessary to initialize by INIT_LIST_HEAD(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Yingliang <[email protected]> Reported-by: Hulk Robot <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Joseph Qi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29squashfs: add option to panic on errorsVincent Whitchurch3-1/+91
Add an errors=panic mount option to make squashfs trigger a panic when errors are encountered, similar to several other filesystems. This allows a kernel dump to be saved using which the corruption can be analysed and debugged. Inspired by a pre-fs_context patch by Anton Eliasson. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vincent Whitchurch <[email protected]> Signed-off-by: Phillip Lougher <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29ntfs: fix validity check for file name attributeDesmond Cheong Zhi Xi1-1/+1
When checking the file name attribute, we want to ensure that it fits within the bounds of ATTR_RECORD. To do this, we should check that (attr record + file name offset + file name length) < (attr record + attr record length). However, the original check did not include the file name offset in the calculation. This means that corrupted on-disk metadata might not caught by the incorrect file name check, and lead to an invalid memory access. An example can be seen in the crash report of a memory corruption error found by Syzbot: https://syzkaller.appspot.com/bug?id=a1a1e379b225812688566745c3e2f7242bffc246 Adding the file name offset to the validity check fixes this error and passes the Syzbot reproducer test. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Desmond Cheong Zhi Xi <[email protected]> Reported-by: [email protected] Tested-by: [email protected] Acked-by: Anton Altaparmakov <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-29Merge branch 'leases-devel'Trond Myklebust6-23/+125
Signed-off-by: Trond Myklebust <[email protected]>
2021-06-29NFSv4: setlease should return EAGAIN if locks are not availableTrond Myklebust1-2/+2
Instead of returning ENOLCK when we can't hand out a lease, we should be returning EAGAIN. Signed-off-by: Trond Myklebust <[email protected]>
2021-06-29NFS: nfs_find_open_context() may only select open filesTrond Myklebust1-0/+4
If a file has already been closed, then it should not be selected to support further I/O. Signed-off-by: Trond Myklebust <[email protected]> [Trond: Fix an invalid pointer deref reported by Colin Ian King]
2021-06-29gfs2: Clean up gfs2_unstuff_dinodeAndreas Gruenbacher5-36/+36
Split __gfs2_unstuff_inode off from gfs2_unstuff_dinode and clean up the code a little. All remaining callers now pass NULL as the page argument of gfs2_unstuff_dinode, so remove that argument. Signed-off-by: Andreas Gruenbacher <[email protected]>
2021-06-29gfs2: Unstuff before locking page in gfs2_page_mkwriteAndreas Gruenbacher1-10/+12
In gfs2_page_mkwrite, unstuff inodes before locking the page. That way, we won't have to pass in the locked page to gfs2_unstuff_inode, and gfs2_unstuff_inode can look up and lock the page itself. Signed-off-by: Andreas Gruenbacher <[email protected]>
2021-06-29gfs2: Clean up the error handling in gfs2_page_mkwriteAndreas Gruenbacher1-23/+40
We're setting an error number so that block_page_mkwrite_return translates it into the corresponding VM_FAULT_* code in several places, but this is getting confusing, so set the VM_FAULT_* codes directly instead. (No change in functionality.) Signed-off-by: Andreas Gruenbacher <[email protected]>
2021-06-28Merge branch 'for-linus' of ↵Linus Torvalds3-10/+14
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace rlimit handling update from Eric Biederman: "This is the work mainly by Alexey Gladkov to limit rlimits to the rlimits of the user that created a user namespace, and to allow users to have stricter limits on the resources created within a user namespace." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: cred: add missing return error code when set_cred_ucounts() failed ucounts: Silence warning in dec_rlimit_ucounts ucounts: Set ucount_max to the largest positive value the type can hold kselftests: Add test to check for rlimit changes in different user namespaces Reimplement RLIMIT_MEMLOCK on top of ucounts Reimplement RLIMIT_SIGPENDING on top of ucounts Reimplement RLIMIT_MSGQUEUE on top of ucounts Reimplement RLIMIT_NPROC on top of ucounts Use atomic_t for ucounts reference counting Add a reference to ucounts for each cred Increase size of ucounts to atomic_long_t
2021-06-28Merge tag 'fallthrough-fixes-clang-5.14-rc1' of ↵Linus Torvalds18-20/+23
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo Silva: "Fix many fall-through warnings when building with Clang 12.0.0 and '-Wimplicit-fallthrough' so that we at some point will be able to enable that warning by default" * tag 'fallthrough-fixes-clang-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (26 commits) rxrpc: Fix fall-through warnings for Clang drm/nouveau/clk: Fix fall-through warnings for Clang drm/nouveau/therm: Fix fall-through warnings for Clang drm/nouveau: Fix fall-through warnings for Clang xfs: Fix fall-through warnings for Clang xfrm: Fix fall-through warnings for Clang tipc: Fix fall-through warnings for Clang sctp: Fix fall-through warnings for Clang rds: Fix fall-through warnings for Clang net/packet: Fix fall-through warnings for Clang net: netrom: Fix fall-through warnings for Clang ide: Fix fall-through warnings for Clang hwmon: (max6621) Fix fall-through warnings for Clang hwmon: (corsair-cpro) Fix fall-through warnings for Clang firewire: core: Fix fall-through warnings for Clang braille_console: Fix fall-through warnings for Clang ipv4: Fix fall-through warnings for Clang qlcnic: Fix fall-through warnings for Clang bnxt_en: Fix fall-through warnings for Clang netxen_nic: Fix fall-through warnings for Clang ...
2021-06-28Merge tag 'pstore-v5.14-rc1' of ↵Linus Torvalds1-247/+156
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore updates from Kees Cook: "Use normal block device I/O path for pstore/blk. (Christoph Hellwig, Kees Cook, Pu Lehui)" * tag 'pstore-v5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/blk: Include zone in pstore_device_info pstore/blk: Fix kerndoc and redundancy on blkdev param pstore/blk: Use the normal block device I/O path pstore/blk: Move verify_size() macro out of function pstore/blk: Improve failure reporting
2021-06-28Merge tag 'for-5.14-tag' of ↵Linus Torvalds44-1413/+2028
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "A normal mix of improvements, core changes and features that user have been missing or complaining about. User visible changes: - new sysfs exports: - add sysfs knob to limit scrub IO bandwidth per device - device stats are also available in /sys/fs/btrfs/FSID/devinfo/DEVID/error_stats - support cancellable resize and device delete ioctls - change how the empty value is interpreted when setting a property, so far we have only 'btrfs.compression' and we need to distinguish a reset to defaults and setting "do not compress", in general the empty value will always mean 'reset to defaults' for any other property, for compression it's either 'no' or 'none' to forbid compression Performance improvements: - no need for full sync when truncation does not touch extents, reported run time change is -12% - avoid unnecessary logging of xattrs during fast fsyncs (+17% throughput, -17% runtime on xattr stress workload) Core: - preemptive flushing improvements and fixes - adjust clamping logic on multi-threaded workloads to avoid flushing too soon - take into account global block reserve, may help on almost full filesystems - continue flushing when there are enough pending delalloc and ordered bytes - simplify logic around conditional transaction commit, a workaround used in the past for throttling that's been superseded by ticket reservations that manage the throttling in a better way - subpage blocksize preparation: - submit read time repair only for each corrupted sector - scrub repair now works with sectors and not pages - free space cache (v1) works with sectors and not pages - more fine grained bio tracking for extents - subpage support in page callbacks, extent callbacks, end io callbacks - simplify transaction abort logic and always abort and don't check various potentially unreliable stats tracked by the transaction - exclusive operations can do more checks when started and allow eg. cancellation of the same running operation - ensure relocation never runs while we have send operations running, e.g. when zoned background auto reclaim starts Fixes: - zoned: more sanity checks of write pointer - improve error handling in delayed inodes - send: - fix invalid path for unlink operations after parent orphanization - fix crash when memory allocations trigger reclaim - skip compression of we have only one page (can't make things better) - empty value of a property newly means reset to default Other: - lots of cleanups, comment updates, yearly typo fixing - disable build on platforms having page size 256K" * tag 'for-5.14-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (101 commits) btrfs: remove unused btrfs_fs_info::total_pinned btrfs: rip out btrfs_space_info::total_bytes_pinned btrfs: rip the first_ticket_bytes logic from fail_all_tickets btrfs: remove FLUSH_DELAYED_REFS from data ENOSPC flushing btrfs: rip out may_commit_transaction btrfs: send: fix crash when memory allocations trigger reclaim btrfs: ensure relocation never runs while we have send operations running btrfs: shorten integrity checker extent data mount option btrfs: switch mount option bits to enums and use wider type btrfs: props: change how empty value is interpreted btrfs: compression: don't try to compress if we don't have enough pages btrfs: fix unbalanced unlock in qgroup_account_snapshot() btrfs: sysfs: export dev stats in devinfo directory btrfs: fix typos in comments btrfs: remove a stale comment for btrfs_decompress_bio() btrfs: send: use list_move_tail instead of list_del/list_add_tail btrfs: disable build on platforms having page size 256K btrfs: send: fix invalid path for unlink operations after parent orphanization btrfs: inline wait_current_trans_commit_start in its caller btrfs: sink wait_for_unblock parameter to async commit ...
2021-06-28Merge tag 'erofs-for-5.14-rc1' of ↵Linus Torvalds18-41/+3
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "No noticable change available for this cycle. Just a bugfix related to sb chksum feature, two minor cleanups and Chao's email address update: - fix wrong error code overwritten due to sb checksum feature - two minor cleanups - update Chao's email address" * tag 'erofs-for-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: MAINTAINERS: erofs: update my email address erofs: clean up file headers & footers erofs: remove the occupied parameter from z_erofs_pagevec_enqueue() erofs: fix error return code in erofs_read_superblock()
2021-06-28Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds2-15/+35
Pull fscrypt updates from Eric Biggers: "A couple bug fixes for fs/crypto/: - Fix handling of major dirhash values that happen to be 0. - Fix cases where keys were derived differently on big endian systems than on little endian systems (affecting some newer features only)" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: fix derivation of SipHash keys on big endian CPUs fscrypt: don't ignore minor_hash when hash is 0
2021-06-28once: implement DO_ONCE_LITE for non-fast-path "do once" functionalityTanner Love1-10/+3
Certain uses of "do once" functionality reside outside of fast path, and so do not require jump label patching via static keys, making existing DO_ONCE undesirable in such cases. Replace uses of __section(".data.once") with DO_ONCE_LITE(_IF)? This patch changes the return values of xfs_printk_once, printk_once, and printk_deferred_once. Before, they returned whether the print was performed, but now, they always return true. This is okay because the return values of the following macros are entirely ignored throughout the kernel: - xfs_printk_once - xfs_warn_once - xfs_notice_once - xfs_info_once - printk_once - pr_emerg_once - pr_alert_once - pr_crit_once - pr_err_once - pr_warn_once - pr_notice_once - pr_info_once - pr_devel_once - pr_debug_once - printk_deferred_once - orc_warn Changes v3: - Expand commit message to explain why changing return values of xfs_printk_once, printk_once, printk_deferred_once is benign v2: - Fix i386 build warnings Signed-off-by: Tanner Love <[email protected]> Acked-by: Eric Dumazet <[email protected]> Acked-by: Mahesh Bandewar <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-06-29ceph: take reference to req->r_parent at point of assignmentJeff Layton4-1/+11
Currently, we set the r_parent pointer but then don't take a reference to it until we submit the request. If we end up freeing the req before that point, then we'll do a iput when we shouldn't. Instead, take the inode reference in the callers, so that it's always safe to call ceph_mdsc_put_request on the req, even before submission. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: eliminate ceph_async_iput()Jeff Layton6-69/+25
Now that we don't need to hold session->s_mutex or the snap_rwsem when calling ceph_check_caps, we can eliminate ceph_async_iput and just use normal iput calls. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: don't take s_mutex in ceph_flush_snapsJeff Layton2-14/+4
Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: don't take s_mutex in try_flush_capsJeff Layton1-14/+2
The s_mutex doesn't protect anything in this codepath. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: don't take s_mutex or snap_rwsem in ceph_check_capsJeff Layton1-61/+11
These locks appear to be completely unnecessary. Almost all of this function is done under the inode->i_ceph_lock, aside from the actual sending of the message. Don't take either lock in this function. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: eliminate session->s_gen_ttl_lockJeff Layton5-29/+17
Turn s_cap_gen field into an atomic_t, and just rely on the fact that we hold the s_mutex when changing the s_cap_ttl field. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: allow ceph_put_mds_session to take NULL or ERR_PTRJeff Layton4-10/+8
...to simplify some error paths. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Luis Henriques <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: clean up locking annotation for ceph_get_snap_realm and ↵Jeff Layton1-4/+4
__lookup_snap_realm They both say that the snap_rwsem must be held for write, but I don't see any real reason for it, and it's not currently always called that way. The lookup is just walking the rbtree, so holding it for read should be fine there. The "get" is bumping the refcount and (possibly) removing it from the empty list. I see no need to hold the snap_rwsem for write for that. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: add some lockdep assertions around snaprealm handlingJeff Layton1-0/+16
Turn some comments into lockdep asserts. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: decoding error in ceph_update_snap_realm should return -EIOJeff Layton1-1/+1
Currently ceph_update_snap_realm returns -EINVAL when it hits a decoding error, which is the wrong error code. -EINVAL implies that the user gave us a bogus argument to a syscall or something similar. -EIO is more descriptive when we hit a decoding error. Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: add IO size metrics supportXiubo Li5-28/+119
This will collect IO's total size and then calculate the average size, and also will collect the min/max IO sizes. The debugfs will show the size metrics in bytes and will let the userspace applications to switch to what they need. URL: https://tracker.ceph.com/issues/49913 Signed-off-by: Xiubo Li <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: update and rename __update_latency helper to __update_stdevXiubo Li1-21/+35
The new __update_stdev() helper will only compute the standard deviation. Signed-off-by: Xiubo Li <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-29ceph: simplify the metrics structXiubo Li2-78/+46
Signed-off-by: Xiubo Li <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-28ceph: make ceph_queue_cap_snap staticJeff Layton2-2/+1
Signed-off-by: Jeff Layton <[email protected]> Reviewed-by: Xiubo Li <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-28ceph: make ceph_netfs_read_ops staticWei Yongjun1-1/+1
The sparse tool complains as follows: fs/ceph/addr.c:316:37: warning: symbol 'ceph_netfs_read_ops' was not declared. Should it be static? This symbol is not used outside of addr.c, so mark it static. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-28ceph: remove bogus checks and WARN_ONs from ceph_set_page_dirtyJeff Layton1-9/+1
The checks for page->mapping are odd, as set_page_dirty is an address_space operation, and I don't see where it would be called on a non-pagecache page. The warning about the page lock also seems bogus. The comment over set_page_dirty() says that it can be called without the page lock in some rare cases. I don't think we want to warn if that's the case. Reported-by: Matthew Wilcox <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2021-06-28Merge tag 'sched-core-2021-06-28' of ↵Linus Torvalds5-9/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler udpates from Ingo Molnar: - Changes to core scheduling facilities: - Add "Core Scheduling" via CONFIG_SCHED_CORE=y, which enables coordinated scheduling across SMT siblings. This is a much requested feature for cloud computing platforms, to allow the flexible utilization of SMT siblings, without exposing untrusted domains to information leaks & side channels, plus to ensure more deterministic computing performance on SMT systems used by heterogenous workloads. There are new prctls to set core scheduling groups, which allows more flexible management of workloads that can share siblings. - Fix task->state access anti-patterns that may result in missed wakeups and rename it to ->__state in the process to catch new abuses. - Load-balancing changes: - Tweak newidle_balance for fair-sched, to improve 'memcache'-like workloads. - "Age" (decay) average idle time, to better track & improve workloads such as 'tbench'. - Fix & improve energy-aware (EAS) balancing logic & metrics. - Fix & improve the uclamp metrics. - Fix task migration (taskset) corner case on !CONFIG_CPUSET. - Fix RT and deadline utilization tracking across policy changes - Introduce a "burstable" CFS controller via cgroups, which allows bursty CPU-bound workloads to borrow a bit against their future quota to improve overall latencies & batching. Can be tweaked via /sys/fs/cgroup/cpu/<X>/cpu.cfs_burst_us. - Rework assymetric topology/capacity detection & handling. - Scheduler statistics & tooling: - Disable delayacct by default, but add a sysctl to enable it at runtime if tooling needs it. Use static keys and other optimizations to make it more palatable. - Use sched_clock() in delayacct, instead of ktime_get_ns(). - Misc cleanups and fixes. * tag 'sched-core-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/doc: Update the CPU capacity asymmetry bits sched/topology: Rework CPU capacity asymmetry detection sched/core: Introduce SD_ASYM_CPUCAPACITY_FULL sched_domain flag psi: Fix race between psi_trigger_create/destroy sched/fair: Introduce the burstable CFS controller sched/uclamp: Fix uclamp_tg_restrict() sched/rt: Fix Deadline utilization tracking during policy change sched/rt: Fix RT utilization tracking during policy change sched: Change task_struct::state sched,arch: Remove unused TASK_STATE offsets sched,timer: Use __set_current_state() sched: Add get_current_state() sched,perf,kvm: Fix preemption condition sched: Introduce task_is_running() sched: Unbreak wakeups sched/fair: Age the average idle time sched/cpufreq: Consider reduced CPU capacity in energy calculation sched/fair: Take thermal pressure into account while estimating energy thermal/cpufreq_cooling: Update offline CPUs per-cpu thermal_pressure sched/fair: Return early from update_tg_cfs_load() if delta == 0 ...
2021-06-28writeback: fix obtain a reference to a freeing memcg cssMuchun Song1-2/+7
The caller of wb_get_create() should pin the memcg, because wb_get_create() relies on this guarantee. The rcu read lock only can guarantee that the memcg css returned by css_from_id() cannot be released, but the reference of the memcg can be zero. rcu_read_lock() memcg_css = css_from_id() wb_get_create(memcg_css) cgwb_create(memcg_css) // css_get can change the ref counter from 0 back to 1 css_get(memcg_css) rcu_read_unlock() Fix it by holding a reference to the css before calling wb_get_create(). This is not a problem I encountered in the real world. Just the result of a code review. Fixes: 682aa8e1a6a1 ("writeback: implement unlocked_inode_to_wb transaction and use it for stat updates") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Muchun Song <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2021-06-28f2fs: remove false alarm on iget failure during GCJaegeuk Kim1-3/+1
This patch removes setting SBI_NEED_FSCK when GC gets an error on f2fs_iget, since f2fs_iget can give ENOMEM and others by race condition. If we set this critical fsck flag, we'll get EIO during fsync via the below code path. In f2fs_inplace_write_data(), if (is_sbi_flag_set(sbi, SBI_NEED_FSCK) || f2fs_cp_error(sbi)) { err = -EIO; goto drop_bio; } Fixes: 9557727876674 ("f2fs: drop inplace IO if fs status is abnormal") Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-06-28f2fs: enable extent cache for compression files in read-onlyDaeho Jeong1-19/+20
Let's allow extent cache for RO partition. Signed-off-by: Daeho Jeong <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2021-06-28NFS: Remove unnecessary inode parameter from nfs_pageio_complete_read()Dave Wysochanski1-6/+5
Simplify nfs_pageio_complete_read() by using the inode pointer saved inside nfs_pageio_descriptor. Signed-off-by: Dave Wysochanski <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2021-06-28nfs: update has_sec_mnt_opts after cloning lsm options from parentScott Mayhew1-4/+8
After calling security_sb_clone_mnt_opts() in nfs_get_root(), it's necessary to copy the value of has_sec_mnt_opts from the cloned super_block's nfs_server. Otherwise, calls to nfs_compare_super() using this super_block may not return the correct result, leading to mount failures. For example, mounting an nfs server with the following in /etc/exports: /export *(rw,insecure,crossmnt,no_root_squash,security_label) and having /export/scratch on a separate block device. mount -o v4.2,context=system_u:object_r:root_t:s0 server:/export/test /mnt/test mount -o v4.2,context=system_u:object_r:swapfile_t:s0 server:/export/scratch /mnt/scratch The second mount would fail with "mount.nfs: /mnt/scratch is busy or already mounted or sharecache fail" and "SELinux: mount invalid. Same superblock, different security settings for..." would appear in the syslog. Also while we're in there, replace several instances of "NFS_SB(s)" with "server", which was already declared at the top of the nfs_get_root(). Fixes: ec1ade6a0448 ("nfs: account for selinux security context when deciding to share superblock") Signed-off-by: Scott Mayhew <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2021-06-28exfat: avoid incorrectly releasing for root inodeChen Li1-1/+1
In d_make_root, when we fail to allocate dentry for root inode, we will iput root inode and returned value is NULL in this function. So we do not need to release this inode again at d_make_root's caller. Signed-off-by: Chen Li <[email protected]> Signed-off-by: Namjae Jeon <[email protected]>
2021-06-28orangefs: fix orangefs df output.Mike Marshall1-1/+1
Orangefs df output is whacky. Walt Ligon suggested this might fix it. It seems way more in line with reality now... Signed-off-by: Mike Marshall <[email protected]>
2021-06-28orangefs: readahead adjustmentMike Marshall1-4/+3
Matthew Wilcox suggested that perhaps it could be "possible for rac->file to be NULL if the caller doesn't have a struct file"... Signed-off-by: Mike Marshall <[email protected]>
2021-06-28gfs2: Fix error handling in init_statfsAndreas Gruenbacher1-0/+1
On an error path, init_statfs calls iput(pn) after pn has already been put. Fix that by setting pn to NULL after the initial iput. Fixes: 97fd734ba17e ("gfs2: lookup local statfs inodes prior to journal recovery") Cc: [email protected] # v5.10+ Reported-by: Jing Xiangfeng <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]>
2021-06-28gfs2: Fix underflow in gfs2_page_mkwriteAndreas Gruenbacher1-2/+2
On filesystems with a block size smaller than PAGE_SIZE and non-empty files smaller then PAGE_SIZE, gfs2_page_mkwrite could end up allocating excess blocks beyond the end of the file, similar to fallocate. This doesn't make sense; fix it. Reported-by: Bob Peterson <[email protected]> Fixes: 184b4e60853d ("gfs2: Fix end-of-file handling in gfs2_page_mkwrite") Cc: [email protected] # v5.5+ Signed-off-by: Andreas Gruenbacher <[email protected]>
2021-06-28gfs2: Use list_move_tail instead of list_del/list_add_tailBaokun Li1-2/+1
Using list_move_tail() instead of list_del() + list_add_tail(). Reported-by: Hulk Robot <[email protected]> Signed-off-by: Baokun Li <[email protected]> Signed-off-by: Andreas Gruenbacher <[email protected]>