aboutsummaryrefslogtreecommitdiff
path: root/fs/xfs
AgeCommit message (Collapse)AuthorFilesLines
2021-01-22xfs: lift writable fs check up into log worker taskBrian Foster1-10/+8
The log covering helper checks whether the filesystem is writable to determine whether to cover the log. The helper is currently only called from the background log worker. In preparation to reuse the helper from freezing contexts, lift the check into xfs_log_worker(). Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: sync lazy sb accounting on quiesce of read-only mountsBrian Foster3-10/+22
xfs_log_sbcount() syncs the superblock specifically to accumulate the in-core percpu superblock counters and commit them to disk. This is required to maintain filesystem consistency across quiesce (freeze, read-only mount/remount) or unmount when lazy superblock accounting is enabled because individual transactions do not update the superblock directly. This mechanism works as expected for writable mounts, but xfs_log_sbcount() skips the update for read-only mounts. Read-only mounts otherwise still allow log recovery and write out an unmount record during log quiesce. If a read-only mount performs log recovery, it can modify the in-core superblock counters and write an unmount record when the filesystem unmounts without ever syncing the in-core counters. This leaves the filesystem with a clean log but in an inconsistent state with regard to lazy sb counters. Update xfs_log_sbcount() to use the same logic xfs_log_unmount_write() uses to determine when to write an unmount record. This ensures that lazy accounting is always synced before the log is cleaned. Refactor this logic into a new helper to distinguish between a writable filesystem and a writable log. Specifically, the log is writable unless the filesystem is mounted with the norecovery mount option, the underlying log device is read-only, or the filesystem is shutdown. Drop the freeze state check because the update is already allowed during the freezing process and no context calls this function on an already frozen fs. Also, retain the shutdown check in xfs_log_unmount_write() to catch the case where the preceding log force might have triggered a shutdown. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Gao Xiang <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Bill O'Donnell <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: set inode size after creating symlinkJeffrey Mitchell1-0/+1
When XFS creates a new symlink, it writes its size to disk but not to the VFS inode. This causes i_size_read() to return 0 for that symlink until it is re-read from disk, for example when the system is rebooted. I found this inconsistency while protecting directories with eCryptFS. The command "stat path/to/symlink/in/ecryptfs" will report "Size: 0" if the symlink was created after the last reboot on an XFS root. Call i_size_write() in xfs_symlink() Signed-off-by: Jeffrey Mitchell <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-01-22xfs: don't drain buffer lru on freeze and read-only remountBrian Foster3-7/+20
xfs_buftarg_drain() is called from xfs_log_quiesce() to ensure the buffer cache is reclaimed during unmount. xfs_log_quiesce() is also called from xfs_quiesce_attr(), however, which means that cache state is completely drained for filesystem freeze and read-only remount. While technically harmless, this is unnecessarily heavyweight. Both freeze and read-only mounts allow reads and thus allow population of the buffer cache. Therefore, the transitional sequence in either case really only needs to quiesce outstanding writes to return the filesystem in a generally read-only state. Additionally, some users have reported that attempts to freeze a filesystem concurrent with a read-heavy workload causes the freeze process to stall for a significant amount of time. This occurs because, as mentioned above, the read workload repopulates the buffer LRU while the freeze task attempts to drain it. To improve this situation, replace the drain in xfs_log_quiesce() with a buffer I/O quiesce and lift the drain into the unmount path. This removes buffer LRU reclaim from freeze and read-only [re]mount, but ensures the LRU is still drained before the filesystem unmounts. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: rename xfs_wait_buftarg() to xfs_buftarg_drain()Brian Foster5-17/+17
xfs_wait_buftarg() is vaguely named and somewhat overloaded. Its primary purpose is to reclaim all buffers from the provided buffer target LRU. In preparation to refactor xfs_wait_buftarg() into serialization and LRU draining components, rename the function and associated helpers to something more descriptive. This patch has no functional changes with the minor exception of renaming a tracepoint. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Fix assert failure in xfs_setattr_size()Yumei Huang1-1/+1
An assert failure is triggered by syzkaller test due to ATTR_KILL_PRIV is not cleared before xfs_setattr_size. As ATTR_KILL_PRIV is not checked/used by xfs_setattr_size, just remove it from the assert. Signed-off-by: Yumei Huang <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: fix up non-directory creation in SGID directoriesChristoph Hellwig1-7/+7
XFS always inherits the SGID bit if it is set on the parent inode, while the generic inode_init_owner does not do this in a few cases where it can create a possible security problem, see commit 0fa3ecd87848 ("Fix up non-directory creation in SGID directories") for details. Switch XFS to use the generic helper for the normal path to fix this, just keeping the simple field inheritance open coded for the case of the non-sgid case with the bsdgrpid mount option. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Christian Brauner <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: remove a stale comment from xfs_file_aio_write_checks()Eric Biggers1-6/+0
The comment in xfs_file_aio_write_checks() about calling file_modified() after dropping the ilock doesn't make sense, because the code that unconditionally acquires and drops the ilock was removed by commit 467f78992a07 ("xfs: reduce ilock hold times in xfs_file_aio_write_checks"). Remove this outdated comment. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Eric Biggers <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Introduce error injection to allocate only minlen size extents for filesChandan Babu R5-25/+159
This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which helps userspace test programs to get xfs_bmap_btalloc() to always allocate minlen sized extents. This is required for test programs which need a guarantee that minlen extents allocated for a file do not get merged with their existing neighbours in the inode's BMBT. "Inode fork extent overflow check" for Directories, Xattrs and extension of realtime inodes need this since the file offset at which the extents are being allocated cannot be explicitly controlled from userspace. One way to use this error tag is to, 1. Consume all of the free space by sequentially writing to a file. 2. Punch alternate blocks of the file. This causes CNTBT to contain sufficient number of one block sized extent records. 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag. After step 3, xfs_bmap_btalloc() will issue space allocation requests for minlen sized extents only. ENOSPC error code is returned to userspace when there aren't any "one block sized" extents left in any of the AGs. Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Process allocated extent in a separate functionChandan Babu R1-29/+45
This commit moves over the code in xfs_bmap_btalloc() which is responsible for processing an allocated extent to a new function. Apart from xfs_bmap_btalloc(), the new function will be invoked by another function introduced in a future commit. Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Compute bmap extent alignments in a separate functionChandan Babu R1-37/+52
This commit moves over the code which computes stripe alignment and extent size hint alignment into a separate function. Apart from xfs_bmap_btalloc(), the new function will be used by another function introduced in a future commit. Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Remove duplicate assert statement in xfs_bmap_btalloc()Chandan Babu R1-1/+0
The check for verifying if the allocated extent is from an AG whose index is greater than or equal to that of tp->t_firstblock is already done a couple of statements earlier in the same function. Hence this commit removes the redundant assert statement. Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Introduce error injection to reduce maximum inode fork extent countChandan Babu R3-1/+10
This commit adds XFS_ERRTAG_REDUCE_MAX_IEXTENTS error tag which enables userspace programs to test "Inode fork extent count overflow detection" by reducing maximum possible inode fork extent count to 10. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when swapping extentsChandan Babu R2-0/+23
Removing an initial range of source/donor file's extent and adding a new extent (from donor/source file) in its place will cause extent count to increase by 1. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when remapping an extentChandan Babu R1-0/+11
Remapping an extent involves unmapping the existing extent and mapping in the new extent. When unmapping, an extent containing the entire unmap range can be split into two extents, i.e. | Old extent | hole | Old extent | Hence extent count increases by 1. Mapping in the new extent into the destination file can increase the extent count by 1. Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when moving extent from cow to data forkChandan Babu R2-0/+14
Moving an extent to data fork can cause a sub-interval of an existing extent to be unmapped. This will increase extent count by 1. Mapping in the new extent can increase the extent count by 1 again i.e. | Old extent | New extent | Old extent | Hence number of extents increases by 2. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when writing to unwritten extentChandan Babu R2-0/+14
A write to a sub-interval of an existing unwritten extent causes the original extent to be split into 3 extents i.e. | Unwritten | Real | Unwritten | Hence extent count can increase by 2. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when adding/removing xattrsChandan Babu R2-0/+23
Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to be added. One extra extent for dabtree in case a local attr is large enough to cause a double split. It can also cause extent count to increase proportional to the size of a remote xattr's value. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when renaming dir entriesChandan Babu R2-1/+46
A rename operation is essentially a directory entry remove operation from the perspective of parent directory (i.e. src_dp) of rename's source. Hence the only place where we check for extent count overflow for src_dp is in xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns -ENOSPC when it detects a possible extent count overflow and in response, the higher layers of directory handling code do the following: 1. Data/Free blocks: XFS lets these blocks linger until a future remove operation removes them. 2. Dabtree blocks: XFS swaps the blocks with the last block in the Leaf space and unmaps the last block. For target_dp, there are two cases depending on whether the destination directory entry exists or not. When destination directory entry does not exist (i.e. target_ip == NULL), extent count overflow check is performed only when transaction has a non-zero sized space reservation associated with it. With a zero-sized space reservation, XFS allows a rename operation to continue only when the directory has sufficient free space in its data/leaf/free space blocks to hold the new entry. When destination directory entry exists (i.e. target_ip != NULL), all we need to do is change the inode number associated with the already existing entry. Hence there is no need to perform an extent count overflow check. Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when removing dir entriesChandan Babu R1-0/+18
Directory entry removal must always succeed; Hence XFS does the following during low disk space scenario: 1. Data/Free blocks linger until a future remove operation. 2. Dabtree blocks would be swapped with the last block in the leaf space and then the new last block will be unmapped. This facility is reused during low inode extent count scenario i.e. this commit causes xfs_bmap_del_extent_real() to return -ENOSPC error code so that the above mentioned behaviour is exercised causing no change to the directory's extent count. Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when adding dir entriesChandan Babu R3-0/+28
Directory entry addition can cause the following, 1. Data block can be added/removed. A new extent can cause extent count to increase by 1. 2. Free disk block can be added/removed. Same behaviour as described above for Data block. 3. Dabtree blocks. XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH. Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when punching a holeChandan Babu R3-6/+26
The extent mapping the file offset at which a hole has to be inserted will be split into two extents causing extent count to increase by 1. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when trivally adding a new extentChandan Babu R7-1/+41
When adding a new data extent (without modifying an inode's existing extents) the extent count increases only by 1. This commit checks for extent count overflow in such cases. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Add helper for checking per-inode extent count overflowChandan Babu R2-0/+25
XFS does not check for possible overflow of per-inode extent counter fields when adding extents to either data or attr fork. For e.g. 1. Insert 5 million xattrs (each having a value size of 255 bytes) and then delete 50% of them in an alternating manner. 2. On a 4k block sized XFS filesystem instance, the above causes 98511 extents to be created in the attr fork of the inode. xfsaild/loop0 2008 [003] 1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131 3. The incore inode fork extent counter is a signed 32-bit quantity. However the on-disk extent counter is an unsigned 16-bit quantity and hence cannot hold 98511 extents. 4. The following incorrect value is stored in the attr extent counter, # xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0 core.naextents = -32561 This commit adds a new helper function (i.e. xfs_iext_count_may_overflow()) to check for overflow of the per-inode data and xattr extent counters. Future patches will use this function to make sure that an FS operation won't cause the extent counter to overflow. Suggested-by: Darrick J. Wong <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: fix an ABBA deadlock in xfs_renameDarrick J. Wong3-20/+26
When overlayfs is running on top of xfs and the user unlinks a file in the overlay, overlayfs will create a whiteout inode and ask xfs to "rename" the whiteout file atop the one being unlinked. If the file being unlinked loses its one nlink, we then have to put the inode on the unlinked list. This requires us to grab the AGI buffer of the whiteout inode to take it off the unlinked list (which is where whiteouts are created) and to grab the AGI buffer of the file being deleted. If the whiteout was created in a higher numbered AG than the file being deleted, we'll lock the AGIs in the wrong order and deadlock. Therefore, grab all the AGI locks we think we'll need ahead of time, and in order of increasing AG number per the locking rules. Reported-by: wenli xie <[email protected]> Fixes: 93597ae8dac0 ("xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-01-20mm: Cleanup faultaround and finish_fault() codepathsKirill A. Shutemov1-2/+4
alloc_set_pte() has two users with different requirements: in the faultaround code, it called from an atomic context and PTE page table has to be preallocated. finish_fault() can sleep and allocate page table as needed. PTL locking rules are also strange, hard to follow and overkill for finish_fault(). Let's untangle the mess. alloc_set_pte() has gone now. All locking is explicit. The price is some code duplication to handle huge pages in faultaround path, but it should be fine, having overall improvement in readability. Link: https://lore.kernel.org/r/20201229132819.najtavneutnf7ajp@box Signed-off-by: Kirill A. Shutemov <[email protected]> [will: s/from from/from/ in comment; spotted by willy] Signed-off-by: Will Deacon <[email protected]>
2020-12-18Merge tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds48-705/+695
Pull xfs updates from Darrick Wong: "In this release we add the ability to set a 'needsrepair' flag indicating that we /know/ the filesystem requires xfs_repair, but other than that, it's the usual strengthening of metadata validation and miscellaneous cleanups. Summary: - Introduce a "needsrepair" "feature" to flag a filesystem as needing a pass through xfs_repair. This is key to enabling filesystem upgrades (in xfs_db) that require xfs_repair to make minor adjustments to metadata. - Refactor parameter checking of recovered log intent items so that we actually use the same validation code as them that generate the intent items. - Various fixes to online scrub not reacting correctly to directory entries pointing to inodes that cannot be igetted. - Refactor validation helpers for data and rt volume extents. - Refactor XFS_TRANS_DQ_DIRTY out of existence. - Fix a longstanding bug where mounting with "uqnoenforce" would start user quotas in non-enforcing mode but /proc/mounts would display "usrquota", implying that they are being enforced. - Don't flag dax+reflink inodes as corruption since that is a valid (but not fully functional) combination right now. - Clean up raid stripe validation functions. - Refactor the inode allocation code to be more straightforward. - Small prep cleanup for idmapping support. - Get rid of the xfs_buf_t typedef" * tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (40 commits) xfs: remove xfs_buf_t typedef fs/xfs: convert comma to semicolon xfs: open code updating i_mode in xfs_set_acl xfs: remove xfs_vn_setattr_nonsize xfs: kill ialloced in xfs_dialloc() xfs: spilt xfs_dialloc() into 2 functions xfs: move xfs_dialloc_roll() into xfs_dialloc() xfs: move on-disk inode allocation out of xfs_ialloc() xfs: introduce xfs_dialloc_roll() xfs: convert noroom, okalloc in xfs_dialloc() to bool xfs: don't catch dax+reflink inodes as corruption in verifier xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks xfs: remove unneeded return value check for *init_cursor() xfs: introduce xfs_validate_stripe_geometry() xfs: show the proper user quota options xfs: remove the unused XFS_B_FSB_OFFSET macro xfs: remove unnecessary null check in xfs_generic_create xfs: directly return if the delta equal to zero xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag xfs: delete duplicated tp->t_dqinfo null check and allocation ...
2020-12-16xfs: remove xfs_buf_t typedefDave Chinner15-78/+78
Prepare for kernel xfs_buf alignment by getting rid of the xfs_buf_t typedef from userspace. [darrick: This patch is a port of a userspace patch removing the xfs_buf_t typedef in preparation to make the userspace xfs_buf code behave more like its kernel counterpart.] Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2020-12-16Merge tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-blockLinus Torvalds1-5/+2
Pull block updates from Jens Axboe: "Another series of killing more code than what is being added, again thanks to Christoph's relentless cleanups and tech debt tackling. This contains: - blk-iocost improvements (Baolin Wang) - part0 iostat fix (Jeffle Xu) - Disable iopoll for split bios (Jeffle Xu) - block tracepoint cleanups (Christoph Hellwig) - Merging of struct block_device and hd_struct (Christoph Hellwig) - Rework/cleanup of how block device sizes are updated (Christoph Hellwig) - Simplification of gendisk lookup and removal of block device aliasing (Christoph Hellwig) - Block device ioctl cleanups (Christoph Hellwig) - Removal of bdget()/blkdev_get() as exported API (Christoph Hellwig) - Disk change rework, avoid ->revalidate_disk() (Christoph Hellwig) - sbitmap improvements (Pavel Begunkov) - Hybrid polling fix (Pavel Begunkov) - bvec iteration improvements (Pavel Begunkov) - Zone revalidation fixes (Damien Le Moal) - blk-throttle limit fix (Yu Kuai) - Various little fixes" * tag 'for-5.11/block-2020-12-14' of git://git.kernel.dk/linux-block: (126 commits) blk-mq: fix msec comment from micro to milli seconds blk-mq: update arg in comment of blk_mq_map_queue blk-mq: add helper allocating tagset->tags Revert "block: Fix a lockdep complaint triggered by request queue flushing" nvme-loop: use blk_mq_hctx_set_fq_lock_class to set loop's lock class blk-mq: add new API of blk_mq_hctx_set_fq_lock_class block: disable iopoll for split bio block: Improve blk_revalidate_disk_zones() checks sbitmap: simplify wrap check sbitmap: replace CAS with atomic and sbitmap: remove swap_lock sbitmap: optimise sbitmap_deferred_clear() blk-mq: skip hybrid polling if iopoll doesn't spin blk-iocost: Factor out the base vrate change into a separate function blk-iocost: Factor out the active iocgs' state check into a separate function blk-iocost: Move the usage ratio calculation to the correct place blk-iocost: Remove unnecessary advance declaration blk-iocost: Fix some typos in comments blktrace: fix up a kerneldoc comment block: remove the request_queue to argument request based tracepoints ...
2020-12-12fs/xfs: convert comma to semicolonZheng Yongjun1-1/+1
Replace a comma between expression statements by a semicolon. Signed-off-by: Zheng Yongjun <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: open code updating i_mode in xfs_set_aclChristoph Hellwig3-31/+27
Rather than going through the big and hairy xfs_setattr_nonsize function, just open code a transactional i_mode and i_ctime update. This allows to mark xfs_setattr_nonsize and remove the flags argument to it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Gao Xiang <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: remove xfs_vn_setattr_nonsizeChristoph Hellwig2-20/+7
Merge xfs_vn_setattr_nonsize into the only caller. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Gao Xiang <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: kill ialloced in xfs_dialloc()Gao Xiang1-13/+9
It's enough to just use return code, and get rid of an argument. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: spilt xfs_dialloc() into 2 functionsDave Chinner3-37/+48
This patch explicitly separates free inode chunk allocation and inode allocation into two individual high level operations. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: move xfs_dialloc_roll() into xfs_dialloc()Dave Chinner3-94/+24
Get rid of the confusing ialloc_context and failure handling around xfs_dialloc() by moving xfs_dialloc_roll() into xfs_dialloc(). Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: move on-disk inode allocation out of xfs_ialloc()Dave Chinner3-161/+86
So xfs_ialloc() will only address in-core inode allocation then, Also, rename xfs_ialloc() to xfs_dir_ialloc_init() in order to keep everything in xfs_inode.c under the same namespace. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: introduce xfs_dialloc_roll()Dave Chinner3-30/+41
Introduce a helper to make the on-disk inode allocation rolling logic clearer in preparation of the following cleanup. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: convert noroom, okalloc in xfs_dialloc() to boolGao Xiang1-4/+4
Boolean is preferred for such use. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: don't catch dax+reflink inodes as corruption in verifierEric Sandeen2-8/+0
We don't yet support dax on reflinked files, but that is in the works. Further, having the flag set does not automatically mean that the inode is actually "in the CPU direct access state," which depends on several other conditions in addition to the flag being set. As such, we should not catch this as corruption in the verifier - simply not actually enabling S_DAX on reflinked files is enough for now. Fixes: 4f435ebe7d04 ("xfs: don't mix reflink and DAX mode for now") Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> [darrick: fix the scrubber too] Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: fix the forward progress assertion in xfs_iwalk_run_callbacksDarrick J. Wong1-1/+1
In commit 27c14b5daa82 we started tracking the last inode seen during an inode walk to avoid infinite loops if a corrupt inobt record happens to have a lower ir_startino than the record preceeding it. Unfortunately, the assertion trips over the case where there are completely empty inobt records (which can happen quite easily on 64k page filesystems) because we advance the tracking cursor without actually putting the empty record into the processing buffer. Fix the assert to allow for this case. Reported-by: [email protected] Fixes: 27c14b5daa82 ("xfs: ensure inobt record walks always make forward progress") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Zorro Lang <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2020-12-09xfs: remove unneeded return value check for *init_cursor()Joseph Qi7-46/+0
Since *init_cursor() can always return a valid cursor, the NULL check in caller is unneeded. So clean them up. This also keeps the behavior consistent with other callers. Signed-off-by: Joseph Qi <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: introduce xfs_validate_stripe_geometry()Gao Xiang2-11/+69
Introduce a common helper to consolidate stripe validation process. Also make kernel code xfs_validate_sb_common() use it first. Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: show the proper user quota optionsKaixu Xia1-4/+6
The quota option 'usrquota' should be shown if both the XFS_UQUOTA_ACCT and XFS_UQUOTA_ENFD flags are set. The option 'uqnoenforce' should be shown when only the XFS_UQUOTA_ACCT flag is set. The current code logic seems wrong, Fix it and show proper options. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: remove the unused XFS_B_FSB_OFFSET macroKaixu Xia1-1/+0
There are no callers of the XFS_B_FSB_OFFSET macro, so remove it. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: remove unnecessary null check in xfs_generic_createKaixu Xia1-4/+2
The function posix_acl_release() test the passed-in argument and move on only when it is non-null, so maybe the null check in xfs_generic_create is unnecessary. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: directly return if the delta equal to zeroKaixu Xia1-14/+9
The xfs_trans_mod_dquot() function will allocate new tp->t_dqinfo if it is NULL and make the changes in the tp->t_dqinfo->dqs[XFS_QM_TRANS _{USR,GRP,PRJ}]. Nowadays seems none of the callers want to join the dquots to the transaction and push them to device when the delta is zero. Actually, most of time the caller would check the delta and go on only when the delta value is not zero, so we should bail out when it is zero. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flagKaixu Xia3-19/+3
Nowadays the only things that the XFS_TRANS_DQ_DIRTY flag seems to do are indicates the tp->t_dqinfo->dqs[XFS_QM_TRANS_{USR,GRP,PRJ}] values changed and check in xfs_trans_apply_dquot_deltas() and the unreserve variant xfs_trans_unreserve_and_mod_dquots(). Actually, we also can use the tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag, that is to say, we allocate the new tp->t_dqinfo only when the qtrx values changed, so the tp->t_dqinfo value isn't NULL equals the XFS_TRANS_DQ_DIRTY flag is set, we only need to check if tp->t_dqinfo == NULL in xfs_trans_apply_dquot_deltas() and its unreserve variant to determine whether lock all of the dquots and join them to the transaction. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: delete duplicated tp->t_dqinfo null check and allocationKaixu Xia1-7/+0
The function xfs_trans_mod_dquot_byino() wraps around xfs_trans_mod_dquot() to account for quotas, and also there is the function call chain xfs_trans_reserve_quota_bydquots -> xfs_trans_dqresv -> xfs_trans_mod_dquot, both of them do the duplicated null check and allocation. Thus we can delete the duplicated operation from them. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: rename xfs_fc_* back to xfs_fs_*Darrick J. Wong1-13/+13
Get rid of this one-off namespace since we're done converting things to fscontext now. Suggested-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2020-12-09xfs: refactor file range validationDarrick J. Wong8-5/+37
Refactor all the open-coded validation of file block ranges into a single helper, and teach the bmap scrubber to check the ranges. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>