aboutsummaryrefslogtreecommitdiff
path: root/fs/xfs/libxfs
AgeCommit message (Collapse)AuthorFilesLines
2021-02-11xfs: consider shutdown in bmapbt cursor delete assertBrian Foster1-21/+12
The assert in xfs_btree_del_cursor() checks that the bmapbt block allocation field has been handled correctly before the cursor is freed. This field is used for accurate calculation of indirect block reservation requirements (for delayed allocations), for example. generic/019 reproduces a scenario where this assert fails because the filesystem has shutdown while in the middle of a bmbt record insertion. This occurs after a bmbt block has been allocated via the cursor but before the higher level bmap function (i.e. xfs_bmap_add_extent_hole_real()) completes and resets the field. Update the assert to accommodate the transient state if the filesystem has shutdown. While here, clean up the indentation and comments in the function. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-02-03libxfs: expose inobtcount in xfs geometryZorro Lang2-0/+3
As xfs supports the feature of inode btree block counters now, expose this feature flag in xfs geometry, for userspace can check if the inobtcnt is enabled or not. Signed-off-by: Zorro Lang <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-02-03xfs: allow reservation of rtblocks with xfs_trans_alloc_inodeDarrick J. Wong2-2/+2
Make it so that we can reserve rt blocks with the xfs_trans_alloc_inode wrapper function, then convert a few more callsites. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-02-03xfs: refactor common transaction/inode/quota allocation idiomDarrick J. Wong2-18/+3
Create a new helper xfs_trans_alloc_inode that allocates a transaction, locks and joins an inode to it, and then reserves the appropriate amount of quota against that transction. Then replace all the open-coded idioms with a single call to this helper. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-02-03xfs: reserve data and rt quota at the same timeDarrick J. Wong2-8/+2
Modify xfs_trans_reserve_quota_nblks so that we can reserve data and realtime blocks from the dquot at the same time. This change has the theoretical side effect that for allocations to realtime files we will reserve from the dquot both the number of rtblocks being allocated and the number of bmbt blocks that might be needed to add the mapping. However, since the mount code disables quota if it finds a realtime device, this should not result in any behavior changes. Now that we've moved the inode creation callers away from using the _nblks function, we can repurpose the (now unused) ninos argument for realtime blocks, so make that change. This also replaces the flags argument with a boolean parameter to force the reservation since we don't need to distinguish between data and rt quota reservations any more, and the only flag being passed in was FORCE_RES. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-02-03xfs: create convenience wrappers for incore quota block reservationsDarrick J. Wong1-6/+4
Create a couple of convenience wrappers for creating and deleting quota block reservations against future changes. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-02-03xfs: clean up quota reservation callsitesDarrick J. Wong1-2/+1
Convert a few xfs_trans_*reserve* callsites that are open-coding other convenience functions. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-02-01xfs: Fix 'set but not used' warning in xfs_bmap_compute_alignments()Chandan Babu R1-5/+4
With both CONFIG_XFS_DEBUG and CONFIG_XFS_WARN disabled, the only reference to local variable "error" in xfs_bmap_compute_alignments() gets eliminated during pre-processing stage of the compilation process. This causes the compiler to generate a "set but not used" warning. Reported-by: kernel test robot <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-01-22xfs: Introduce error injection to allocate only minlen size extents for filesChandan Babu R4-25/+156
This commit adds XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag which helps userspace test programs to get xfs_bmap_btalloc() to always allocate minlen sized extents. This is required for test programs which need a guarantee that minlen extents allocated for a file do not get merged with their existing neighbours in the inode's BMBT. "Inode fork extent overflow check" for Directories, Xattrs and extension of realtime inodes need this since the file offset at which the extents are being allocated cannot be explicitly controlled from userspace. One way to use this error tag is to, 1. Consume all of the free space by sequentially writing to a file. 2. Punch alternate blocks of the file. This causes CNTBT to contain sufficient number of one block sized extent records. 3. Inject XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag. After step 3, xfs_bmap_btalloc() will issue space allocation requests for minlen sized extents only. ENOSPC error code is returned to userspace when there aren't any "one block sized" extents left in any of the AGs. Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Process allocated extent in a separate functionChandan Babu R1-29/+45
This commit moves over the code in xfs_bmap_btalloc() which is responsible for processing an allocated extent to a new function. Apart from xfs_bmap_btalloc(), the new function will be invoked by another function introduced in a future commit. Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Compute bmap extent alignments in a separate functionChandan Babu R1-37/+52
This commit moves over the code which computes stripe alignment and extent size hint alignment into a separate function. Apart from xfs_bmap_btalloc(), the new function will be used by another function introduced in a future commit. Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Remove duplicate assert statement in xfs_bmap_btalloc()Chandan Babu R1-1/+0
The check for verifying if the allocated extent is from an AG whose index is greater than or equal to that of tp->t_firstblock is already done a couple of statements earlier in the same function. Hence this commit removes the redundant assert statement. Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Introduce error injection to reduce maximum inode fork extent countChandan Babu R2-1/+7
This commit adds XFS_ERRTAG_REDUCE_MAX_IEXTENTS error tag which enables userspace programs to test "Inode fork extent count overflow detection" by reducing maximum possible inode fork extent count to 10. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when swapping extentsChandan Babu R1-0/+7
Removing an initial range of source/donor file's extent and adding a new extent (from donor/source file) in its place will cause extent count to increase by 1. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when moving extent from cow to data forkChandan Babu R1-0/+9
Moving an extent to data fork can cause a sub-interval of an existing extent to be unmapped. This will increase extent count by 1. Mapping in the new extent can increase the extent count by 1 again i.e. | Old extent | New extent | Old extent | Hence number of extents increases by 2. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when writing to unwritten extentChandan Babu R1-0/+9
A write to a sub-interval of an existing unwritten extent causes the original extent to be split into 3 extents i.e. | Unwritten | Real | Unwritten | Hence extent count can increase by 2. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when adding/removing xattrsChandan Babu R2-0/+23
Adding/removing an xattr can cause XFS_DA_NODE_MAXDEPTH extents to be added. One extra extent for dabtree in case a local attr is large enough to cause a double split. It can also cause extent count to increase proportional to the size of a remote xattr's value. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when renaming dir entriesChandan Babu R1-0/+3
A rename operation is essentially a directory entry remove operation from the perspective of parent directory (i.e. src_dp) of rename's source. Hence the only place where we check for extent count overflow for src_dp is in xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns -ENOSPC when it detects a possible extent count overflow and in response, the higher layers of directory handling code do the following: 1. Data/Free blocks: XFS lets these blocks linger until a future remove operation removes them. 2. Dabtree blocks: XFS swaps the blocks with the last block in the Leaf space and unmaps the last block. For target_dp, there are two cases depending on whether the destination directory entry exists or not. When destination directory entry does not exist (i.e. target_ip == NULL), extent count overflow check is performed only when transaction has a non-zero sized space reservation associated with it. With a zero-sized space reservation, XFS allows a rename operation to continue only when the directory has sufficient free space in its data/leaf/free space blocks to hold the new entry. When destination directory entry exists (i.e. target_ip != NULL), all we need to do is change the inode number associated with the already existing entry. Hence there is no need to perform an extent count overflow check. Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when removing dir entriesChandan Babu R1-0/+18
Directory entry removal must always succeed; Hence XFS does the following during low disk space scenario: 1. Data/Free blocks linger until a future remove operation. 2. Dabtree blocks would be swapped with the last block in the leaf space and then the new last block will be unmapped. This facility is reused during low inode extent count scenario i.e. this commit causes xfs_bmap_del_extent_real() to return -ENOSPC error code so that the above mentioned behaviour is exercised causing no change to the directory's extent count. Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when adding dir entriesChandan Babu R1-0/+13
Directory entry addition can cause the following, 1. Data block can be added/removed. A new extent can cause extent count to increase by 1. 2. Free disk block can be added/removed. Same behaviour as described above for Data block. 3. Dabtree blocks. XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH. Signed-off-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when punching a holeChandan Babu R1-0/+7
The extent mapping the file offset at which a hole has to be inserted will be split into two extents causing extent count to increase by 1. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Check for extent overflow when trivally adding a new extentChandan Babu R2-0/+12
When adding a new data extent (without modifying an inode's existing extents) the extent count increases only by 1. This commit checks for extent count overflow in such cases. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: Add helper for checking per-inode extent count overflowChandan Babu R2-0/+25
XFS does not check for possible overflow of per-inode extent counter fields when adding extents to either data or attr fork. For e.g. 1. Insert 5 million xattrs (each having a value size of 255 bytes) and then delete 50% of them in an alternating manner. 2. On a 4k block sized XFS filesystem instance, the above causes 98511 extents to be created in the attr fork of the inode. xfsaild/loop0 2008 [003] 1475.127209: probe:xfs_inode_to_disk: (ffffffffa43fb6b0) if_nextents=98511 i_ino=131 3. The incore inode fork extent counter is a signed 32-bit quantity. However the on-disk extent counter is an unsigned 16-bit quantity and hence cannot hold 98511 extents. 4. The following incorrect value is stored in the attr extent counter, # xfs_db -f -c 'inode 131' -c 'print core.naextents' /dev/loop0 core.naextents = -32561 This commit adds a new helper function (i.e. xfs_iext_count_may_overflow()) to check for overflow of the per-inode data and xattr extent counters. Future patches will use this function to make sure that an FS operation won't cause the extent counter to overflow. Suggested-by: Darrick J. Wong <[email protected]> Reviewed-by: Allison Henderson <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-01-22xfs: fix an ABBA deadlock in xfs_renameDarrick J. Wong2-3/+1
When overlayfs is running on top of xfs and the user unlinks a file in the overlay, overlayfs will create a whiteout inode and ask xfs to "rename" the whiteout file atop the one being unlinked. If the file being unlinked loses its one nlink, we then have to put the inode on the unlinked list. This requires us to grab the AGI buffer of the whiteout inode to take it off the unlinked list (which is where whiteouts are created) and to grab the AGI buffer of the file being deleted. If the whiteout was created in a higher numbered AG than the file being deleted, we'll lock the AGIs in the wrong order and deadlock. Therefore, grab all the AGI locks we think we'll need ahead of time, and in order of increasing AG number per the locking rules. Reported-by: wenli xie <[email protected]> Fixes: 93597ae8dac0 ("xfs: Fix deadlock between AGI and AGF when target_ip exists in xfs_rename()") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2020-12-16xfs: remove xfs_buf_t typedefDave Chinner5-29/+29
Prepare for kernel xfs_buf alignment by getting rid of the xfs_buf_t typedef from userspace. [darrick: This patch is a port of a userspace patch removing the xfs_buf_t typedef in preparation to make the userspace xfs_buf code behave more like its kernel counterpart.] Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2020-12-12fs/xfs: convert comma to semicolonZheng Yongjun1-1/+1
Replace a comma between expression statements by a semicolon. Signed-off-by: Zheng Yongjun <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: kill ialloced in xfs_dialloc()Gao Xiang1-13/+9
It's enough to just use return code, and get rid of an argument. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: spilt xfs_dialloc() into 2 functionsDave Chinner2-35/+39
This patch explicitly separates free inode chunk allocation and inode allocation into two individual high level operations. Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: move xfs_dialloc_roll() into xfs_dialloc()Dave Chinner2-59/+21
Get rid of the confusing ialloc_context and failure handling around xfs_dialloc() by moving xfs_dialloc_roll() into xfs_dialloc(). Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: introduce xfs_dialloc_roll()Dave Chinner2-0/+40
Introduce a helper to make the on-disk inode allocation rolling logic clearer in preparation of the following cleanup. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-12xfs: convert noroom, okalloc in xfs_dialloc() to boolGao Xiang1-4/+4
Boolean is preferred for such use. Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: don't catch dax+reflink inodes as corruption in verifierEric Sandeen1-4/+0
We don't yet support dax on reflinked files, but that is in the works. Further, having the flag set does not automatically mean that the inode is actually "in the CPU direct access state," which depends on several other conditions in addition to the flag being set. As such, we should not catch this as corruption in the verifier - simply not actually enabling S_DAX on reflinked files is enough for now. Fixes: 4f435ebe7d04 ("xfs: don't mix reflink and DAX mode for now") Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> [darrick: fix the scrubber too] Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: remove unneeded return value check for *init_cursor()Joseph Qi4-25/+0
Since *init_cursor() can always return a valid cursor, the NULL check in caller is unneeded. So clean them up. This also keeps the behavior consistent with other callers. Signed-off-by: Joseph Qi <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: introduce xfs_validate_stripe_geometry()Gao Xiang2-11/+69
Introduce a common helper to consolidate stripe validation process. Also make kernel code xfs_validate_sb_common() use it first. Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: remove the unused XFS_B_FSB_OFFSET macroKaixu Xia1-1/+0
There are no callers of the XFS_B_FSB_OFFSET macro, so remove it. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flagKaixu Xia1-1/+0
Nowadays the only things that the XFS_TRANS_DQ_DIRTY flag seems to do are indicates the tp->t_dqinfo->dqs[XFS_QM_TRANS_{USR,GRP,PRJ}] values changed and check in xfs_trans_apply_dquot_deltas() and the unreserve variant xfs_trans_unreserve_and_mod_dquots(). Actually, we also can use the tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag, that is to say, we allocate the new tp->t_dqinfo only when the qtrx values changed, so the tp->t_dqinfo value isn't NULL equals the XFS_TRANS_DQ_DIRTY flag is set, we only need to check if tp->t_dqinfo == NULL in xfs_trans_apply_dquot_deltas() and its unreserve variant to determine whether lock all of the dquots and join them to the transaction. Signed-off-by: Kaixu Xia <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-12-09xfs: refactor file range validationDarrick J. Wong3-1/+29
Refactor all the open-coded validation of file block ranges into a single helper, and teach the bmap scrubber to check the ranges. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-12-09xfs: refactor realtime volume extent validationDarrick J. Wong3-10/+21
Refactor all the open-coded validation of realtime device extents into a single helper. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2020-12-09xfs: refactor data device extent validationDarrick J. Wong3-6/+27
Refactor all the open-coded validation of non-static data device extents into a single helper. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2020-12-09xfs: detect overflows in bmbt recordsDarrick J. Wong1-0/+5
Detect file block mappings with a blockcount that's either so large that integer overflows occur or are zero, because neither are valid in the filesystem. Worse yet, attempting directory modifications causes the iext code to trip over the bmbt key handling and takes the filesystem down. We can fix most of this by preventing the bad metadata from entering the incore structures in the first place. Found by setting blockcount=0 in a directory data fork mapping and watching the fireworks. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-12-09xfs: enable the needsrepair featureDarrick J. Wong1-1/+2
Make it so that libxfs recognizes the needsrepair feature. Note that the kernel will still refuse to mount these. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2020-12-09xfs: define a new "needrepair" featureDarrick J. Wong1-0/+7
Define an incompat feature flag to indicate that the filesystem needs to be repaired. While libxfs will recognize this feature, the kernel will refuse to mount if the feature flag is set, and only xfs_repair will be able to clear the flag. The goal here is to force the admin to run xfs_repair to completion after upgrading the filesystem, or if we otherwise detect anomalies. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Eric Sandeen <[email protected]>
2020-12-08xfs: move kernel-specific superblock validation out of libxfsDarrick J. Wong1-27/+0
A couple of the superblock validation checks apply only to the kernel, so move them to xfs_fc_fill_super before we add the needsrepair "feature", which will prevent the kernel (but not xfsprogs) from mounting the filesystem. This also reduces the diff between kernel and userspace libxfs. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Eric Sandeen <[email protected]>
2020-11-19xfs: revert "xfs: fix rmap key and record comparison functions"Darrick J. Wong1-8/+8
This reverts commit 6ff646b2ceb0eec916101877f38da0b73e3a5b7f. Your maintainer committed a major braino in the rmap code by adding the attr fork, bmbt, and unwritten extent usage bits into rmap record key comparisons. While XFS uses the usage bits *in the rmap records* for cross-referencing metadata in xfs_scrub and xfs_repair, it only needs the owner and offset information to distinguish between reverse mappings of the same physical extent into the data fork of a file at multiple offsets. The other bits are not important for key comparisons for index lookups, and never have been. Eric Sandeen reports that this causes regressions in generic/299, so undo this patch before it does more damage. Reported-by: Eric Sandeen <[email protected]> Fixes: 6ff646b2ceb0 ("xfs: fix rmap key and record comparison functions") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Eric Sandeen <[email protected]>
2020-11-18xfs: fix forkoff miscalculation related to XFS_LITINO(mp)Gao Xiang1-1/+7
Currently, commit e9e2eae89ddb dropped a (int) decoration from XFS_LITINO(mp), and since sizeof() expression is also involved, the result of XFS_LITINO(mp) is simply as the size_t type (commonly unsigned long). Considering the expression in xfs_attr_shortform_bytesfit(): offset = (XFS_LITINO(mp) - bytes) >> 3; let "bytes" be (int)340, and "XFS_LITINO(mp)" be (unsigned long)336. on 64-bit platform, the expression is offset = ((unsigned long)336 - (int)340) >> 3 = (int)(0xfffffffffffffffcUL >> 3) = -1 but on 32-bit platform, the expression is offset = ((unsigned long)336 - (int)340) >> 3 = (int)(0xfffffffcUL >> 3) = 0x1fffffff instead. so offset becomes a large positive number on 32-bit platform, and cause xfs_attr_shortform_bytesfit() returns maxforkoff rather than 0. Therefore, one result is "ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));" assertion failure in xfs_idata_realloc(), which was also the root cause of the original bugreport from Dennis, see: https://bugzilla.redhat.com/show_bug.cgi?id=1894177 And it can also be manually triggered with the following commands: $ touch a; $ setfattr -n user.0 -v "`seq 0 80`" a; $ setfattr -n user.1 -v "`seq 0 80`" a on 32-bit platform. Fix the case in xfs_attr_shortform_bytesfit() by bailing out "XFS_LITINO(mp) < bytes" in advance suggested by Eric and a misleading comment together with this bugfix suggested by Darrick. It seems the other users of XFS_LITINO(mp) are not impacted. Fixes: e9e2eae89ddb ("xfs: only check the superblock version for dinode size calculation") Cc: <[email protected]> # 5.7+ Reported-and-tested-by: Dennis Gilmore <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-11-10xfs: fix rmap key and record comparison functionsDarrick J. Wong1-8/+8
Keys for extent interval records in the reverse mapping btree are supposed to be computed as follows: (physical block, owner, fork, is_btree, is_unwritten, offset) This provides users the ability to look up a reverse mapping from a bmbt record -- start with the physical block; then if there are multiple records for the same block, move on to the owner; then the inode fork type; and so on to the file offset. However, the key comparison functions incorrectly remove the fork/btree/unwritten information that's encoded in the on-disk offset. This means that lookup comparisons are only done with: (physical block, owner, offset) This means that queries can return incorrect results. On consistent filesystems this hasn't been an issue because blocks are never shared between forks or with bmbt blocks; and are never unwritten. However, this bug means that online repair cannot always detect corruption in the key information in internal rmapbt nodes. Found by fuzzing keys[1].attrfork = ones on xfs/371. Fixes: 4b8ed67794fe ("xfs: add rmap btree operations") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-11-10xfs: fix flags argument to rmap lookup when converting shared file rmapsDarrick J. Wong1-1/+1
Pass the same oldext argument (which contains the existing rmapping's unwritten state) to xfs_rmap_lookup_le_range at the start of xfs_rmap_convert_shared. At this point in the code, flags is zero, which means that we perform lookups using the wrong key. Fixes: 3f165b334e51 ("xfs: convert unwritten status of reverse mappings for shared files") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2020-10-29xfs: set xefi_discard when creating a deferred agfl free log intent itemDarrick J. Wong2-1/+2
Make sure that we actually initialize xefi_discard when we're scheduling a deferred free of an AGFL block. This was (eventually) found by the UBSAN while I was banging on realtime rmap problems, but it exists in the upstream codebase. While we're at it, rearrange the structure to reduce the struct size from 64 to 56 bytes. Fixes: fcb762f5de2e ("xfs: add bmapi nodiscard flag") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2020-10-16xfs: fix high key handling in the rt allocator's query_range functionDarrick J. Wong1-7/+4
Fix some off-by-one errors in xfs_rtalloc_query_range. The highest key in the realtime bitmap is always one less than the number of rt extents, which means that the key clamp at the start of the function is wrong. The 4th argument to xfs_rtfind_forw is the highest rt extent that we want to probe, which means that passing 1 less than the high key is wrong. Finally, drop the rem variable that controls the loop because we can compare the iteration point (rtstart) against the high key directly. The sordid history of this function is that the original commit (fb3c3) incorrectly passed (high_rec->ar_startblock - 1) as the 'limit' parameter to xfs_rtfind_forw. This was wrong because the "high key" is supposed to be the largest key for which the caller wants result rows, not the key for the first row that could possibly be outside the range that the caller wants to see. A subsequent attempt (8ad56) to strengthen the parameter checking added incorrect clamping of the parameters to the number of rt blocks in the system (despite the bitmap functions all taking units of rt extents) to avoid querying ranges past the end of rt bitmap file but failed to fix the incorrect _rtfind_forw parameter. The original _rtfind_forw parameter error then survived the conversion of the startblock and blockcount fields to rt extents (a0e5c), and the most recent off-by-one fix (a3a37) thought it was patching a problem when the end of the rt volume is not in use, but none of these fixes actually solved the original problem that the author was confused about the "limit" argument to xfs_rtfind_forw. Sadly, all four of these patches were written by this author and even his own usage of this function and rt testing were inadequate to get this fixed quickly. Original-problem: fb3c3de2f65c ("xfs: add a couple of queries to iterate free extents in the rtbitmap") Not-fixed-by: 8ad560d2565e ("xfs: strengthen rtalloc query range checks") Not-fixed-by: a0e5c435babd ("xfs: fix xfs_rtalloc_rec units") Fixes: a3a374bf1889 ("xfs: fix off-by-one error in xfs_rtalloc_query_range") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2020-10-07xfs: only relog deferred intent items if free space in the log gets lowDarrick J. Wong1-0/+16
Now that we have the ability to ask the log how far the tail needs to be pushed to maintain its free space targets, augment the decision to relog an intent item so that we only do it if the log has hit the 75% full threshold. There's no point in relogging an intent into the same checkpoint, and there's no need to relog if there's plenty of free space in the log. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>