aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-12-22xfs: tidy up xfs_rtallocate_extent_exactChristoph Hellwig1-21/+13
Use common code for both xfs_rtallocate_range calls by moving the !isfree logic into the non-default branch. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: merge the calls to xfs_rtallocate_range in xfs_rtallocate_blockChristoph Hellwig1-8/+5
Use a goto to use a common tail for the case of being able to allocate an extent. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: reflow the tail end of xfs_rtallocate_extent_blockChristoph Hellwig1-21/+23
Change polarity of a check so that the successful case of being able to allocate an extent is in the main path of the function and error handling is on a branch. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: invert a check in xfs_rtallocate_extent_blockChristoph Hellwig1-5/+4
Doing a break in the else side of a conditional is rather silly. Invert the check, break ASAP and unindent the other leg. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: split xfs_rtmodify_summary_intChristoph Hellwig1-51/+25
Inline the logic of xfs_rtmodify_summary_int into xfs_rtmodify_summary and xfs_rtget_summary instead of having a somewhat awkward helper to share a little bit of code. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: move xfs_rtget_summary to xfs_rtbitmap.cChristoph Hellwig3-18/+16
xfs_rtmodify_summary_int is only used inside xfs_rtbitmap.c and to implement xfs_rtget_summary. Move xfs_rtget_summary to xfs_rtbitmap.c as the exported API and mark xfs_rtmodify_summary_int static. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: cleanup picking the start extent hint in xfs_bmap_rtallocChristoph Hellwig1-20/+15
Clean up the logical in xfs_bmap_rtalloc that tries to find a rtextent to start the search from by using a separate variable for the hint, not calling xfs_bmap_adjacent when we want to ignore the locality and avoid an extra roundtrip converting between block numbers and RT extent numbers. As a side-effect this doesn't pointlessly call xfs_rtpick_extent and increment the start rtextent hint if we are going to ignore the result anyway. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: indicate if xfs_bmap_adjacent changed ap->blknoChristoph Hellwig2-6/+15
Add a return value to xfs_bmap_adjacent to indicate if it did change ap->blkno or not. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: reflow the tail end of xfs_bmap_rtallocChristoph Hellwig1-30/+30
Reorder the tail end of xfs_bmap_rtalloc so that the successfully allocation is in the main path, and the error handling is on a branch. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: return -ENOSPC from xfs_rtallocate_*Christoph Hellwig2-141/+71
Just return -ENOSPC instead of returning 0 and setting the return rt extent number to NULLRTEXTNO. This is turn removes all users of NULLRTEXTNO, so remove that as well. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: move xfs_bmap_rtalloc to xfs_rtalloc.cChristoph Hellwig3-170/+133
xfs_bmap_rtalloc is currently in xfs_bmap_util.c, which is a somewhat odd spot for it, given that is only called from xfs_bmap.c and calls into xfs_rtalloc.c to do the actual work. Move xfs_bmap_rtalloc to xfs_rtalloc.c and mark xfs_rtpick_extent xfs_rtallocate_extent and xfs_rtallocate_extent static now that they aren't called from outside of xfs_rtalloc.c. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: also use xfs_bmap_btalloc_accounting for RT allocationsChristoph Hellwig3-18/+17
Make xfs_bmap_btalloc_accounting more generic by handling the RT quota reservations and then also use it from xfs_bmap_rtalloc instead of open coding the accounting logic there. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: remove the xfs_alloc_arg argument to xfs_bmap_btalloc_accountingChristoph Hellwig1-10/+9
xfs_bmap_btalloc_accounting only uses the len field from args, but that has just been propagated to ap->length field by the caller. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: turn the xfs_trans_mod_dquot_byino stub into an inline functionChristoph Hellwig1-1/+4
Without this upcoming change can cause an unused variable warning, when adding a local variable for the fields field passed to it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: consider minlen sized extents in xfs_rtallocate_extent_blockChristoph Hellwig1-1/+1
minlen is the lower bound on the extent length that the caller can accept, and maxlen is at this point the maximal available length. This means a minlen extent is perfectly fine to use, so do it. This matches the equivalent logic in xfs_rtallocate_extent_exact that also accepts a minlen sized extent. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs/health: cleanup, remove duplicated includingWang Jinchao1-2/+0
remove the second ones: \#include "xfs_trans_resv.h" \#include "xfs_mount.h" Signed-off-by: Wang Jinchao <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: fix perag leak when growfs failsLong Li3-11/+32
During growfs, if new ag in memory has been initialized, however sb_agcount has not been updated, if an error occurs at this time it will cause perag leaks as follows, these new AGs will not been freed during umount , because of these new AGs are not visible(that is included in mp->m_sb.sb_agcount). unreferenced object 0xffff88810be40200 (size 512): comm "xfs_growfs", pid 857, jiffies 4294909093 hex dump (first 32 bytes): 00 c0 c1 05 81 88 ff ff 04 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace (crc 381741e2): [<ffffffff8191aef6>] __kmalloc+0x386/0x4f0 [<ffffffff82553e65>] kmem_alloc+0xb5/0x2f0 [<ffffffff8238dac5>] xfs_initialize_perag+0xc5/0x810 [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a unreferenced object 0xffff88810be40800 (size 512): comm "xfs_growfs", pid 857, jiffies 4294909093 hex dump (first 32 bytes): 20 00 00 00 00 00 00 00 57 ef be dc 00 00 00 00 .......W....... 10 08 e4 0b 81 88 ff ff 10 08 e4 0b 81 88 ff ff ................ backtrace (crc bde50e2d): [<ffffffff8191b43a>] __kmalloc_node+0x3da/0x540 [<ffffffff81814489>] kvmalloc_node+0x99/0x160 [<ffffffff8286acff>] bucket_table_alloc.isra.0+0x5f/0x400 [<ffffffff8286bdc5>] rhashtable_init+0x405/0x760 [<ffffffff8238dda3>] xfs_initialize_perag+0x3a3/0x810 [<ffffffff824f679c>] xfs_growfs_data+0x9bc/0xbc0 [<ffffffff8250b90e>] xfs_file_ioctl+0x5fe/0x14d0 [<ffffffff81aa5194>] __x64_sys_ioctl+0x144/0x1c0 [<ffffffff83c3d81f>] do_syscall_64+0x3f/0xe0 [<ffffffff83e00087>] entry_SYSCALL_64_after_hwframe+0x62/0x6a Factor out xfs_free_unused_perag_range() from xfs_initialize_perag(), used for freeing unused perag within a specified range in error handling, included in the error path of the growfs failure. Fixes: 1c1c6ebcf528 ("xfs: Replace per-ag array with a radix tree") Signed-off-by: Long Li <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-22xfs: add lock protection when remove perag from radix treeLong Li1-0/+4
Take mp->m_perag_lock for deletions from the perag radix tree in xfs_initialize_perag to prevent racing with tagging operations. Lookups are fine - they are RCU protected so already deal with the tree changing shape underneath the lookup - but tagging operations require the tree to be stable while the tags are propagated back up to the root. Right now there's nothing stopping radix tree tagging from operating while a growfs operation is progress and adding/removing new entries into the radix tree. Hence we can have traversals that require a stable tree occurring at the same time we are removing unused entries from the radix tree which causes the shape of the tree to change. Likely this hasn't caused a problem in the past because we are only doing append addition and removal so the active AG part of the tree is not changing shape, but that doesn't mean it is safe. Just making the radix tree modifications serialise against each other is obviously correct. Signed-off-by: Long Li <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: "Darrick J. Wong" <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2023-12-16Merge tag 'repair-quota-6.8_2023-12-15' of ↵Chandan Babu R12-54/+1026
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.8-mergeB xfs: online repair of quota and rt metadata files XFS stores quota records and free space bitmap information in files. Add the necessary infrastructure to enable repairing metadata inodes and their forks, and then make it so that we can repair the file metadata for the rtbitmap. Repairing the bitmap contents (and the summary file) is left for subsequent patchsets. We also add the ability to repair file metadata the quota files. As part of these repairs, we also reinitialize the ondisk dquot records as necessary to get the incore dquots working. We can also correct obviously bad dquot record attributes, but we leave checking the resource usage counts for the next patchsets. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> * tag 'repair-quota-6.8_2023-12-15' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: repair quotas xfs: improve dquot iteration for scrub xfs: check dquot resource timers xfs: check the ondisk space mapping behind a dquot
2023-12-16Merge tag 'repair-rtbitmap-6.8_2023-12-15' of ↵Chandan Babu R12-74/+647
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.8-mergeB xfs: online repair of rt bitmap file Add in the necessary infrastructure to check the inode and data forks of metadata files, then apply that to the realtime bitmap file. We won't be able to reconstruct the contents of the rtbitmap file until rmapbt is added for realtime volumes, but we can at least get the basics started. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> * tag 'repair-rtbitmap-6.8_2023-12-15' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: online repair of realtime bitmaps xfs: create a new inode fork block unmap helper xfs: repair the inode core and forks of a metadata inode xfs: always check the rtbitmap and rtsummary files xfs: check rt summary file geometry more thoroughly xfs: check rt bitmap file geometry more thoroughly
2023-12-16Merge tag 'repair-file-mappings-6.8_2023-12-15' of ↵Chandan Babu R24-52/+2160
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.8-mergeB xfs: online repair of file fork mappings In this series, online repair gains the ability to rebuild data and attr fork mappings from the reverse mapping information. It is at this point where we reintroduce the ability to reap file extents. Repair of CoW forks is a little different -- on disk, CoW staging extents are owned by the refcount btree and cannot be mapped back to individual files. Hence we can only detect staging extents that don't quite look right (missing reverse mappings, shared staging extents) and replace them with fresh allocations. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> * tag 'repair-file-mappings-6.8_2023-12-15' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: repair problems in CoW forks xfs: create a ranged query function for refcount btrees xfs: refactor repair forcing tests into a repair.c helper xfs: repair inode fork block mapping data structures xfs: reintroduce reaping of file metadata blocks to xrep_reap_extents
2023-12-16Merge tag 'repair-inodes-6.8_2023-12-15' of ↵Chandan Babu R34-85/+2185
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.8-mergeB xfs: online repair of inodes and forks In this series, online repair gains the ability to repair inode records. To do this, we must repair the ondisk inode and fork information enough to pass the iget verifiers and hence make the inode igettable again. Once that's done, we can perform higher level repairs on the incore inode. The fstests counterpart of this patchset implements stress testing of repair. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> * tag 'repair-inodes-6.8_2023-12-15' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: skip the rmapbt search on an empty attr fork unless we know it was zapped xfs: abort directory parent scrub scans if we encounter a zapped directory xfs: zap broken inode forks xfs: repair inode records xfs: set inode sick state flags when we zap either ondisk fork xfs: dont cast to char * for XFS_DFORK_*PTR macros xfs: add missing nrext64 inode flag check to scrub xfs: try to attach dquots to files before repairing them xfs: disable online repair quota helpers when quota not enabled
2023-12-16Merge tag 'repair-ag-btrees-6.8_2023-12-15' of ↵Chandan Babu R39-363/+3687
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.8-mergeB xfs: online repair of AG btrees Now that we've spent a lot of time reworking common code in online fsck, we're ready to start rebuilding the AG space btrees. This series implements repair functions for the free space, inode, and refcount btrees. Rebuilding the reverse mapping btree is much more intense and is left for a subsequent patchset. The fstests counterpart of this patchset implements stress testing of repair. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> * tag 'repair-ag-btrees-6.8_2023-12-15' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: repair refcount btrees xfs: repair inode btrees xfs: repair free space btrees xfs: remove trivial bnobt/inobt scrub helpers xfs: roll the scrub transaction after completing a repair xfs: move the per-AG datatype bitmaps to separate files xfs: create separate structures and code for u32 bitmaps
2023-12-16Merge tag 'repair-prep-for-bulk-loading-6.8_2023-12-15' of ↵Chandan Babu R10-35/+198
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.8-mergeB xfs: prepare repair for bulk loading Before we start merging the online repair functions, let's improve the bulk loading code a bit. First, we need to fix a misinteraction between the AIL and the btree bulkloader wherein the delwri at the end of the bulk load fails to queue a buffer for writeback if it happens to be on the AIL list. Second, we introduce a defer ops barrier object so that the process of reaping blocks after a repair cannot queue more than two extents per EFI log item. This increases our exposure to leaking blocks if the system goes down during a reap, but also should prevent transaction overflows, which result in the system going down. Third, we change the bulkloader itself to copy multiple records into a block if possible, and add some debugging knobs so that developers can control the slack factors, just like they can do for xfs_repair. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]> * tag 'repair-prep-for-bulk-loading-6.8_2023-12-15' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: constrain dirty buffers while formatting a staged btree xfs: move btree bulkload record initialization to ->get_record implementations xfs: add debug knobs to control btree bulk load slack factors xfs: read leaf blocks when computing keys for bulkloading into node blocks xfs: set XBF_DONE on newly formatted btree block that are ready for writing xfs: force all buffers to be written during btree bulk load
2023-12-15xfs: repair quotasDarrick J. Wong10-8/+628
Fix anything that causes the quota verifiers to fail. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: improve dquot iteration for scrubDarrick J. Wong9-46/+319
Upon a closer inspection of the quota record scrubber, I noticed that dqiterate wasn't actually walking all possible dquots for the mapped blocks in the quota file. This is due to xfs_qm_dqget_next skipping all XFS_IS_DQUOT_UNINITIALIZED dquots. For a fsck program, we really want to look at all the dquots, even if all counters and limits in the dquot record are zero. Rewrite the implementation to do this, as well as switching to an iterator paradigm to reduce the number of indirect calls. This enables removal of the old broken dqiterate code from xfs_dquot.c. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: check dquot resource timersDarrick J. Wong1-0/+21
For each dquot resource, ensure either (a) the resource usage is over the soft limit and there is a nonzero timer; or (b) usage is at or under the soft limit and the timer is unset. (a) is redundant with the dquot buffer verifier, but (b) isn't checked anywhere. Found by fuzzing xfs/426 and noticing that diskdq.btimer = add didn't trip any kind of warning for having a timer set even with no limits. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: check the ondisk space mapping behind a dquotDarrick J. Wong1-0/+58
Each xfs_dquot object caches the file offset and daddr of the ondisk block that backs the dquot. Make sure these cached values are the same as the bmapi data, and that the block state is written. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: online repair of realtime bitmapsDarrick J. Wong6-8/+245
Fix all the file metadata surrounding the realtime bitmap file, which includes the rt geometry, file size, forks, and space mappings. The bitmap contents themselves cannot be fixed without rt rmap, so that will come later. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: create a new inode fork block unmap helperDarrick J. Wong3-24/+46
Create a new helper to unmap blocks from an inode's fork. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: repair the inode core and forks of a metadata inodeDarrick J. Wong3-4/+168
Add a helper function to repair the core and forks of a metadata inode, so that we can get move onto the task of repairing higher level metadata that lives in an inode. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: always check the rtbitmap and rtsummary filesDarrick J. Wong1-2/+0
XFS filesystems always have a realtime bitmap and summary file, even if there has never been a realtime volume attached. Always check them. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: check rt summary file geometry more thoroughlyDarrick J. Wong1-27/+110
I forgot that the xfs_mount tracks the size and number of levels in the realtime summary file, and that the rt summary file can have more blocks mapped to the data fork than m_rsumsize implies if growfsrt fails. So. Add to the rtsummary scrubber an explicit check that all the summary geometry values are correct, then adjust the rtsummary i_size checks to allow for the growfsrt failure case. Finally, flag post-eof blocks in the summary file. While we're at it, split the extent map checking so that we only call xfs_bmapi_read once per extent instead of once per rtsummary block. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: check rt bitmap file geometry more thoroughlyDarrick J. Wong1-15/+84
I forgot that the superblock tracks the number of blocks that are in the realtime bitmap, and that the rt bitmap file can have more blocks mapped to the data fork than sb_rbmblocks if growfsrt fails. So. Add to the rtbitmap scrubber an explicit check that sb_rextents and sb_rbmblocks are correct, then adjust the rtbitmap i_size checks to allow for the growfsrt failure case. Finally, flag post-eof blocks in the rtbitmap. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: repair problems in CoW forksDarrick J. Wong7-1/+771
Try to repair errors that we see in file CoW forks so that we don't do stupid things like remap garbage into a file. There's not a lot we can do with the COW fork -- the ondisk metadata record only that the COW staging extents are owned by the refcount btree, which effectively means that we can't reconstruct this incore structure from scratch. Actually, this is even worse -- we can't touch written extents, because those map space that are actively under writeback, and there's not much to do with delalloc reservations. Hence we can only detect crosslinked unwritten extents and fix them by punching out the problematic parts and replacing them with delalloc extents. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: create a ranged query function for refcount btreesDarrick J. Wong2-0/+51
Implement ranged queries for refcount records. The next patch will use this to scan refcount data. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: refactor repair forcing tests into a repair.c helperDarrick J. Wong3-13/+25
There are a couple of conditions that userspace can set to force repairs of metadata. These really belong in the repair code and not open-coded into the check code, so refactor them into a helper. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: repair inode fork block mapping data structuresDarrick J. Wong17-34/+1153
Use the reverse-mapping btree information to rebuild an inode block map. Update the btree bulk loading code as necessary to support inode rooted btrees and fix some bitrot problems. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: skip the rmapbt search on an empty attr fork unless we know it was zappedDarrick J. Wong1-22/+79
The attribute fork scrubber can optionally scan the reverse mapping records of the filesystem to determine if the fork is missing mappings that it should have. However, this is a very expensive operation, so we only want to do this if we suspect that the fork is missing records. For attribute forks the criteria for suspicion is that the attr fork is in EXTENTS format and has zero extents. However, there are several ways that a file can end up in this state through regular filesystem usage. For example, an LSM can set a s_security hook but then decide not to set an ACL; or an attr set can create the attr fork but then the actual set operation fails with ENOSPC; or we can delete all the attrs on a file whose data fork is in btree format, in which case we do not delete the attr fork. We don't want to run the expensive check for any case that can be arrived at through regular operations. However. When online inode repair decides to zap an attribute fork, it cannot determine if it is zapping ACL information. As a precaution it removes all the discretionary access control permissions and sets the user and group ids to zero. Check these three additional conditions to decide if we want to scan the rmap records. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: reintroduce reaping of file metadata blocks to xrep_reap_extentsDarrick J. Wong4-4/+160
Back in commit a55e07308831b ("xfs: only allow reaping of per-AG blocks in xrep_reap_extents"), we removed from the reaping code the ability to handle bmbt blocks. At the time, the reaping code only walked single blocks, didn't correctly detect crosslinked blocks, and the special casing made the function hard to understand. It was easier to remove unneeded functionality prior to fixing all the bugs. Now that we've fixed the problems, we want again the ability to reap file metadata blocks. Reintroduce the per-file reaping functionality atop the current implementation. We require that sc->sa is uninitialized, so that we can use it to hold all the per-AG context for a given extent. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: abort directory parent scrub scans if we encounter a zapped directoryDarrick J. Wong4-0/+46
In a previous patch, we added some code to perform sufficient repairs to an ondisk inode record such that the inode cache would be willing to load the inode. If the broken inode was a shortform directory, it will reset the directory to something plausible, which is to say an empty subdirectory of the root. The telltale signs that something is seriously wrong is the broken link count. Such directories look clean, but they shouldn't participate in a filesystem scan to find or confirm a directory parent pointer. Create a predicate that identifies such directories and abort the scrub. Found by fuzzing xfs/1554 with multithreaded xfs_scrub enabled and u3.bmx[0].startblock = zeroes. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: zap broken inode forksDarrick J. Wong11-46/+808
Determine if inode fork damage is responsible for the inode being unable to pass the ifork verifiers in xfs_iget and zap the fork contents if this is true. Once this is done the fork will be empty but we'll be able to construct an in-core inode, and a subsequent call to the inode fork repair ioctl will search the rmapbt to rebuild the records that were in the fork. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: repair inode recordsDarrick J. Wong7-2/+1018
If an inode is so badly damaged that it cannot be loaded into the cache, fix the ondisk metadata and try again. If there /is/ a cached inode, fix any problems and apply any optimizations that can be solved incore. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: set inode sick state flags when we zap either ondisk forkDarrick J. Wong13-12/+166
In a few patches, we'll add some online repair code that tries to massage the ondisk inode record just enough to get it to pass the inode verifiers so that we can continue with more file repairs. Part of that massaging can include zapping the ondisk forks to clear errors. After that point, the bmap fork repair functions will rebuild the zapped forks. Christoph asked for stronger protections against online repair zapping a fork to get the inode to load vs. other threads trying to access the partially repaired file. Do this by adding a special "[DA]FORK_ZAPPED" inode health flag whenever repair zaps a fork, and sprinkling checks for that flag into the various file operations for things that don't like handling an unexpected zero-extents fork. In practice xfs_scrub will scrub and fix the forks almost immediately after zapping them, so the window is very small. However, if a crash or unmount should occur, we can still detect these zapped inode forks by looking for a zero-extents fork when data was expected. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: dont cast to char * for XFS_DFORK_*PTR macrosDarrick J. Wong2-2/+2
Code in the next patch will assign the return value of XFS_DFORK_*PTR macros to a struct pointer. gcc complains about casting char* strings to struct pointers, so let's fix the macro's cast to void* to shut up the warnings. While we're at it, fix one of the scrub tests that uses PTR to use BOFF instead for a simpler integer comparison, since other linters whine about char* and void* comparisons. Can't satisfy all these dman bots. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: add missing nrext64 inode flag check to scrubDarrick J. Wong1-0/+4
Add this missing check that the superblock nrext64 flag is set if the inode flag is set. Fixes: 9b7d16e34bbeb ("xfs: Introduce XFS_DIFLAG2_NREXT64 and associated helpers") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: try to attach dquots to files before repairing themDarrick J. Wong7-5/+55
Inode resource usage is tracked in the quota metadata. Repairing a file might change the resources used by that file, which means that we need to attach dquots to the file that we're examining before accessing anything in the file protected by the ILOCK. However, there's a twist: a dquot cache miss requires the dquot to be read in from the quota file, during which we drop the ILOCK on the file being examined. This means that we *must* try to attach the dquots before taking the ILOCK. Therefore, dquots must be attached to files in the scrub setup function. If doing so yields corruption errors (or unknown dquot errors), we instead clear the quotachecked status, which will cause a quotacheck on next mount. A future series will make this trigger live quotacheck. While we're here, change the xrep_ino_dqattach function to use the unlocked dqattach functions so that we avoid cycling the ILOCK if the inode already has dquots attached. This makes the naming and locking requirements consistent with the rest of the filesystem. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: repair refcount btreesDarrick J. Wong12-19/+856
Reconstruct the refcount data from the rmap btree. Link: https://docs.kernel.org/filesystems/xfs-online-fsck-design.html#case-study-rebuilding-the-space-reference-counts Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: disable online repair quota helpers when quota not enabledDarrick J. Wong2-0/+11
Don't compile the quota helper functions if quota isn't being built into the XFS module. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2023-12-15xfs: repair inode btreesDarrick J. Wong11-51/+1022
Use the rmapbt to find inode chunks, query the chunks to compute hole and free masks, and with that information rebuild the inobt and finobt. Refer to the case study in Documentation/filesystems/xfs-online-fsck-design.rst for more details. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>