aboutsummaryrefslogtreecommitdiff
path: root/fs/xfs
AgeCommit message (Collapse)AuthorFilesLines
2024-09-09mm,tmpfs: consider end of file write in shmem_is_hugeRik van Riel2-4/+4
Take the end of a file write into consideration when deciding whether or not to use huge pages for tmpfs files when the tmpfs filesystem is mounted with huge=within_size This allows large writes that append to the end of a file to automatically use large pages. Doing 4MB sequential writes without fallocate to a 16GB tmpfs file with fio. The numbers without THP or with huge=always stay the same, but the performance with huge=within_size now matches that of huge=always. huge before after 4kB pages 1560 MB/s 1560 MB/s within_size 1560 MB/s 4720 MB/s always: 4720 MB/s 4720 MB/s [[email protected]: coding-style cleanups] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Rik van Riel <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Tested-by: Baolin Wang <[email protected]> Cc: Darrick J. Wong <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-08treewide: Fix wrong singular form of jiffies in commentsAnna-Maria Behnsen1-1/+1
There are several comments all over the place, which uses a wrong singular form of jiffies. Replace 'jiffie' by 'jiffy'. No functional change. Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> # m68k Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-3-e98760256370@linutronix.de
2024-09-03iomap: fix handling of dirty folios over unwritten extentsBrian Foster1-10/+0
The iomap zero range implementation doesn't properly handle dirty pagecache over unwritten mappings. It skips such mappings as if they were pre-zeroed. If some part of an unwritten mapping is dirty in pagecache from a previous write, the data in cache should be zeroed as well. Instead, the data is left in cache and creates a stale data exposure problem if writeback occurs sometime after the zero range. Most callers are unaffected by this because the higher level filesystem contexts that call zero range typically perform a filemap flush of the target range for other reasons. A couple contexts that don't otherwise need to flush are write file size extension and truncate in XFS. The former path is currently susceptible to the stale data exposure problem and the latter performs a flush specifically to work around it. This is clearly inconsistent and incomplete. As a first step toward correcting behavior, lift the XFS workaround to iomap_zero_range() and unconditionally flush the range before the zero range operation proceeds. While this appears to be a bit of a big hammer, most all users already do this from calling context save for the couple of exceptions noted above. Future patches will optimize or elide this flush while maintaining functional correctness. Fixes: ae259a9c8593 ("fs: introduce iomap infrastructure") Signed-off-by: Brian Foster <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-09-03iomap: add a private argument for iomap_file_buffered_writeJosef Bacik1-1/+1
In order to switch fuse over to using iomap for buffered writes we need to be able to have the struct file for the original write, in case we have to read in the page to make it uptodate. Handle this by using the existing private field in the iomap_iter, and add the argument to iomap_file_buffered_write. This will allow us to pass the file in through the iomap buffered write path, and is flexible for any other file systems needs. Signed-off-by: Josef Bacik <[email protected]> Link: https://lore.kernel.org/r/7f55c7c32275004ba00cddf862d970e6e633f750.1724755651.git.josef@toxicpanda.com Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-09-03xfs: enable block size larger than page size supportPankaj Raghav5-11/+32
Page cache now has the ability to have a minimum order when allocating a folio which is a prerequisite to add support for block size > page size. Signed-off-by: Pankaj Raghav <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]> Link: https://lore.kernel.org/r/[email protected] # fix folded Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-09-03xfs: ensure st_blocks never goes to zero during COW writesChristoph Hellwig2-0/+18
COW writes remove the amount overwritten either directly for delalloc reservations, or in earlier deferred transactions than adding the new amount back in the bmap map transaction. This means st_blocks on an inode where all data is overwritten using the COW path can temporarily show a 0 st_blocks. This can easily be reproduced with the pending zoned device support where all writes use this path and trips the check in generic/615, but could also happen on a reflink file without that. Fix this by temporarily add the pending blocks to be mapped to i_delayed_blks while the item is queued. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: use xas_for_each_marked in xfs_reclaim_inodes_countChristoph Hellwig2-29/+9
xfs_reclaim_inodes_count iterates over all AGs to sum up the reclaimable inodes counts. There is no point in grabbing a reference to the them or unlock the RCU critical section for each iteration, so switch to the more efficient xas_for_each_marked iterator. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: convert perag lookup to xarrayChristoph Hellwig4-52/+35
Convert the perag lookup from the legacy radix tree to the xarray, which allows for much nicer iteration and bulk lookup semantics. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: simplify tagged perag iterationChristoph Hellwig2-40/+35
Pass the old perag structure to the tagged loop helpers so that they can grab the old agno before releasing the reference. This removes the need to separately track the agno and the iterator macro, and thus also obsoletes the for_each_perag_tag syntactic sugar. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: move the tagged perag lookup helpers to xfs_icache.cChristoph Hellwig3-62/+58
The tagged perag helpers are only used in xfs_icache.c in the kernel code and not at all in xfsprogs. Move them to xfs_icache.c in preparation for switching to an xarray, for which I have no plan to implement the tagged lookup functions for userspace. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: use kfree_rcu_mightsleep to free the perag structuresChristoph Hellwig2-14/+1
Using the kfree_rcu_mightsleep is simpler and removes the need for a rcu_head in the perag structure. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: use LIST_HEAD() to simplify codeHongbo Li1-2/+1
list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Signed-off-by: Hongbo Li <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: Remove duplicate xfs_trans_priv.h headerJiapeng Chong1-1/+0
./fs/xfs/libxfs/xfs_defer.c: xfs_trans_priv.h is included more than once. Reported-by: Abaci Robot <[email protected]> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9491 Signed-off-by: Jiapeng Chong <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: remove unnecessary checkDan Carpenter1-1/+1
We checked that "pip" is non-NULL at the start of the if else statement so there is no need to check again here. Delete the check. Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: Use xfs set and clear mp state helpersJohn Garry5-9/+9
Use the set and clear mp state helpers instead of open-coding. It is noted that in some instances calls to atomic operation set_bit() and clear_bit() are being replaced with test_and_set_bit() and test_and_clear_bit(), respectively, as there is no specific helpers for set_bit() and clear_bit() only. However should be ok, as we are just ignoring the returned value from those "test" variants. Signed-off-by: John Garry <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: reclaim speculative preallocations for append only filesChristoph Hellwig3-8/+10
The XFS XFS_DIFLAG_APPEND maps to the VFS S_APPEND flag, which forbids writes that don't append at the current EOF. But the commit originally adding XFS_DIFLAG_APPEND support (commit a23321e766d in xfs xfs-import repository) also checked it to skip releasing speculative preallocations, which doesn't make any sense. Another commit (dd9f438e3290 in the xfs-import repository) later extended that flag to also report these speculation preallocations which should not exist in getbmap. Remove these checks as nothing XFS_DIFLAG_APPEND implies that preallocations beyond EOF should exist, but explicitly check for XFS_DIFLAG_APPEND in xfs_file_release to bypass the algorithm that discard preallocations on the first close as append only files aren't expected to be written to only once. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: simplify extent lookup in xfs_can_free_eofblocksChristoph Hellwig1-15/+7
xfs_can_free_eofblocks just cares if there is an extent beyond EOF. Replace the call to xfs_bmapi_read with a xfs_iext_lookup_extent as we've already checked that extents are read in earlier. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: check XFS_EOFBLOCKS_RELEASED earlier in xfs_release_eofblocksChristoph Hellwig1-3/+2
If the XFS_EOFBLOCKS_RELEASED flag is set, we are not going to free the eofblocks, so don't bother locking the inode or performing the checks in xfs_can_free_eofblocks. Also switch to a test_and_set operation once the iolock has been acquire so that only the caller that sets it actually frees the post-EOF blocks. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: only free posteof blocks on first closeDarrick J. Wong2-23/+13
Certain workloads fragment files on XFS very badly, such as a software package that creates a number of threads, each of which repeatedly run the sequence: open a file, perform a synchronous write, and close the file, which defeats the speculative preallocation mechanism. We work around this problem by only deleting posteof blocks the /first/ time a file is closed to preserve the behavior that unpacking a tarball lays out files one after the other with no gaps. Signed-off-by: Darrick J. Wong <[email protected]> [hch: rebased, updated comment, renamed the flag] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: don't free post-EOF blocks on read closeDave Chinner1-1/+7
When we have a workload that does open/read/close in parallel with other allocation, the file becomes rapidly fragmented. This is due to close() calling xfs_file_release() and removing the speculative preallocation beyond EOF. Add a check for a writable context to xfs_file_release to skip the post-EOF block freeing (an the similarly pointless flushing on truncate down). Before: Test 1: sync write fragmentation counts /mnt/scratch/file.0: 919 /mnt/scratch/file.1: 916 /mnt/scratch/file.2: 919 /mnt/scratch/file.3: 920 /mnt/scratch/file.4: 920 /mnt/scratch/file.5: 921 /mnt/scratch/file.6: 916 /mnt/scratch/file.7: 918 After: Test 1: sync write fragmentation counts /mnt/scratch/file.0: 24 /mnt/scratch/file.1: 24 /mnt/scratch/file.2: 11 /mnt/scratch/file.3: 24 /mnt/scratch/file.4: 3 /mnt/scratch/file.5: 24 /mnt/scratch/file.6: 24 /mnt/scratch/file.7: 23 Signed-off-by: Dave Chinner <[email protected]> [darrick: wordsmithing, fix commit message] Signed-off-by: Darrick J. Wong <[email protected]> [hch: ported to the new ->release code structure] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: skip all of xfs_file_release when shut downChristoph Hellwig1-4/+6
There is no point in trying to free post-EOF blocks when the file system is shutdown, as it will just error out ASAP. Instead return instantly when xfs_file_release is called on a shut down file system. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: don't bother returning errors from xfs_file_releaseChristoph Hellwig1-9/+9
While ->release returns int, the only caller ignores the return value. As we're only doing cleanup work there isn't much of a point in return a value to start with, so just document the situation instead. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: refactor f_op->release handlingChristoph Hellwig3-83/+68
Currently f_op->release is split in not very obvious ways. Fix that by folding xfs_release into xfs_file_release. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-03xfs: remove the i_mode check in xfs_releaseChristoph Hellwig1-3/+0
xfs_release is only called from xfs_file_release, which is wired up as the f_op->release handler for regular files only. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Chandan Babu R <[email protected]>
2024-09-02xfs: make the calculation generic in xfs_sb_validate_fsb_count()Pankaj Raghav1-1/+6
Instead of assuming that PAGE_SHIFT is always higher than the blocklog, make the calculation generic so that page cache count can be calculated correctly for LBS. Signed-off-by: Pankaj Raghav <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Darrick J. Wong <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Daniel Gomez <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-09-02xfs: expose block size in statPankaj Raghav1-1/+1
For block size larger than page size, the unit of efficient IO is the block size, not the page size. Leaving stat() to report PAGE_SIZE as the block size causes test programs like fsx to issue illegal ranges for operations that require block size alignment (e.g. fallocate() insert range). Hence update the preferred IO size to reflect the block size in this case. This change is based on a patch originally from Dave Chinner.[1] [1] https://lwn.net/ml/linux-fsdevel/[email protected]/ Signed-off-by: Pankaj Raghav <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Darrick J. Wong <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Daniel Gomez <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-09-02xfs: use kvmalloc for xattr buffersDave Chinner1-9/+6
Pankaj Raghav reported that when filesystem block size is larger than page size, the xattr code can use kmalloc() for high order allocations. This triggers a useless warning in the allocator as it is a __GFP_NOFAIL allocation here: static inline struct page *rmqueue(struct zone *preferred_zone, struct zone *zone, unsigned int order, gfp_t gfp_flags, unsigned int alloc_flags, int migratetype) { struct page *page; /* * We most definitely don't want callers attempting to * allocate greater than order-1 page units with __GFP_NOFAIL. */ >>>> WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); ... Fix this by changing all these call sites to use kvmalloc(), which will strip the NOFAIL from the kmalloc attempt and if that fails will do a __GFP_NOFAIL vmalloc(). This is not an issue that productions systems will see as filesystems with block size > page size cannot be mounted by the kernel; Pankaj is developing this functionality right now. Reported-by: Pankaj Raghav <[email protected]> Fixes: f078d4ea8276 ("xfs: convert kmem_alloc() to kmalloc()") Signed-off-by: Dave Chinner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Pankaj Raghav <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Reviewed-by: Daniel Gomez <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2024-09-01mm: kvmalloc: align kvrealloc() with krealloc()Danilo Krummrich1-1/+1
Besides the obvious (and desired) difference between krealloc() and kvrealloc(), there is some inconsistency in their function signatures and behavior: - krealloc() frees the memory when the requested size is zero, whereas kvrealloc() simply returns a pointer to the existing allocation. - krealloc() behaves like kmalloc() if a NULL pointer is passed, whereas kvrealloc() does not accept a NULL pointer at all and, if passed, would fault instead. - krealloc() is self-contained, whereas kvrealloc() relies on the caller to provide the size of the previous allocation. Inconsistent behavior throughout allocation APIs is error prone, hence make kvrealloc() behave like krealloc(), which seems superior in all mentioned aspects. Besides that, implementing kvrealloc() by making use of krealloc() and vrealloc() provides oppertunities to grow (and shrink) allocations more efficiently. For instance, vrealloc() can be optimized to allocate and map additional pages to grow the allocation or unmap and free unused pages to shrink the allocation. [[email protected]: document concurrency restrictions] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: disable KASAN when switching to vmalloc] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: properly document __GFP_ZERO behavior] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Danilo Krummrich <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Chandan Babu R <[email protected]> Cc: Christian König <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: David Rientjes <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Kees Cook <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Wedson Almeida Filho <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-09-01xfs: standardize the btree maxrecs function parametersDarrick J. Wong14-33/+40
Standardize the parameters in xfs_{alloc,bm,ino,rmap,refcount}bt_maxrecs so that we have consistent calling conventions. This doesn't affect the kernel that much, but enables us to clean up userspace a bit. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: replace shouty XFS_BM{BT,DR} macrosDarrick J. Wong9-122/+198
Replace all the shouty bmap btree and bmap disk root macros with actual functions. sed \ -e 's/XFS_BMBT_BLOCK_LEN/xfs_bmbt_block_len/g' \ -e 's/XFS_BMBT_REC_ADDR/xfs_bmbt_rec_addr/g' \ -e 's/XFS_BMBT_KEY_ADDR/xfs_bmbt_key_addr/g' \ -e 's/XFS_BMBT_PTR_ADDR/xfs_bmbt_ptr_addr/g' \ -e 's/XFS_BMDR_REC_ADDR/xfs_bmdr_rec_addr/g' \ -e 's/XFS_BMDR_KEY_ADDR/xfs_bmdr_key_addr/g' \ -e 's/XFS_BMDR_PTR_ADDR/xfs_bmdr_ptr_addr/g' \ -e 's/XFS_BMAP_BROOT_PTR_ADDR/xfs_bmap_broot_ptr_addr/g' \ -e 's/XFS_BMAP_BROOT_SPACE_CALC/xfs_bmap_broot_space_calc/g' \ -e 's/XFS_BMAP_BROOT_SPACE/xfs_bmap_broot_space/g' \ -e 's/XFS_BMDR_SPACE_CALC/xfs_bmdr_space_calc/g' \ -e 's/XFS_BMAP_BMDR_SPACE/xfs_bmap_bmdr_space/g' \ -i $(git ls-files fs/xfs/*.[ch] fs/xfs/libxfs/*.[ch] fs/xfs/scrub/*.[ch]) Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: fix a sloppy memory handling bug in xfs_iroot_reallocDarrick J. Wong1-5/+5
While refactoring code, I noticed that when xfs_iroot_realloc tries to shrink a bmbt root block, it allocates a smaller new block and then copies "records" and pointers to the new block. However, bmbt root blocks cannot ever be leaves, which means that it's not technically correct to copy records. We /should/ be copying keys. Note that this has never resulted in actual memory corruption because sizeof(bmbt_rec) == (sizeof(bmbt_key) + sizeof(bmbt_ptr)). However, this will no longer be true when we start adding realtime rmap stuff, so fix this now. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: fix FITRIM reporting againDarrick J. Wong1-1/+1
Don't report FITRIMming more bytes than possibly exist in the filesystem. Fixes: 410e8a18f8e93 ("xfs: don't bother reporting blocks trimmed via FITRIM") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: fix C++ compilation errors in xfs_fs.hDarrick J. Wong1-2/+3
Several people reported C++ compilation errors due to things that C compilers allow but C++ compilers do not. Fix both of these problems, and hope there aren't more of these brown paper bags in 2 months when we finally get these fixes through the process into a released xfsprogs. NOTE: I am submitting this bugfix over the objections of a former maintainer, who insists that we should remove this function from the published userspace ABI instead of fixing the C++ compilation errors. No deprecation period, no discussion, just a hard drop of an already provided and correct C function, which would be in contravention of Linus' rules. IOWs, removing ABI that have already shipped in a released kernel requires a careful deprecation period, so I will let that maintainer run that process. Reported-by: [email protected] Reported-by: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219203 Fixes: 233f4e12bbb2c ("xfs: add parent pointer ioctls") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: refactor loading quota inodes in the regular caseDarrick J. Wong4-36/+81
Create a helper function to load quota inodes in the case where the dqtype and the sb quota inode fields correspond. This is true for nearly all the iget callsites in the quota code, except for when we're switching the group and project quota inodes. We'll need this in subsequent patches to make the metadir handling less convoluted. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: move xfs_ioc_getfsmap out of xfs_ioctl.cDarrick J. Wong3-136/+134
Move this function out of xfs_ioctl.c to reduce the clutter in there, and make the entire getfsmap implementation self-contained in a single file. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: rearrange xfs_fsmap.c a little bitDarrick J. Wong1-134/+134
The order of the functions in this file has gotten a little confusing over the years. Specifically, the two data device implementations (bnobt and rmapbt) could be adjacent in the source code instead of split in two by the logdev and rtdev fsmap implementations. We're about to add more functionality to this file, so rearrange things now. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: replace m_rsumsize with m_rsumblocksChristoph Hellwig7-25/+19
Track the RT summary file size in blocks, just like the RT bitmap file. While we have users of both units, blocks are used slightly more often and this matches the bitmap file for consistency. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: remove xfs_{rtbitmap,rtsummary}_wordcountChristoph Hellwig2-38/+0
xfs_rtbitmap_wordcount and xfs_rtsummary_wordcount are currently unused, so remove them to simplify refactoring other rtbitmap helpers. They can be added back or simply open coded when actually needed. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: add xchk_setup_nothing and xchk_nothing helpersDarrick J. Wong2-40/+18
Add common helpers for no-op scrubbing methods. Signed-off-by: Darrick J. Wong <[email protected]> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: make the rtalloc start hint a xfs_rtblock_tChristoph Hellwig1-7/+11
0 is a valid start RT extent, and with pending changes it will become both more common and non-unique. Switch to pass a xfs_rtblock_t instead so that we can use NULLRTBLOCK to determine if a hint was set or not. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: factor out a xfs_rtallocate_align helperChristoph Hellwig1-34/+59
Split the code to calculate the aligned allocation request from xfs_bmap_rtalloc into a separate self-contained helper. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: rework the rtalloc fallback handlingChristoph Hellwig1-35/+34
xfs_rtallocate currently has two fallbacks, when an allocation fails: 1) drop the requested extent size alignment, if any, and retry 2) ignore the locality hint Oddly enough it does those in order, as trying a different location is more in line with what the user asked for, and does it in a very unstructured way. Lift the fallback to try to allocate without the locality hint into xfs_rtallocate to both perform them in a more sensible order and to clean up the code. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: factor out a xfs_rtallocate helperChristoph Hellwig1-31/+50
Split out a helper from xfs_rtallocate that performs the actual allocation. This keeps the scope of the xfs_rtalloc_args structure contained, and prepares for rtgroups support. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: clean up the ISVALID macro in xfs_bmap_adjacentChristoph Hellwig1-24/+33
Turn the ISVALID macro defined and used inside in xfs_bmap_adjacent that relies on implict context into a proper inline function. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: simplify xfs_rtalloc_query_rangeChristoph Hellwig4-41/+30
There isn't much of a good reason to pass the xfs_rtalloc_rec structures that describe extents to xfs_rtalloc_query_range as we really just want a lower and upper bound xfs_rtxnum_t. Pass the rtxnum directly and simply the interface. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: remove xfs_rtb_to_rtxremChristoph Hellwig2-23/+4
Simplify the number of block number conversion helpers by removing xfs_rtb_to_rtxrem. Any recent compiler is smart enough to eliminate the double divisions if using separate xfs_rtb_to_rtx and xfs_rtb_to_rtxoff calls. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2024-09-01xfs: fix broken variable-sized allocation detection in ↵Darrick J. Wong1-6/+9
xfs_rtallocate_extent_block This function tries to find a suitable free space extent starting from a particular rtbitmap block. Some time ago, I added a clamping function to prevent the free space scans from running off the end of the bitmap, but I didn't quite get the logic right. Let's say there's an allocation request with a minlen of 5 and a maxlen of 32 and we're scanning the last rtbitmap block. If we come within 4 rtx of the end of the rt volume, maxlen will get clamped to 4. If the next 3 rtx are free, we could have satisfied the allocation, but the code setting partial besti/bestlen for "minlen < maxlen" will think that we're doing a non-variable allocation and ignore it. The root of this problem is overwriting maxlen; I should have stuffed the results in a different variable, which would not have introduced this bug. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: reduce excessive clamping of maxlen in xfs_rtallocate_extent_nearDarrick J. Wong1-11/+12
The near rt allocator employs two allocation strategies -- first it tries to allocate at exactly @start. If that fails, it will pivot back and forth around that starting point looking for an appropriately sized free space. However, I clamped maxlen ages ago to prevent the exact allocation scan from running off the end of the rt volume. This, I realize, was excessive. If the allocation request is (say) for 32 rtx but the start position is 5 rtx from the end of the volume, we clamp maxlen to 5. If the exact allocation fails, we then pivot back and forth looking for 5 rtx, even though the original intent was to try to get 32 rtx. If we then find 5 rtx when we could have gotten 32 rtx, we've not done as well as we could have. This may be moot if the caller immediately comes back for more space, but it might not be. Either way, we can do better here. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: clean up xfs_rtallocate_extent_exact a bitDarrick J. Wong1-20/+21
Before we start doing more surgery on the rt allocator, let's clean up the exact allocator so that it doesn't change its arguments and uses the helper introduced in the previous patch. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2024-09-01xfs: refactor aligning bestlen to prodDarrick J. Wong1-11/+15
There are two places in xfs_rtalloc.c where we want to make sure that a count of rt extents is aligned with a particular prod(uct) factor. In one spot, we actually use rounddown(), albeit unnecessarily if prod < 2. In the other case, we open-code this rounding inefficiently by promoting the 32-bit length value to a 64-bit value and then performing a 64-bit division to figure out the subtraction. Refactor this into a single helper that uses the correct types and division method for the type, and skips the division entirely unless prod is large enough to make a difference. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>