aboutsummaryrefslogtreecommitdiff
path: root/fs/xfs/libxfs
AgeCommit message (Collapse)AuthorFilesLines
2021-06-09xfs: Make attr name schemes consistentAllison Henderson3-11/+11
This patch renames the following functions to make the nameing scheme more consistent: xfs_attr_shortform_remove -> xfs_attr_sf_removename xfs_attr_node_remove_name -> xfs_attr_node_removename xfs_attr_set_fmt -> xfs_attr_sf_addname Suggested-by: Darrick J. Wong <[email protected]> Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-09xfs: Fix default ASSERT in xfs_attr_set_iterAllison Henderson1-1/+1
This ASSERT checks for the state value of RM_SHRINK in the set path which should never happen. Change to ASSERT(0); Suggested-by: Darrick J. Wong <[email protected]> Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-08Merge tag 'inode-walk-cleanups-5.14_2021-06-03' of ↵Darrick J. Wong2-6/+6
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2 xfs: clean up incore inode walk functions This ambitious series aims to cleans up redundant inode walk code in xfs_icache.c, hide implementation details of the quotaoff dquot release code, and eliminates indirect function calls from incore inode walks. The first thing it does is to move all the code that quotaoff calls to release dquots from all incore inodes into xfs_icache.c. Next, it separates the goal of an inode walk from the actual radix tree tags that may or may not be involved and drops the kludgy XFS_ICI_NO_TAG thing. Finally, we split the speculative preallocation (blockgc) and quotaoff dquot release code paths into separate functions so that we can keep the implementations cohesive. Christoph suggested last cycle that we 'simply' change quotaoff not to allow deactivating quota entirely, but as these cleanups are to enable one major change in behavior (deferred inode inactivation) I do not want to add a second behavior change (quotaoff) as a dependency. To be blunt: Additional cleanups are not in scope for this series. Next, I made two observations about incore inode radix tree walks -- since there's a 1:1 mapping between the walk goal and the per-inode processing function passed in, we can use the goal to make a direct call to the processing function. Furthermore, the only caller to supply a nonzero iter_flags argument is quotaoff, and there's only one INEW flag. From that observation, I concluded that it's quite possible to remove two parameters from the xfs_inode_walk* function signatures -- the iter_flags, and the execute function pointer. The middle of the series moves the INEW functionality into the one piece (quotaoff) that wants it, and removes the indirect calls. The final observation is that the inode reclaim walk loop is now almost the same as xfs_inode_walk, so it's silly to maintain two copies. Merge the reclaim loop code into xfs_inode_walk. Lastly, refactor the per-ag radix tagging functions since there's duplicated code that can be consolidated. This series is a prerequisite for the next two patchsets, since deferred inode inactivation will add another inode radix tree tag and iterator function to xfs_inode_walk. v2: walk the vfs inode list when running quotaoff instead of the radix tree, then rework the (now completely internal) inode walk function to take the tag as the main parameter. v3: merge the reclaim loop into xfs_inode_walk, then consolidate the radix tree tagging functions v4: rebase to 5.13-rc4 v5: combine with the quotaoff patchset, reorder functions to minimize forward declarations, split inode walk goals from radix tree tags to reduce conceptual confusion v6: start moving the inode cache code towards the xfs_icwalk prefix * tag 'inode-walk-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: refactor per-AG inode tagging functions xfs: merge xfs_reclaim_inodes_ag into xfs_inode_walk_ag xfs: pass struct xfs_eofblocks to the inode scan callback xfs: fix radix tree tag signs xfs: make the icwalk processing functions clean up the grab state xfs: clean up inode state flag tests in xfs_blockgc_igrab xfs: remove indirect calls from xfs_inode_walk{,_ag} xfs: remove iter_flags parameter from xfs_inode_walk_* xfs: move xfs_inew_wait call into xfs_dqrele_inode xfs: separate the dqrele_all inode grab logic from xfs_inode_walk_ag_grab xfs: pass the goal of the incore inode walk to xfs_inode_walk() xfs: rename xfs_inode_walk functions to xfs_icwalk xfs: move the inode walk functions further down xfs: detach inode dquots at the end of inactivation xfs: move the quotaoff dqrele inode walk into xfs_icache.c [djwong: added variable names to function declarations while fixing merge conflicts] Signed-off-by: Darrick J. Wong <[email protected]>
2021-06-08Merge tag 'assorted-fixes-5.14-1_2021-06-03' of ↵Darrick J. Wong4-15/+9
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2 xfs: assorted fixes for 5.14, part 1 This branch contains the first round of various small fixes for 5.14. * tag 'assorted-fixes-5.14-1_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: don't take a spinlock unconditionally in the DIO fastpath xfs: mark xfs_bmap_set_attrforkoff static xfs: Remove redundant assignment to busy xfs: sort variable alphabetically to avoid repeated declaration
2021-06-08Merge tag 'unit-conversion-cleanups-5.14_2021-06-03' of ↵Darrick J. Wong1-1/+1
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-5.14-merge2 xfs: various unit conversions Crafting the realtime file extent size hint fixes revealed various opportunities to clean up unit conversions, so now that gets its own series. * tag 'unit-conversion-cleanups-5.14_2021-06-03' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: remove unnecessary shifts xfs: clean up open-coded fs block unit conversions
2021-06-08xfs: drop the AGI being passed to xfs_check_agi_freecountDave Chinner1-15/+13
From: Dave Chinner <[email protected]> Stephen Rothwell reported this compiler warning from linux-next: fs/xfs/libxfs/xfs_ialloc.c: In function 'xfs_difree_finobt': fs/xfs/libxfs/xfs_ialloc.c:2032:20: warning: unused variable 'agi' [-Wunused-variable] 2032 | struct xfs_agi *agi = agbp->b_addr; Which is fallout from agno -> perag conversions that were done in this function. xfs_check_agi_freecount() is the only user of "agi" in xfs_difree_finobt() now, and it only uses the agi to get the current free inode count. We hold that in the perag structure, so there's not need to directly reference the raw AGI to get this information. The btree cursor being passed to xfs_check_agi_freecount() has a reference to the perag being operated on, so use that directly in xfs_check_agi_freecount() rather than passing an AGI. Fixes: 7b13c5155182 ("xfs: use perag for ialloc btree cursors") Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Carlos Maiolino <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-06-08Merge tag 'xfs-perag-conv-tag' of ↵Darrick J. Wong27-804/+1057
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2 xfs: initial agnumber -> perag conversions for shrink If we want to use active references to the perag to be able to gate shrink removing AGs and hence perags safely, we've got a fair bit of work to do actually use perags in all the places we need to. There's a lot of code that iterates ag numbers and then looks up perags from that, often multiple times for the same perag in the one operation. If we want to use reference counted perags for access control, then we need to convert all these uses to perag iterators, not agno iterators. [Patches 1-4] The first step of this is consolidating all the perag management - init, free, get, put, etc into a common location. THis is spread all over the place right now, so move it all into libxfs/xfs_ag.[ch]. This does expose kernel only bits of the perag to libxfs and hence userspace, so the structures and code is rearranged to minimise the number of ifdefs that need to be added to the userspace codebase. The perag iterator in xfs_icache.c is promoted to a first class API and expanded to the needs of the code as required. [Patches 5-10] These are the first basic perag iterator conversions and changes to pass the perag down the stack from those iterators where appropriate. A lot of this is obvious, simple changes, though in some places we stop passing the perag down the stack because the code enters into an as yet unconverted subsystem that still uses raw AGs. [Patches 11-16] These replace the agno passed in the btree cursor for per-ag btree operations with a perag that is passed to the cursor init function. The cursor takes it's own reference to the perag, and the reference is dropped when the cursor is deleted. Hence we get reference coverage for the entire time the cursor is active, even if the code that initialised the cursor drops it's reference before the cursor or any of it's children (duplicates) have been deleted. The first patch adds the perag infrastructure for the cursor, the next four patches convert a btree cursor at a time, and the last removes the agno from the cursor once it is unused. [Patches 17-21] These patches are a demonstration of the simplifications and cleanups that come from plumbing the perag through interfaces that select and then operate on a specific AG. In this case the inode allocation algorithm does up to three walks across all AGs before it either allocates an inode or fails. Two of these walks are purely just to select the AG, and even then it doesn't guarantee inode allocation success so there's a third walk if the selected AG allocation fails. These patches collapse the selection and allocation into a single loop, simplifies the error handling because xfs_dir_ialloc() always returns ENOSPC if no AG was selected for inode allocation or we fail to allocate an inode in any AG, gets rid of xfs_dir_ialloc() wrapper, converts inode allocation to run entirely from a single perag instance, and then factors xfs_dialloc() into a much, much simpler loop which is easy to understand. Hence we end up with the same inode allocation logic, but it only needs two complete iterations at worst, makes AG selection and allocation atomic w.r.t. shrink and chops out out over 100 lines of code from this hot code path. [Patch 22] Converts the unlink path to pass perags through it. There's more conversion work to be done, but this patchset gets through a large chunk of it in one hit. Most of the iterators are converted, so once this is solidified we can move on to converting these to active references for being able to free perags while the fs is still active. * tag 'xfs-perag-conv-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (23 commits) xfs: remove xfs_perag_t xfs: use perag through unlink processing xfs: clean up and simplify xfs_dialloc() xfs: inode allocation can use a single perag instance xfs: get rid of xfs_dir_ialloc() xfs: collapse AG selection for inode allocation xfs: simplify xfs_dialloc_select_ag() return values xfs: remove agno from btree cursor xfs: use perag for ialloc btree cursors xfs: convert allocbt cursors to use perags xfs: convert refcount btree cursor to use perags xfs: convert rmap btree cursor to using a perag xfs: add a perag to the btree cursor xfs: pass perags around in fsmap data dev functions xfs: push perags through the ag reservation callouts xfs: pass perags through to the busy extent code xfs: convert secondary superblock walk to use perags xfs: convert xfs_iwalk to use perag references xfs: convert raw ag walks to use for_each_perag xfs: make for_each_perag... a first class citizen ...
2021-06-08Merge tag 'xfs-buf-bulk-alloc-tag' of ↵Darrick J. Wong1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs into xfs-5.14-merge2 xfs: buffer cache bulk page allocation This patchset makes use of the new bulk page allocation interface to reduce the overhead of allocating large numbers of pages in a loop. The first two patches are refactoring buffer memory allocation and converting the uncached buffer path to use the same page allocation path, followed by converting the page allocation path to use bulk allocation. The rest of the patches are then consolidation of the page allocation and freeing code to simplify the code and remove a chunk of unnecessary abstraction. This is largely based on a series of changes made by Christoph Hellwig. * tag 'xfs-buf-bulk-alloc-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: xfs: merge xfs_buf_allocate_memory xfs: cleanup error handling in xfs_buf_get_map xfs: get rid of xb_to_gfp() xfs: simplify the b_page_count calculation xfs: remove ->b_offset handling for page backed buffers xfs: move page freeing into _xfs_buf_free_pages() xfs: merge _xfs_buf_get_pages() xfs: use alloc_pages_bulk_array() for buffers xfs: use xfs_buf_alloc_pages for uncached buffers xfs: split up xfs_buf_allocate_memory
2021-06-03xfs: fix radix tree tag signsDarrick J. Wong2-3/+3
Radix tree tags are supposed to be unsigned ints, so fix the callers. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2021-06-02xfs: mark xfs_bmap_set_attrforkoff staticChristoph Hellwig2-2/+1
xfs_bmap_set_attrforkoff is only used inside of xfs_bmap.c, so mark it static. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: Remove redundant assignment to busyJiapeng Chong1-1/+0
Variable busy is set to false, but this value is never read as it is overwritten or not used later on, hence it is a redundant assignment and can be removed. Clean up the following clang-analyzer warning: fs/xfs/libxfs/xfs_alloc.c:1679:2: warning: Value stored to 'busy' is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: sort variable alphabetically to avoid repeated declarationShaokun Zhang1-12/+8
Variable 'xfs_agf_buf_ops', 'xfs_agi_buf_ops', 'xfs_dquot_buf_ops' and 'xfs_symlink_buf_ops' are declared twice, so sort these variables alphabetically and remove the repeated declaration. Cc: "Darrick J. Wong" <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> Reviewed-by: Carlos Maiolino <[email protected] Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: remove xfs_perag_tDave Chinner3-35/+35
Almost unused, gets rid of another typedef. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: use perag through unlink processingDave Chinner2-24/+12
Unlinked lists are held in the perag, and freeing of inodes needs to be passed a perag, too, so look up the perag early in the unlink processing and use it throughout. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-06-02xfs: clean up and simplify xfs_dialloc()Dave Chinner1-118/+153
Because it's a mess. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: inode allocation can use a single perag instanceDave Chinner1-3/+3
Now that we've internalised the two-phase inode allocation, we can now easily make the AG selection and allocation atomic from the perspective of a single perag context. This will ensure AGs going offline/away cannot occur between the selection and allocation steps. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: get rid of xfs_dir_ialloc()Dave Chinner2-30/+14
This is just a simple wrapper around the per-ag inode allocation that doesn't need to exist. The internal mechanism to select and allocate within an AG does not need to be exposed outside xfs_ialloc.c, and it being exposed simply makes it harder to follow the code and simplify it. This is simplified by internalising xf_dialloc_select_ag() and xfs_dialloc_ag() into a single xfs_dialloc() function and then xfs_dir_ialloc() can go away. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: collapse AG selection for inode allocationDave Chinner1-147/+78
xfs_dialloc_select_ag() does a lot of repetitive work. It first calls xfs_ialloc_ag_select() to select the AG to start allocation attempts in, which can do up to two entire loops across the perags that inodes can be allocated in. This is simply checking if there is spce available to allocate inodes in an AG, and it returns when it finds the first candidate AG. xfs_dialloc_select_ag() then does it's own iterative walk across all the perags locking the AGIs and trying to allocate inodes from the locked AG. It also doesn't limit the search to mp->m_maxagi, so it will walk all AGs whether they can allocate inodes or not. Hence if we are really low on inodes, we could do almost 3 entire walks across the whole perag range before we find an allocation group we can allocate inodes in or report ENOSPC. Because xfs_ialloc_ag_select() returns on the first candidate AG it finds, we can simply do these checks directly in xfs_dialloc_select_ag() before we lock and try to allocate inodes. This reduces the inode allocation pass down to 2 perag sweeps at most - one for aligned inode cluster allocation and if we can't allocate full, aligned inode clusters anywhere we'll do another pass trying to do sparse inode cluster allocation. This also removes a big chunk of duplicate code. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: simplify xfs_dialloc_select_ag() return valuesDave Chinner1-15/+8
The only caller of xfs_dialloc_select_ag() will always return -ENOSPC to it's caller if the agbp returned from xfs_dialloc_select_ag() is NULL. IOWs, failure to find a candidate AGI we can allocate inodes from is always an ENOSPC condition, so move this logic up into xfs_dialloc_select_ag() so we can simplify the return logic in this function. xfs_dialloc_select_ag() now only ever returns 0 with a locked agbp, or an error with no agbp. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: remove agno from btree cursorDave Chinner10-116/+111
Now that everything passes a perag, the agno is not needed anymore. Convert all the users to use pag->pag_agno instead and remove the agno from the cursor. This was largely done as an automated search and replace. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: use perag for ialloc btree cursorsDave Chinner3-107/+103
Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: convert allocbt cursors to use peragsDave Chinner3-35/+24
Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: convert refcount btree cursor to use peragsDave Chinner4-34/+41
Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: convert rmap btree cursor to using a peragDave Chinner6-34/+32
Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: add a perag to the btree cursorDave Chinner14-47/+93
Which will eventually completely replace the agno in it. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-06-02xfs: pass perags around in fsmap data dev functionsDave Chinner1-2/+13
Needs a [from, to] ranged AG walk, and the perag to be stuffed into the info structure for callouts to use. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: push perags through the ag reservation calloutsDave Chinner7-23/+23
We currently pass an agno from the AG reservation functions to the individual feature accounting functions, which in future may have to do perag lookups to access per-AG state. Instead, pre-emptively plumb the perag through from the highest AG reservation layer to the feature callouts so they won't have to look it up again. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-06-02xfs: pass perags through to the busy extent codeDave Chinner5-39/+44
All of the callers of the busy extent API either have perag references available to use so we can pass a perag to the busy extent functions rather than having them have to do unnecessary lookups. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: convert secondary superblock walk to use peragsDave Chinner1-5/+7
Clean up the last external manual AG walk. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: convert xfs_iwalk to use perag referencesDave Chinner1-6/+10
Rather than manually walking the ags and passing agnunbers around, pass the perag for the AG we are currently working on around in the iwalk structure. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: convert raw ag walks to use for_each_peragDave Chinner1-1/+3
Convert the raw walks to an iterator, pulling the current AG out of pag->pag_agno instead of the loop iterator variable. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: make for_each_perag... a first class citizenDave Chinner1-0/+17
for_each_perag_tag() is defined in xfs_icache.c for local use. Promote this to xfs_ag.h and define equivalent iteration functions so that we can use them to iterate AGs instead to replace open coded perag walks and perag lookups. We also convert as many of the straight forward open coded AG walks to use these iterators as possible. Anything that is not a direct conversion to an iterator is ignored and will be updated in future commits. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: move perag structure and setup to libxfs/xfs_ag.[ch]Dave Chinner4-2/+247
Move the xfs_perag infrastructure to the libxfs files that contain all the per AG infrastructure. This helps set up for passing perags around all the code instead of bare agnos with minimal extra includes for existing files. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-02xfs: move xfs_perag_get/put to xfs_ag.[ch]Dave Chinner13-149/+154
They are AG functions, not superblock functions, so move them to the appropriate location. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-01xfs: clean up open-coded fs block unit conversionsDarrick J. Wong1-1/+1
Replace some open-coded fs block unit conversions with the standard conversion macro. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Carlos Maiolino <[email protected]>
2021-06-01xfs: Clean up xfs_attr_node_addname_clear_incompleteAllison Henderson1-8/+3
We can use the helper function xfs_attr_node_remove_name to reduce duplicate code in this function Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-01xfs: Remove xfs_attr_rmtval_setAllison Henderson2-67/+0
This function is no longer used, so it is safe to remove Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2021-06-01xfs: Add delay ready attr set routinesAllison Henderson4-219/+610
This patch modifies the attr set routines to be delay ready. This means they no longer roll or commit transactions, but instead return -EAGAIN to have the calling routine roll and refresh the transaction. In this series, xfs_attr_set_args has become xfs_attr_set_iter, which uses a state machine like switch to keep track of where it was when EAGAIN was returned. See xfs_attr.h for a more detailed diagram of the states. Two new helper functions have been added: xfs_attr_rmtval_find_space and xfs_attr_rmtval_set_blk. They provide a subset of logic similar to xfs_attr_rmtval_set, but they store the current block in the delay attr context to allow the caller to roll the transaction between allocations. This helps to simplify and consolidate code used by xfs_attr_leaf_addname and xfs_attr_node_addname. xfs_attr_set_args has now become a simple loop to refresh the transaction until the operation is completed. Lastly, xfs_attr_rmtval_remove is no longer used, and is removed. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-06-01xfs: Add delay ready attr remove routinesAllison Henderson5-85/+326
This patch modifies the attr remove routines to be delay ready. This means they no longer roll or commit transactions, but instead return -EAGAIN to have the calling routine roll and refresh the transaction. In this series, xfs_attr_remove_args is merged with xfs_attr_node_removename become a new function, xfs_attr_remove_iter. This new version uses a sort of state machine like switch to keep track of where it was when EAGAIN was returned. A new version of xfs_attr_remove_args consists of a simple loop to refresh the transaction until the operation is completed. A new XFS_DAC_DEFER_FINISH flag is used to finish the transaction where ever the existing code used to. Calls to xfs_attr_rmtval_remove are replaced with the delay ready version __xfs_attr_rmtval_remove. We will rename __xfs_attr_rmtval_remove back to xfs_attr_rmtval_remove when we are done. xfs_attr_rmtval_remove itself is still in use by the set routines (used during a rename). For reasons of preserving existing function, we modify xfs_attr_rmtval_remove to call xfs_defer_finish when the flag is set. Similar to how xfs_attr_remove_args does here. Once we transition the set routines to be delay ready, xfs_attr_rmtval_remove is no longer used and will be removed. This patch also adds a new struct xfs_delattr_context, which we will use to keep track of the current state of an attribute operation. The new xfs_delattr_state enum is used to track various operations that are in progress so that we know not to repeat them, and resume where we left off before EAGAIN was returned to cycle out the transaction. Other members take the place of local variables that need to retain their values across multiple function calls. See xfs_attr.h for a more detailed diagram of the states. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-01xfs: Hoist node transaction handlingAllison Henderson1-26/+29
This patch basically hoists the node transaction handling around the leaf code we just hoisted. This will helps setup this area for the state machine since the goto is easily replaced with a state since it ends with a transaction roll. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2021-06-01xfs: Hoist xfs_attr_leaf_addnameAllison Henderson1-113/+96
This patch hoists xfs_attr_leaf_addname into the calling function. The goal being to get all the code that will require state management into the same scope. This isn't particularly aesthetic right away, but it is a preliminary step to merging in the state machine code. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-06-01xfs: Hoist xfs_attr_node_addnameAllison Henderson1-84/+75
This patch hoists the later half of xfs_attr_node_addname into the calling function. We do this because it is this area that will need the most state management, and we want to keep such code in the same scope as much as possible Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2021-06-01xfs: Add helper xfs_attr_node_addname_find_attrAllison Henderson1-33/+54
This patch separates the first half of xfs_attr_node_addname into a helper function xfs_attr_node_addname_find_attr. It also replaces the restart goto with an EAGAIN return code driven by a loop in the calling function. This looks odd now, but will clean up nicly once we introduce the state machine. It will also enable hoisting the last state out of xfs_attr_node_addname with out having to plumb in a "done" parameter to know if we need to move to the next state or not. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2021-06-01xfs: Separate xfs_attr_node_addname and xfs_attr_node_addname_clear_incompleteAllison Henderson1-0/+23
This patch separate xfs_attr_node_addname into two functions. This will help to make it easier to hoist parts of xfs_attr_node_addname that need state management Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-01xfs: Refactor xfs_attr_set_shortformAllison Henderson1-28/+14
This patch is actually the combination of patches from the previous version (v18). Initially patch 3 hoisted xfs_attr_set_shortform, and the next added the helper xfs_attr_set_fmt. xfs_attr_set_fmt is similar the old xfs_attr_set_shortform. It returns 0 when the attr has been set and no further action is needed. It returns -EAGAIN when shortform has been transformed to leaf, and the calling function should proceed the set the attr in leaf form. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Chandan Babu R <[email protected]>
2021-06-01xfs: Add xfs_attr_node_remove_nameAllison Henderson1-9/+20
This patch pulls a new helper function xfs_attr_node_remove_name out of xfs_attr_node_remove_step. This helps to modularize xfs_attr_node_remove_step which will help make the delayed attribute code easier to follow Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-06-01xfs: Reverse apply 72b97ea40dAllison Henderson1-19/+9
Originally we added this patch to help modularize the attr code in preparation for delayed attributes and the state machine it requires. However, later reviews found that this slightly alters the transaction handling as the helper function is ambiguous as to whether the transaction is diry or clean. This may cause a dirty transaction to be included in the next roll, where previously it had not. To preserve the existing code flow, we reverse apply this commit. Signed-off-by: Allison Henderson <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Chandan Babu R <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-06-01xfs: use xfs_buf_alloc_pages for uncached buffersDave Chinner1-1/+0
Use the newly factored out page allocation code. This adds automatic buffer zeroing for non-read uncached buffers. This also allows us to greatly simply the error handling in xfs_buf_get_uncached(). Because xfs_buf_alloc_pages() cleans up partial allocation failure, we can just call xfs_buf_free() in all error cases now to clean up after failures. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2021-05-27xfs: bunmapi has unnecessary AG lock ordering issuesDave Chinner1-11/+0
large directory block size operations are assert failing because xfs_bunmapi() is not completely removing fragmented directory blocks like so: XFS: Assertion failed: done, file: fs/xfs/libxfs/xfs_dir2.c, line: 677 .... Call Trace: xfs_dir2_shrink_inode+0x1a8/0x210 xfs_dir2_block_to_sf+0x2ae/0x410 xfs_dir2_block_removename+0x21a/0x280 xfs_dir_removename+0x195/0x1d0 xfs_rename+0xb79/0xc50 ? avc_has_perm+0x8d/0x1a0 ? avc_has_perm_noaudit+0x9a/0x120 xfs_vn_rename+0xdb/0x150 vfs_rename+0x719/0xb50 ? __lookup_hash+0x6a/0xa0 do_renameat2+0x413/0x5e0 __x64_sys_rename+0x45/0x50 do_syscall_64+0x3a/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae We are aborting the bunmapi() pass because of this specific chunk of code: /* * Make sure we don't touch multiple AGF headers out of order * in a single transaction, as that could cause AB-BA deadlocks. */ if (!wasdel && !isrt) { agno = XFS_FSB_TO_AGNO(mp, del.br_startblock); if (prev_agno != NULLAGNUMBER && prev_agno > agno) break; prev_agno = agno; } This is designed to prevent deadlocks in AGF locking when freeing multiple extents by ensuring that we only ever lock in increasing AG number order. Unfortunately, this also violates the "bunmapi will always succeed" semantic that some high level callers depend on, such as xfs_dir2_shrink_inode(), xfs_da_shrink_inode() and xfs_inactive_symlink_rmt(). This AG lock ordering was introduced back in 2017 to fix deadlocks triggered by generic/299 as reported here: https://lore.kernel.org/linux-xfs/[email protected]/ This codebase is old enough that it was before we were defering all AG based extent freeing from within xfs_bunmapi(). THat is, we never actually lock AGs in xfs_bunmapi() any more - every non-rt based extent free is added to the defer ops list, as is all BMBT block freeing. And RT extents are not RT based, so there's no lock ordering issues associated with them. Hence this AGF lock ordering code is both broken and dead. Let's just remove it so that the large directory block code works reliably again. Tested against xfs/538 and generic/299 which is the original test that exposed the deadlocks that this code fixed. Fixes: 5b094d6dac04 ("xfs: fix multi-AG deadlock in xfs_bunmapi") Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2021-05-27xfs: btree format inode forks can have zero extentsDave Chinner1-1/+0
xfs/538 is assert failing with this trace when testing with directory block sizes of 64kB: XFS: Assertion failed: !xfs_need_iread_extents(ifp), file: fs/xfs/libxfs/xfs_bmap.c, line: 608 .... Call Trace: xfs_bmap_btree_to_extents+0x2a9/0x470 ? kmem_cache_alloc+0xe7/0x220 __xfs_bunmapi+0x4ca/0xdf0 xfs_bunmapi+0x1a/0x30 xfs_dir2_shrink_inode+0x71/0x210 xfs_dir2_block_to_sf+0x2ae/0x410 xfs_dir2_block_removename+0x21a/0x280 xfs_dir_removename+0x195/0x1d0 xfs_remove+0x244/0x460 xfs_vn_unlink+0x53/0xa0 ? selinux_inode_unlink+0x13/0x20 vfs_unlink+0x117/0x220 do_unlinkat+0x1a2/0x2d0 __x64_sys_unlink+0x42/0x60 do_syscall_64+0x3a/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae This is a check to ensure that the extents have been read into memory before we are doing a ifork btree manipulation. This assert is bogus in the above case. We have a fragmented directory block that has more extents in it than can fit in extent format, so the inode data fork is in btree format. xfs_dir2_shrink_inode() asks to remove all remaining 16 filesystem blocks from the inode so it can convert to short form, and __xfs_bunmapi() removes all the extents. We now have a data fork in btree format but have zero extents in the fork. This incorrectly trips the xfs_need_iread_extents() assert because it assumes that an empty extent btree means the extent tree has not been read into memory yet. This is clearly not the case with xfs_bunmapi(), as it has an explicit call to xfs_iread_extents() in it to pull the extents into memory before it starts unmapping. Also, the assert directly after this bogus one is: ASSERT(ifp->if_format == XFS_DINODE_FMT_BTREE); Which covers the context in which it is legal to call xfs_bmap_btree_to_extents just fine. Hence we should just remove the bogus assert as it is clearly wrong and causes a regression. The returns the test behaviour to the pre-existing assert failure in xfs_dir2_shrink_inode() that indicates xfs_bunmapi() has failed to remove all the extents in the range it was asked to unmap. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>