path: root/fs/xfs/libxfs/xfs_rmap_btree.c
Age | Commit message | Author | Files | Lines
2020-07-28 | xfs: Remove kmem_zone_zalloc() usage | Carlos Maiolino | 1 | -1/+1
Use kmem_cache_zalloc() directly, with the exception of xlog_ticket_alloc(), which will be dealt with in the next patch for readability. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Carlos Maiolino <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2020-07-14 | xfs: get rid of unnecessary xfs_perag_{get,put} pairs | Gao Xiang | 1 | -5/+4
In the course of some operations, we look up the perag from the mount multiple times to get or change perag information. These are often very short pieces of code, so while the lookup cost is generally low, the cost of the lookup is far higher than the cost of the operation we are doing on the perag. Since we changed buffers to hold references to the perag they are cached in, many modification contexts already hold active references to the perag that are held across these operations. This is especially true for any operation that is serialised by an allocation group header buffer. In these cases, we can just use the buffer's reference to the perag to avoid needing to do lookups to access the perag. This means that many operations don't need to do perag lookups at all to access the perag because they've already looked up objects that own persistent references and hence can use that reference instead. Cc: Dave Chinner <[email protected]> Cc: "Darrick J. Wong" <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2020-03-18 | xfs: add support for rmap btree staging cursors | Darrick J. Wong | 1 | -10/+57
Add support for btree staging cursors for the rmap btrees. This is needed both for online repair and also to convert xfs_repair to use btree bulk loading. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2020-03-13 | xfs: convert btree cursor ag-private member name | Dave Chinner | 1 | -14/+14
bc_private.a -> bc_ag conversion via script: `sed -i 's/bc_private\.a/bc_ag/g' fs/xfs/*[ch] fs/xfs/*/*[ch]` And then revert the change to the bc_ag #define in fs/xfs/libxfs/xfs_btree.h manually. Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2020-03-11 | xfs: remove XFS_BUF_TO_AGF | Christoph Hellwig | 1 | -6/+6
Just dereference bp->b_addr directly and make the code a little simpler and more clear. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Eric Sandeen <[email protected]> Reviewed-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2019-06-28 | xfs: remove unused header files | Eric Sandeen | 1 | -4/+0
There are many, many xfs header files which are included but unneeded (or included twice) in the xfs code, so remove them. nb: xfs_linux.h includes about 9 headers for everyone, so those explicit includes get removed by this. I'm not sure what the preference is, but if we wanted explicit includes everywhere, a followup patch could remove those xfs_*.h includes from xfs_linux.h and move them into the files that need them. Or it could be left as-is. Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2019-06-28 | xfs: add struct xfs_mount pointer to struct xfs_buf | Christoph Hellwig | 1 | -1/+1
We need to derive the mount pointer from a buffer in a lot of places. Add a direct pointer to shortcut the pointer chasing. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2019-05-20 | xfs: don't reserve per-AG space for an internal log | Darrick J. Wong | 1 | -0/+9
It turns out that the log can consume nearly all the space in an AG, and when this happens it's possible that there will be less free space in the AG than the reservation would try to hide. On a debug kernel this can trigger an ASSERT in xfs/250:
XFS: Assertion failed: xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved + xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <= pag->pagf_freeblks + pag->pagf_flcount, file: fs/xfs/libxfs/xfs_ag_resv.c, line: 319
The log is permanently allocated, so we know we're never going to have to expand the btrees to hold any records associated with the log space. We therefore can treat the space as if it doesn't exist. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Eric Sandeen <[email protected]>
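The idea in this message can be sketched in a few lines of C. This is an illustration only, not the kernel code: the function name `resv_basis_blocks` and all of its parameters are hypothetical, and the sketch assumes the reservation is sized from a per-AG block count from which the internal log's blocks can simply be subtracted.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of blocks in AG `agno` that a reservation
 * should be sized from. If the log is internal and lives in this AG, its
 * blocks are permanently allocated, so they are treated as if they did
 * not exist.
 */
static uint32_t resv_basis_blocks(uint32_t ag_len, uint32_t agno,
				  uint32_t log_agno, uint32_t log_blocks,
				  int log_is_internal)
{
	if (log_is_internal && agno == log_agno) {
		/* The log cannot be larger than the AG that holds it. */
		assert(log_blocks <= ag_len);
		return ag_len - log_blocks;
	}
	return ag_len;
}
```

With a 1000-block AG holding a 900-block internal log, the sketch sizes the reservation from 100 blocks instead of 1000, which is the situation the ASSERT above was tripping over.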
2019-02-11 | xfs: miscellaneous verifier magic value fixups | Brian Foster | 1 | -1/+2
Most buffer verifiers have hardcoded magic value checks conditionalized on the version of the filesystem. The magic value field of the verifier structure facilitates abstraction of some of this code. Populate the ->magic field of various verifiers to take advantage of this abstraction. No functional changes. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2018-07-29 | xfs: pass transaction lock while setting up agresv on cyclic metadata | Darrick J. Wong | 1 | -2/+3
Pass a transaction pointer through to all helpers that calculate the per-AG block reservation. Online repair will use this to reinitialize per-ag reservations while it still holds all the AG headers locked to the repair transaction. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2018-06-06 | xfs: convert to SPDX license tags | Dave Chinner | 1 | -13/+1
Remove the verbose license text from XFS files and replace them with SPDX tags. This does not change the license of any of the code, merely refers to the common, up-to-date license files in LICENSES/

This change was mostly scripted. fs/xfs/Makefile and fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected and modified by the following command:

for f in `git grep -l "GNU General" fs/xfs/` ; do
	echo $f
	cat $f | awk -f hdr.awk > $f.new
	mv -f $f.new $f
done

And the hdr.awk script that did the modification (including detecting the difference between GPL-2.0 and GPL-2.0+ licenses) is as follows:

$ cat hdr.awk
BEGIN {
	hdr = 1.0
	tag = "GPL-2.0"
	str = ""
}

/^ \* This program is free software/ {
	hdr = 2.0;
	next
}

/any later version./ {
	tag = "GPL-2.0+"
	next
}

/^ \*\// {
	if (hdr > 0.0) {
		print "// SPDX-License-Identifier: " tag
		print str
		print $0
		str=""
		hdr = 0.0
		next
	}
	print $0
	next
}

/^ \* / {
	if (hdr > 1.0)
		next
	if (hdr > 0.0) {
		if (str != "")
			str = str "\n"
		str = str $0
		next
	}
	print $0
	next
}

/^ \*/ {
	if (hdr > 0.0)
		next
	print $0
	next
}

// {
	if (hdr > 0.0) {
		if (str != "")
			str = str "\n"
		str = str $0
		next
	}
	print $0
}

END { }
$

Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2018-06-04 | xfs: don't ASSERT on short form btree root pointer of zero | Darrick J. Wong | 1 | -1/+0
Don't ASSERT if the short form btree root pointer is zero. Now that we use xfs_verify_agbno to check all short form btree pointers, we'll let that log the error and pass it to the upper layers. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2018-04-09 | xfs: non-scrub - remove unused function parameters | Eric Sandeen | 1 | -3/+2
Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2018-03-11 | xfs: account only rmapbt-used blocks against rmapbt perag res | Brian Foster | 1 | -0/+4
The rmapbt perag metadata reservation reserves blocks for the reverse mapping btree (rmapbt). Since the rmapbt uses blocks from the agfl and perag accounting is updated as blocks are allocated from the allocation btrees, the reservation actually accounts blocks as they are allocated to (or freed from) the agfl rather than the rmapbt itself.

While this works for blocks that are eventually used for the rmapbt, not all agfl blocks are destined for the rmapbt. Blocks that are allocated to the agfl (and thus "reserved" for the rmapbt) but then used by another structure lead to a growing inconsistency over time between the runtime tracking of rmapbt usage vs. actual rmapbt usage. Since the runtime tracking thinks all agfl blocks are rmapbt blocks, it essentially believes that less future reservation is required to satisfy the rmapbt than what is actually necessary.

The inconsistency is rectified across mount cycles because the perag reservation is initialized based on the actual rmapbt usage at mount time. The problem, however, is that the excessive drain of the reservation at runtime opens a window to allocate blocks for other purposes that might be required for the rmapbt on a subsequent mount. This problem can be demonstrated by a simple test that runs an allocation workload to consume agfl blocks over time and then observes the difference in the agfl reservation requirement across an unmount/mount cycle:

mount ...: xfs_ag_resv_init: ... resv 3193 ask 3194 len 3194
...      : xfs_ag_resv_alloc_extent: ... resv 2957 ask 3194 len 1
umount...: xfs_ag_resv_free: ... resv 2956 ask 3194 len 0
mount ...: xfs_ag_resv_init: ... resv 3052 ask 3194 len 3194

As the above tracepoints show, the reservation requirement reduces from 3194 blocks to 2956 blocks as the workload runs. Without any other changes in the filesystem, the same reservation requirement jumps from 2956 to 3052 blocks over a umount/mount cycle.
To address this divergence, update the RMAPBT reservation to account blocks used for the rmapbt only rather than all blocks filled into the agfl. This patch makes several high-level changes toward that end:

1.) Reintroduce an AGFL reservation type to serve as an accounting no-op for blocks allocated to (or freed from) the AGFL.
2.) Invoke RMAPBT usage accounting from the actual rmapbt block allocation path rather than the AGFL allocation path.

The first change is required because agfl blocks are considered free blocks throughout their lifetime. The perag reservation subsystem is invoked unconditionally by the allocation subsystem, so we need a way to tell the perag subsystem (via the allocation subsystem) to not make any accounting changes for blocks filled into the AGFL. The second change causes the in-core RMAPBT reservation usage accounting to remain consistent with the on-disk state at all times and eliminates the risk of leaving the rmapbt reservation underfilled. Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
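The two changes described above can be illustrated with a small C sketch. It is not the kernel implementation: `enum resv_type`, `struct resv` and `resv_alloc_extent` are hypothetical names, and the sketch assumes a single counter of blocks still held back for the rmapbt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reservation types, modeled on the ones named in the message. */
enum resv_type { RESV_NONE, RESV_AGFL, RESV_RMAPBT };

struct resv {
	uint32_t reserved;	/* blocks still held back for the rmapbt */
};

/*
 * Called on every allocation. Blocks moved into the AGFL stay "free",
 * so that type is a no-op; only blocks that really become part of the
 * rmapbt drain the reservation.
 */
static void resv_alloc_extent(struct resv *rmapbt, enum resv_type type,
			      uint32_t len)
{
	assert(rmapbt);
	switch (type) {
	case RESV_RMAPBT:
		/* Never drain below zero. */
		rmapbt->reserved -= (len < rmapbt->reserved) ?
				    len : rmapbt->reserved;
		break;
	case RESV_AGFL:
	case RESV_NONE:
	default:
		/* No accounting change. */
		break;
	}
}
```

In this sketch, filling four blocks into the AGFL leaves `reserved` untouched, while allocating one block to the rmapbt lowers it by one, so the in-memory count tracks actual rmapbt usage.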
2018-03-11 | Cleanup old XFS_BTREE_* traces | Carlos Maiolino | 1 | -7/+1
Remove unused legacy btree traces from IRIX era. Signed-off-by: Carlos Maiolino <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2018-01-08 | xfs: create a new buf_ops pointer to verify structure metadata | Darrick J. Wong | 1 | -0/+1
Expose all metadata structure buffer verifier functions via buf_ops. These will be used by the online scrub mechanism to look for problems with buffers that are already sitting around in memory. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2018-01-08 | xfs: refactor verifier callers to print address of failing check | Darrick J. Wong | 1 | -5/+13
Refactor the callers of verifiers to print the instruction address of a failing check. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2018-01-08 | xfs: have buffer verifier functions report failing address | Darrick J. Wong | 1 | -9/+11
Modify each function that checks the contents of a metadata buffer to return the instruction address of the failing test so that we can report more precise failure errors to the log. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2018-01-08 | xfs: refactor xfs_verifier_error and xfs_buf_ioerror | Darrick J. Wong | 1 | -7/+4
Since all verification errors also mark the buffer as having an error, we can combine these two calls. Later we'll add a xfs_failaddr_t parameter to promote the idea of reporting corruption errors and the address of the failing check to enable better debugging reports. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]>
2017-06-19 | xfs: always compile the btree inorder check functions | Darrick J. Wong | 1 | -4/+0
The btree record and key inorder check functions will be used by the btree scrubber code, so make sure they're always built. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2017-06-19 | xfs: remove double-underscore integer types | Darrick J. Wong | 1 | -15/+15
This is a purely mechanical patch that removes the private __{u,}int{8,16,32,64}_t typedefs in favor of using the system {u,}int{8,16,32,64}_t typedefs. This is the sed script used to perform the transformation and fix the resulting whitespace and indentation errors:

s/typedef\t__uint8_t/typedef __uint8_t\t/g
s/typedef\t__uint/typedef __uint/g
s/typedef\t__int\([0-9]*\)_t/typedef int\1_t\t/g
s/__uint8_t\t/__uint8_t\t\t/g
s/__uint/uint/g
s/__int\([0-9]*\)_t\t/__int\1_t\t\t/g
s/__int/int/g
/^typedef.*int[0-9]*_t;$/d

Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2017-01-03 | xfs: use the actual AG length when reserving blocks | Darrick J. Wong | 1 | -7/+7
We need to use the actual AG length when making per-AG reservations, since we could otherwise end up reserving more blocks out of the last AG than there are actual blocks. Complained-about-by: Brian Foster <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
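Why the last AG matters can be shown with a short C sketch. It is an illustration under stated assumptions, not the kernel code: `ag_length` and `ag_resv_blocks` are hypothetical helpers, the parameters stand for the filesystem's total block count, nominal AG size and AG count, and the 1/100 reservation fraction is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: length in blocks of allocation group `agno`.
 * Every AG except the last spans `agblocks`; the last one holds
 * whatever remains of the `dblocks` total.
 */
static uint64_t ag_length(uint64_t dblocks, uint32_t agblocks,
			  uint32_t agcount, uint32_t agno)
{
	assert(agcount > 0 && agno < agcount);
	if (agno < agcount - 1)
		return agblocks;
	return dblocks - (uint64_t)(agcount - 1) * agblocks;
}

/* Illustrative reservation: 1/100 of the AG's real length (made-up ratio). */
static uint64_t ag_resv_blocks(uint64_t dblocks, uint32_t agblocks,
			       uint32_t agcount, uint32_t agno)
{
	return ag_length(dblocks, agblocks, agcount, agno) / 100;
}
```

For a 2500-block filesystem with 1000-block AGs, the third AG has only 500 blocks, so sizing its reservation from the nominal 1000 would ask for twice what the actual length justifies.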
2016-12-05 | xfs: make xfs btree stats less huge | Dave Chinner | 1 | -0/+1
Embedding a switch statement in every btree stats inc/add adds a lot of code overhead to the core btree infrastructure paths. Stats are supposed to be small and lightweight, but the btree stats have become big and bloated as we've added more btrees. It needs fixing because the reflink code will just add more overhead again.

Convert the v2 btree stats to arrays instead of independent variables, and instead use the type to index the specific btree array via an enum. This allows us to use array based indexing to update the stats, rather than having to dereference variables specific to the btree type.

If we then wrap the xfsstats structure in a union and place a uint32_t array beside it, and calculate the correct btree stats array base index when creating a btree cursor, we can easily access entries in the stats structure without having to switch names based on the btree type.

We then replace the switch statement with a simple set of stats wrapper macros, resulting in a significant simplification of the btree stats code, and:

   text    data     bss     dec     hex filename
  48905     144       8   49057    bfa1 fs/xfs/libxfs/xfs_btree.o.old
  36793     144       8   36945    9051 fs/xfs/libxfs/xfs_btree.o

it reduces the core btree infrastructure code size by close to 25%! Signed-off-by: Dave Chinner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
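The union-plus-base-index technique described in this message can be sketched as follows. This is a minimal sketch, not the kernel's structures: the enums, `struct named_stats`, `union stats`, `struct cursor`, `cursor_init` and `BT_STATS_INC` are all hypothetical names chosen for the example, and only three btree types with three counters each are modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical btree types and per-btree counters (not the kernel's lists). */
enum bt_type { BT_BNO, BT_CNT, BT_RMAP, BT_NTYPES };
enum bt_stat { ST_LOOKUP, ST_INSREC, ST_DELREC, ST_NSTATS };

/* Named counters, laid out as one group of ST_NSTATS per btree type. */
struct named_stats {
	uint32_t bno_lookup, bno_insrec, bno_delrec;
	uint32_t cnt_lookup, cnt_insrec, cnt_delrec;
	uint32_t rmap_lookup, rmap_insrec, rmap_delrec;
};

/* Overlay a flat array on the named counters so they can be indexed. */
union stats {
	struct named_stats s;
	uint32_t a[BT_NTYPES * ST_NSTATS];
};

struct cursor {
	union stats *stats;
	unsigned int statoff;	/* base index, computed once per cursor */
};

static void cursor_init(struct cursor *cur, union stats *st, enum bt_type type)
{
	assert(type < BT_NTYPES);
	cur->stats = st;
	cur->statoff = (unsigned int)type * ST_NSTATS;
}

/* One array update replaces a switch on the btree type. */
#define BT_STATS_INC(cur, stat) \
	((cur)->stats->a[(cur)->statoff + (stat)]++)
```

After `cursor_init(&cur, &st, BT_RMAP)`, a call to `BT_STATS_INC(&cur, ST_INSREC)` bumps the same memory as `st.s.rmap_insrec`, with no branch on the btree type at the update site.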
2016-10-05 | xfs: preallocate blocks for worst-case btree expansion | Darrick J. Wong | 1 | -0/+60
To gracefully handle the situation where a CoW operation turns a single refcount extent into a lot of tiny ones and then runs out of space when a tree split has to happen, use the per-AG reserved block pool to pre-allocate all the space we'll ever need for a maximal btree. For a 4K block size, this only costs an overhead of 0.3% of available disk space. When reflink is enabled, we have an unfortunate problem with rmap -- since we can share a block billions of times, this means that the reverse mapping btree can expand basically infinitely. When an AG is so full that there are no free blocks with which to expand the rmapbt, the filesystem will shut down hard. This is rather annoying to the user, so use the AG reservation code to reserve a "reasonable" amount of space for rmap. We'll prevent reflinks and CoW operations if we think we're getting close to exhausting an AG's free space rather than shutting down, but this permanent reservation should be enough for "most" users. Hopefully. Signed-off-by: Darrick J. Wong <[email protected]> [[email protected]: ensure that we invalidate the freed btree buffer] Signed-off-by: Christoph Hellwig <[email protected]>
2016-10-03 | xfs: introduce refcount btree definitions | Darrick J. Wong | 1 | -2/+20
Add new per-AG refcount btree definitions to the per-AG structures. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2016-08-17 | xfs: store rmapbt block count in the AGF | Darrick J. Wong | 1 | -0/+6
Track the number of blocks used for the rmapbt in the AGF. When we get to the AG reservation code we need this counter to quickly make our reservation during mount. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03 | xfs: remove the get*keys and update_keys btree ops pointers | Darrick J. Wong | 1 | -4/+0
These are internal btree functions; we don't need them to be dispatched via function pointers. Make them static again and just check the overlapped flag to figure out what we need to do. The strategy behind this patch was suggested by Christoph. Signed-off-by: Darrick J. Wong <[email protected]> Suggested-by: Christoph Hellwig <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
2016-08-03 | xfs: support overlapping intervals in the rmap btree | Darrick J. Wong | 1 | -2/+66
Now that the generic btree code supports overlapping intervals, plug in the rmap btree to this functionality. We will need it to find potential left neighbors in xfs_rmap_{alloc,free} later in the patch set. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
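What "overlapping intervals" means for a lookup can be sketched in C. This is a simplified, one-dimensional illustration and not the rmap btree code: `struct extent_rec`, `rec_high` and `rec_overlaps` are hypothetical, and the sketch assumes each record covers the block range from `start` to `start + len - 1`, with that last block acting as its high key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record: an extent of `len` blocks beginning at `start`. */
struct extent_rec {
	uint32_t start;
	uint32_t len;
};

/* Highest block covered by the record; serves as its "high key". */
static uint32_t rec_high(const struct extent_rec *r)
{
	assert(r && r->len > 0);
	return r->start + r->len - 1;
}

/* True if the record shares at least one block with the range [lo, hi]. */
static bool rec_overlaps(const struct extent_rec *r, uint32_t lo, uint32_t hi)
{
	return r->start <= hi && rec_high(r) >= lo;
}
```

When records may overlap, comparing start blocks alone cannot tell whether a record to the left reaches into the range being queried, which is why a search for a left neighbor also has to consult the high end of each record.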
2016-08-03 | xfs: add rmap btree operations | Darrick J. Wong | 1 | -0/+267
Originally-From: Dave Chinner <[email protected]> Implement the generic btree operations needed to manipulate rmap btree blocks. This is very similar to the per-ag freespace btree implementation, and uses the AGFL for allocation and freeing of blocks. Adapt the rmap btree to store owner offsets within each rmap record, and to handle the primary key being redefined as the tuple [agblk, owner, offset]. The expansion of the primary key is crucial to allowing multiple owners per extent. [darrick: adapt the btree ops to deal with offsets] [darrick: remove init_rec_from_key] [darrick: move unwritten bit to rm_offset] Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Signed-off-by: Dave Chinner <[email protected]>
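Ordering records by the tuple [agblk, owner, offset] amounts to a lexicographic comparison, which can be sketched as below. This is an illustration, not the kernel's key comparison: `struct rmap_key`, `cmp_u64` and `rmap_key_cmp` are hypothetical names, and the field widths are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory key; field names and widths are illustrative. */
struct rmap_key {
	uint32_t agblk;		/* start block within the AG */
	uint64_t owner;		/* who owns the extent */
	uint64_t offset;	/* offset within the owner */
};

/* Three-way comparison of two unsigned values: -1, 0 or 1. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Compare agblk first, then owner, then offset. */
static int rmap_key_cmp(const struct rmap_key *a, const struct rmap_key *b)
{
	int c;

	assert(a && b);
	c = cmp_u64(a->agblk, b->agblk);
	if (c)
		return c;
	c = cmp_u64(a->owner, b->owner);
	if (c)
		return c;
	return cmp_u64(a->offset, b->offset);
}
```

Because owner and offset take part in the comparison, two records that start at the same block but belong to different owners are distinct keys, which is what lets one extent carry multiple owners.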
2016-08-03 | xfs: define the on-disk rmap btree format | Darrick J. Wong | 1 | -0/+184
Originally-From: Dave Chinner <[email protected]> Now we have all the surrounding call infrastructure in place, we can start filling out the rmap btree implementation. Start with the on-disk btree format; add everything needed to read, write and manipulate rmap btree blocks. This prepares the way for adding the btree operations implementation. [darrick: record owner and offset info in rmap btree] [darrick: fork, bmbt and unwritten state in rmap btree] [darrick: flags are a separate field in xfs_rmap_irec] [darrick: calculate maxlevels separately] [darrick: move the 'unwritten' bit into unused parts of rm_offset] Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]>