Age | Commit message (Collapse) | Author | Files | Lines |
|
The code for this helper is in space-info.c, move the prototype to
space-info.h.
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is actually embedded in struct btrfs_block_group, so move this
definition to block-group.h, and then open-code the init of the tree
where we init the rest of the block group instead of using a helper.
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is a block group related definition, move it into block-group.h.
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Anand Jain <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
There is a separate I/O failure tree to track the fail reads, so remove
the extra EXTENT_DAMAGED bit in the I/O tree as it's set but never used.
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: Qu Wenruo <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We're only initializing extent_io_tree's with a private data if we're a
normal inode, so we don't need this extra check.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We only use this for normal inodes, so don't set it if we're not a
normal inode.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Instead of taking up a whole argument to indicate we're clearing
everything in a range, simply add another EXTENT bit to control this,
and then update all the callers to drop this argument from the
clear_extent_bit variants.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
When trying to release the extent states due to memory pressure we'll
set all the bits except LOCKED, NODATASUM, and DELALLOC_NEW. This
includes some of the CTL bits, which isn't really a problem but isn't
correct either. Exclude the CTL bits from this clearing.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This was used as an optimization for count_range_bits(EXTENT_DIRTY),
which was used by the failed record code. However this was removed in
this series by patch "btrfs: convert the io_failure_tree to a plain
rb_tree" which was the last user of this optimization. Remove the
->dirty_bytes as nobody cares anymore.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Since commit 78361f64ff42 ("btrfs: remove unnecessary EXTENT_UPTODATE
state in buffered I/O path") we no longer check ->track_uptodate, remove
it.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We have two variants of lock/unlock extent, one set that takes a cached
state, another that does not. This is slightly annoying, and generally
speaking there are only a few places where we don't have a cached state.
Simplify this by making lock_extent/unlock_extent the only variant and
make it take a cached state, then convert all the callers appropriately.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
The only places that set extent_changeset is set_record_extent_bits,
everywhere else sets it to NULL. Drop this argument from
set_extent_bit.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is only used for internal locking related helpers, everybody else
just passes in NULL. I've changed set_extent_bit to __set_extent_bit
and made it static, removed failed_start from set_extent_bit and have it
call __set_extent_bit with a NULL failed_start, and I've moved some code
down below the now static __set_extent_bit.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is only used in the case that we are clearing EXTENT_LOCKED, so
infer this value from the bits passed in instead of taking it as an
argument.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is only ever set if we have EXTENT_LOCKED set, so simply push this
into the function itself and remove the function argument.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
These prototypes have nothing to do with the extent_io_tree helpers,
move them to their appropriate header.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We use rb_next/rb_prev and then get the entry for the adjacent items in
an extent io tree. We have helpers for this, so convert merge_state to
use next_state/prev_state and simplify the code.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Instead of doing the rb_entry again once we return from this function,
simply return the actual states themselves, and then clean up the only
user of this helper to handle states instead of nodes.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We use this to search for an extent state, or return the nodes we need
to insert a new extent state. This means we have the following pattern
node = tree_search_for_insert();
if (!node) {
/* alloc and insert. */
goto again;
}
state = rb_entry(node, struct extent_state, rb_node);
we don't use the node for anything else. Making
tree_search_for_insert() return the extent_state means we can drop the
rb_node and clean this up by eliminating the rb_entry.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We have a consistent pattern of
n = tree_search();
if (!n)
goto out;
state = rb_entry(n, struct extent_state, rb_node);
while (state) {
/* do something. */
}
which is a bit redundant. If we make tree_search return the state we
can simply have
state = tree_search();
while (state) {
/* do something. */
}
which cleans up the code quite a bit.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We can simplify a lot of these functions where we have to cycle through
extent_state's by simply using next_state() instead of rb_next(). In
many spots this allows us to do things like
while (state) {
/* whatever */
state = next_state(state);
}
instead of
while (1) {
state = rb_entry(n, struct extent_state, rb_node);
n = rb_next(n);
if (!n)
break;
}
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This existed when we overloaded the tree manipulation functions for both
the extent_io_tree and the extent buffer tree. However the extent
buffers are now stored in a radix tree, so we no longer need this
abstraction. Remove struct tree_entry and use extent_state directly
instead.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Now that we've moved everything we can unexport all the temporary
exports, move the random helpers, and mark everything as static again.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We no longer need to export this as all users are in extent-io-tree.c,
remove it from the header and put it into extent-io-tree.c.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is still huge, but unfortunately I cannot make it smaller without
renaming tree_search() and changing all the callers to use the new name,
then moving those chunks and then changing the name back. This feels
like too much churn for code movement, so I've limited this to only
things that called tree_search(). With this patch all of the
extent_io_tree code is now in extent-io-tree.c.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
These are the last few helpers that do not rely on tree_search() and
who's other helpers are exported and in extent-io-tree.c already. Move
these across now in order to make the core move smaller.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
In order to avoid moving all of the related code at once temporarily
export all of the extent state related helpers. Then move these helpers
into extent-io-tree.c. We will clean up the exports and make them
static in followup patches.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
A lot of the various internals of extent_io_tree call these two
functions for insert or searching the rb tree for entries, so
temporarily export them and then move them to extent-io-tree.c. We
can't move tree_search() without renaming it, and I don't want to
introduce a bunch of churn just to do that, so move these functions
first and then we can move a few big functions and then the remaining
users of tree_search().
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This helper is used by a lot of the core extent_io_tree helpers, so
temporarily export it and move it into extent-io-tree.c in order to make
it straightforward to migrate the helpers in batches.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is used by the subpage code in addition to lock_extent_bits, so
export it so we can move it out of extent_io.c
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
These are just variants and wrappers around the actual work horses of
the extent state. Extract these out of extent_io.c.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We only call these functions from the qgroup code which doesn't call
with EXTENT_BIT_LOCKED. These are BUG_ON()'s that exist to keep us
developers from using these functions with EXTENT_BIT_LOCKED, so convert
them to ASSERT()'s.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Start cleaning up extent_io.c by moving the extent state code out of it.
This patch starts with the extent state allocation code and the
extent_io_tree init code.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We're going to move this code in stages, but while we're doing that we
need to export these helpers so we can more easily move the code into
the new file.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Currently we have the add/del functions generic so that we can use them
for both extent buffers and extent states. We want to separate this
code however, so separate these helpers into per-object helpers in
anticipation of the split.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
In order to help separate the extent buffer from the extent io tree code
we need to break up the init functions.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Currently we're using find_first_extent_bit_state to check if our state
contains the given failrec range, however this is more of an internal
extent_io_tree helper, and is technically unsafe to use because we're
accessing the state outside of the extent_io_tree lock.
Instead use the normal helper find_first_extent_bit which returns the
range of the extent state we find in find_first_extent_bit_state and use
that to do our sanity checking.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
We still have this oddity of stashing the io_failure_record in the
extent state for the io_failure_tree, which is leftover from when we
used to stuff private pointers in extent_io_trees.
However this doesn't make a lot of sense for the io failure records, we
can simply use a normal rb_tree for this. This will allow us to further
simplify the extent_io_tree code by removing the io_failure_rec pointer
from the extent state.
Convert the io_failure_tree to an rb tree + spinlock in the inode, and
then use our rb tree simple helpers to insert and find failed records.
This greatly cleans up this code and makes it easier to separate out the
extent_io_tree code.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
These are internally used functions and are not used outside of
extent_io.c.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
This is exported, so rename it to btrfs_clean_io_failure. Additionally
we are passing in the io tree's and such from the inode, so instead of
doing all that simply pass in the inode itself and get all the
components we need directly inside of btrfs_clean_io_failure.
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
KCSAN reports that there's unlocked access mixed with locked access,
which is technically correct but is not a bug. To avoid false alerts at
least from KCSAN, add annotation and use a wrapper whenever ->full is
accessed for read outside of lock.
It is used as a fast check and only advisory. In the worst case the
block reserve is found !full and becomes full in the meantime, but
properly handled.
Depending on the value of ->full, btrfs_block_rsv_release decides
where to return the reservation, and block_rsv_release_bytes handles a
NULL pointer for block_rsv and if it's not NULL then it double checks
the full status under a lock.
Link: https://lore.kernel.org/linux-btrfs/CAAwBoOJDjei5Hnem155N_cJwiEkVwJYvgN-tQrwWbZQGhFU=cA@mail.gmail.com/
Link: https://lore.kernel.org/linux-btrfs/YvHU/[email protected]
Reported-by: Zygo Blaxell <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
At space-info.c:__reserve_bytes(), we increment the 'used' variable, but
then we don't use the variable anymore, making the increment pointless.
The increment became useless with commit 2e294c60497f29 ("btrfs: simplify
the logic in need_preemptive_flushing"), so just remove it.
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
btrfs_check_zoned_mode is really hard to follow, mostly due to the
fact that a lot of the checks use duplicate conditions after support
for zone emulation for conventional devices on file systems with the
ZONED flag was added. Fix this by factoring out the check for host
managed devices for !ZONED file systems into a separate helper and
then simplifying the rest of the code.
Reviewed-by: Naohiro Aota <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Add a missing 'r'. s/qgoup/qgroup/ . Codespell does not catch that for
some reason.
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
btrfs_bit_radix_cachep has been removed since
commit 45c06543afe2 ("Btrfs: remove unused btrfs_bit_radix slab"),
so remove it.
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Gaosheng Cui <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
btrfs_commit_transaction()
Btrfs qgroup has a long history of bringing performance penalty in
btrfs_commit_transaction().
Although we tried our best to migrate such impact, there is still an
unsolved call site, btrfs_drop_snapshot().
This function will find the highest shared tree block and modify its
extent ownership to do a subvolume/snapshot dropping.
Such change will affect the whole subtree, and cause tons of qgroup
dirty extents and stall btrfs_commit_transaction().
To avoid such problem, here we introduce a new sysfs interface,
/sys/fs/btrfs/<uuid>/qgroups/drop_subptree_threshold, to determine at
whether and at which level we should skip qgroup accounting for subtree
dropping.
The default value is BTRFS_MAX_LEVEL, thus every subtree drop will go
through qgroup accounting, to ensure qgroup numbers are kept as
consistent as possible.
While for performance sensitive cases, add a way to change the values to
more reasonable values like 3, to make any subtree, which is at or higher
than level 3, to mark qgroup inconsistent and skip the accounting.
The cost is obvious, the qgroup number is no longer consistent, but at
least performance is more reasonable, and users have the control.
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
accounting
The new flag will make btrfs qgroup skip all its time consuming
qgroup accounting.
The lifespan is the same as BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN,
only get cleared after a new rescan.
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Introduce a new runtime flag, BTRFS_QGROUP_RUNTIME_FLAG_CANCEL_RESCAN,
which will inform qgroup rescan to cancel its work asynchronously.
This is to address the window when an operation makes qgroup numbers
inconsistent (like qgroup inheriting) while a qgroup rescan is running.
In that case, qgroup inconsistent flag will be cleared when qgroup
rescan finishes.
But we changed the ownership of some extents, which means the rescan is
already meaningless, and the qgroup inconsistent flag should not be
cleared.
With the new flag, each time we set INCONSISTENT flag, we also set this
new flag to inform any running qgroup rescan to exit immediately, and
leaving the INCONSISTENT flag there.
The new runtime flag can only be cleared when a new rescan is started.
Signed-off-by: Qu Wenruo <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Currently we only have 3 qgroup flags:
- BTRFS_QGROUP_STATUS_FLAG_ON
- BTRFS_QGROUP_STATUS_FLAG_RESCAN
- BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT
These flags match the on-disk flags used in btrfs_qgroup_status.
But we're going to introduce extra runtime flags which will not reach
disks.
So here we introduce a new mask, BTRFS_QGROUP_STATUS_FLAGS_MASK, to
make sure only those flags can reach disks.
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Although we already have info kobject for each qgroup, we don't have
global qgroup info attributes to show things like enabled or
inconsistent status flags.
Add this qgroups attribute groups, and the first member is qgroup_flags,
which is a read-only attribute to show human readable qgroup flags.
The path is:
/sys/fs/btrfs/<uuid>/qgroups/enabled
/sys/fs/btrfs/<uuid>/qgroups/inconsistent
The output is simple, just 1 or 0.
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|