aboutsummaryrefslogtreecommitdiff
path: root/fs/btrfs
AgeCommit message (Collapse)AuthorFilesLines
2022-12-05btrfs: move inode prototypes to btrfs_inode.hJosef Bacik2-140/+136
I initially wanted to make a new header file for this, but these prototypes do naturally fit into btrfs_inode.h. If we want to extract vfs from pure btrfs code in the future we may need to split this up, but btrfs_inode embeds the vfs_inode, so it makes sense to put the prototypes in this header for now. Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move the printk and assert helpers to messages.cJosef Bacik3-347/+353
These helpers are core to btrfs, and in order to more easily sync various parts of the btrfs kernel code into btrfs-progs we need to be able to carry these helpers with us. However we want to have our own implementation for the helpers themselves, currently they're implemented in different files that we want to sync inside of btrfs-progs itself. Move these into their own C file, this will allow us to contain our overrides in btrfs-progs in it's own file without messing with the rest of the codebase. In copying things over I fixed up a few whitespace errors that already existed. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: add blk_types.h include to compression.hJosef Bacik1-0/+1
When moving the printk messages into their own file I got a compiler error because the includes grabbed compression.h, but nothing pulled in the blk_types.h dependency that compression.h has because it uses blkstatus_t. Add blk_types.h to compression.h so that this sort of thing doesn't happen in the future. Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: add dependencies to fs.h and block-rsv.hJosef Bacik2-0/+9
There's several structures that are embedded inside of fs_info.h, so if we don't have all the proper includes when we include fs.h we'll get a variety of compile errors. I fixed this by adding a temporary c file that just had #include "fs.h" and then added include files until the compiler stopped complaining. Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move btrfs_chunk_item_size out of ctree.hJosef Bacik2-7/+8
This is used by the volumes code and the tree checker code. We want to maintain inline however, so simply move it to volumes.h. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: convert discard stat defs to enumJosef Bacik1-3/+5
Do away with the defines and use an enum as it's cleaner. Suggested-by: Johannes Thumshirn <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: update function commentsDavid Sterba20-269/+289
Update, reformat or reword function comments. This also removes the kdoc marker so we don't get reports when the function name is missing. Changes made: - remove kdoc markers - reformat the brief description to be a proper sentence - reword to imperative voice - align parameter list - fix typos Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: remove unused btrfs_cond_migrate_bytesJosef Bacik2-28/+0
The last user of this was removed in 7f9fe6144076 ("btrfs: improve global reserve stealing logic"), drop this code as it's no longer called by anybody. Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: remove unused function prototypesJosef Bacik2-7/+0
I wrote the following coccinelle script to find function declarations that didn't have the corresponding code for them @funcproto@ identifier func; type T; position p0; @@ T func@p0(...); @funccode@ identifier funcproto.func; position p1; @@ func@p1(...) { ... } @script:python depends on !funccode@ p0 << funcproto.p0; @@ print("Proto with no function at %s:%s" % (p0[0].file, p0[0].line)) and ran it against btrfs, which identified the 4 function prototypes I've removed in this patch. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move root tree prototypes to their own headerJosef Bacik12-32/+44
Move all the root-tree.c prototypes to root-tree.h, and then update all the necessary files to include the new header. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: delete unused function prototypes in ctree.hJosef Bacik1-6/+0
This batch of prototypes no longer have code associated with them, so remove them. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move delalloc space related prototypes to delalloc-space.hJosef Bacik2-3/+3
These exist in delalloc-space.c, move them from ctree.h into delalloc-space.h. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move extent-tree helpers into their own header fileJosef Bacik17-72/+87
Move all the extent tree related prototypes to extent-tree.h out of ctree.h, and then go include it everywhere needed so everything compiles. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move btrfs_account_ro_block_groups_free_space into space-info.cJosef Bacik4-35/+35
This was prototyped in ctree.h and the code existed in extent-tree.c, but it's space-info related so move it into space-info.c. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: remove extra space info prototypes in ctree.hJosef Bacik1-3/+0
These are defined already in space-info.h, remove them from ctree.h. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: minor whitespace in ctree.hJosef Bacik1-2/+0
We've accumulated some whitespace problems in ctree.h, clean these up. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move the lockdep helpers into locking.hJosef Bacik2-76/+76
These more naturally fit in with the locking related code, and they're all defines so they can easily go anywhere, move them out of ctree.h into locking.h Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move btrfs_fs_info declarations into fs.hJosef Bacik2-658/+661
Now that we have a lot of the fs_info related helpers and stuff isolated, copy these over to fs.h out of ctree.h. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> [ reformat comments ] Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: extend btrfs_dir_item type to store encryption statusOmar Sandoval9-28/+43
For directories with encrypted files/filenames, we need to store a flag indicating this fact. There's no room in other fields, so we'll need to borrow a bit from dir_type. Since it's now a combination of type and flags, we rename it to dir_flags to reflect its new usage. The new flag, FT_ENCRYPTED, indicates a directory containing encrypted data, which is orthogonal to file type; therefore, add the new flag, and make conversion from directory type to file type strip the flag. As the file types almost never change we can afford to use the bits. Actual usage will be guarded behind an incompat bit, this patch only adds the support for later use by fscrypt. Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: Sweet Tea Dorminy <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: use struct fscrypt_str instead of struct qstrSweet Tea Dorminy12-118/+95
While struct qstr is more natural without fscrypt, since it's provided by dentries, struct fscrypt_str is provided by the fscrypt handlers processing dentries, and is thus more natural in the fscrypt world. Replace all of the struct qstr uses with struct fscrypt_str. Signed-off-by: Sweet Tea Dorminy <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: setup qstr from dentrys using fscrypt helperSweet Tea Dorminy4-57/+189
Most places where we get a struct qstr, we are doing so from a dentry. With fscrypt, the dentry's name may be encrypted on-disk, so fscrypt provides a helper to convert a dentry name to the appropriate disk name if necessary. Convert each of the dentry name accesses to use fscrypt_setup_filename(), then convert the resulting fscrypt_name back to an unencrypted qstr. This does not work for nokey names, but the specific locations that could spawn nokey names are noted. At present, since there are no encrypted directories, nothing goes down the filename encryption paths. Signed-off-by: Sweet Tea Dorminy <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: use struct qstr instead of name and namelen pairsSweet Tea Dorminy12-335/+287
Many functions throughout btrfs take name buffer and name length arguments. Most of these functions at the highest level are usually called with these arguments extracted from a supplied dentry's name. But the entire name can be passed instead, making each function a little more elegant. Each function whose arguments are currently the name and length extracted from a dentry is herein converted to instead take a pointer to the name in the dentry. The couple of calls to these calls without a struct dentry are converted to create an appropriate qstr to pass in. Additionally, every function which is only called with a name/len extracted directly from a qstr is also converted. This change has positive effect on stack consumption, frame of many functions is reduced but this will be used in the future for fscrypt related structures. Signed-off-by: Sweet Tea Dorminy <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: merge module cleanup sequence to one helperAnand Jain1-18/+10
The module exit function exit_btrfs_fs() is duplicating a section of code in init_btrfs_fs(). Add a helper to remove the duplicated code. Due to the init/exit section requirements the function must be inline and not a plain static as it could cause section mismatch. Signed-off-by: Anand Jain <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: sink gfp_t parameter to alloc_scrub_sectorDavid Sterba1-6/+6
All callers pas GFP_KERNEL as parameter so we can use it directly in alloc_scrub_sector. Reviewed-by: Anand Jain <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: switch GFP_NOFS to GFP_KERNEL in scrub_setup_recheck_blockDavid Sterba1-2/+2
There's only one caller that calls scrub_setup_recheck_block in the memalloc_nofs_save/_restore protection so it's effectively already GFP_NOFS and it's safe to use GFP_KERNEL. Reviewed-by: Anand Jain <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: sink gfp_t parameter to btrfs_qgroup_trace_extentDavid Sterba3-13/+9
All callers pass GFP_NOFS, we can drop the parameter and use it directly. Reviewed-by: Anand Jain <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: sink gfp_t parameter to btrfs_backref_iter_allocDavid Sterba3-6/+4
There's only one caller that passes GFP_NOFS, we can drop the parameter an use the flags directly. Reviewed-by: Anand Jain <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: remove temporary btrfs_map_token declaration in ctree.hJosef Bacik1-2/+0
This was added while I was moving this code to its new home, it can be removed now. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move accessor helpers into accessors.hJosef Bacik46-1091/+1112
This is a large patch, but because they're all macros it's impossible to split up. Simply copy all of the item accessors in ctree.h and paste them in accessors.h, and then update any files to include the header so everything compiles. Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> [ reformat comments, style fixups ] Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move btrfs_map_token to accessorsJosef Bacik6-14/+27
This is specific to the item-accessor code, move it out of ctree.h into accessor.h/.c and then update the users to include the new header file. This un-inlines btrfs_init_map_token, however this is only called once per function so it's not critical to be inlined. This also saves 904 bytes of code on a release build. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: rename struct-funcs.c to accessors.cJosef Bacik2-2/+1
Rename struct-funcs.c to accessors.c so we can move the item accessors out of ctree.h. accessors.c is a better description of the code that is contained in these files. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move the compat/incompat flag masks to fs.hJosef Bacik2-57/+57
This is fs wide information, move it out of ctree.h into fs.h. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: remove fs_info::pending_changes and related codeJosef Bacik4-54/+1
Now that we're not using this code anywhere we can remove it as well as the member from fs_info. We don't have any mount options or on/off features that would utilize the pending infrastructure, the last one was inode_cache. There was a patchset [1] to enable some features from sysfs that would break things if it would be set immediately. In case we'll need that kind of logic again the patch can be reverted, but for the current use it can be replaced by the single state bit to do the commit. [1] https://lore.kernel.org/linux-btrfs/[email protected]/ Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> [ add note ] Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: add a BTRFS_FS_NEED_TRANS_COMMIT flagJosef Bacik4-3/+9
Currently we are only using fs_info->pending_changes to indicate that we need a transaction commit. The original users for this were removed years ago and we don't have more usage in sight, so this is the only remaining reason to have this field. Add a flag so we can remove this code. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move fs_info::flags enum to fs.hJosef Bacik4-68/+70
These definitions are fs wide, take them out of ctree.h and put them in fs.h. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move mount option definitions to fs.hJosef Bacik6-63/+67
These are fs wide definitions and helpers, move them out of ctree.h and into fs.h. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: convert incompat and compat flag test helpers to macrosJosef Bacik1-14/+6
These helpers use functions not defined in fs.h, they're simply accessors of the super block in fs_info, convert them to macros so that we don't have a weird dependency between fs.h and accessors.h. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.hJosef Bacik15-52/+68
We're going to use fs.h to hold fs wide related helpers and definitions, move the FS_STATE enum and related helpers to fs.h, and then update all files that need these definitions to include fs.h. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: push printk index code into their respective helpersJosef Bacik2-28/+11
The printk index work can be pushed into the printk helpers themselves, this allows us to further sanitize messages.h, removing the last include in the header itself. Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move the printk helpers out of ctree.hJosef Bacik36-250/+293
We have a bunch of printk helpers that are in ctree.h. These have nothing to do with ctree.c, so move them into their own header. Subsequent patches will cleanup the printk helpers. Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move assert helpers out of ctree.hJosef Bacik2-15/+17
These call functions that aren't defined in, or will be moved out of, ctree.h Move them to super.c where the other assert/error message code is defined. Drop the __noreturn attribute for btrfs_assertfail as objtool does not like it and fails with warnings like fs/btrfs/dir-item.o: warning: objtool: .text.unlikely: unexpected end of section fs/btrfs/xattr.o: warning: objtool: btrfs_setxattr() falls through to next function btrfs_setxattr_trans.cold() fs/btrfs/xattr.o: warning: objtool: .text.unlikely: unexpected end of section Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: move fs wide helpers out of ctree.hJosef Bacik26-165/+200
We have several fs wide related helpers in ctree.h. The bulk of these are the incompat flag test helpers, but there are things such as btrfs_fs_closing() and the read only helpers that also aren't directly related to the ctree code. Move these into a fs.h header, which will serve as the location for file system wide related helpers. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: send add define for v2 buffer sizeWang Yugui2-3/+5
Add a define for the data buffer size (though the maximum size is not limited by it) BTRFS_SEND_BUF_SIZE_V2 so it's more visible. Signed-off-by: Wang Yugui <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: simplify generation check in btrfs_get_dentryDavid Sterba3-9/+19
Callers that pass non-zero generation always want to perform the generation check, we can simply encode that in one parameter and drop check_generation. Add function documentation. Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: auto enable discard=async when possibleDavid Sterba5-0/+22
There's a request to automatically enable async discard for capable devices. We can do that, the async mode is designed to wait for larger freed extents and is not intrusive, with limits to iops, kbps or latency. The status and tunables will be exported in /sys/fs/btrfs/FSID/discard . The automatic selection is done if there's at least one discard capable device in the filesystem (not capable devices are skipped). Mounting with any other discard option will honor that option, notably mounting with nodiscard will keep it disabled. Link: https://lore.kernel.org/linux-btrfs/CAEg-Je_b1YtdsCR0zS5XZ_SbvJgN70ezwvRwLiCZgDGLbeMB=w@mail.gmail.com/ Reviewed-by: Boris Burkov <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: sysfs: convert remaining scnprintf to sysfs_emitDavid Sterba1-3/+3
The sysfs_emit is the safe API for writing to the sysfs files, previously converted from scnprintf, there's one left to do in btrfs_read_policy_show. Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: do not panic if we can't allocate a prealloc extent stateJosef Bacik1-8/+14
We sometimes have to allocate new extent states when clearing or setting new bits in an extent io tree. Generally we preallocate this before taking the tree spin lock, but we can use this preallocated extent state sometimes and then need to try to do a GFP_ATOMIC allocation under the lock. Unfortunately sometimes this fails, and then we hit the BUG_ON() and bring the box down. This happens roughly 20 times a week in our fleet. However the vast majority of callers use GFP_NOFS, which means that if this GFP_ATOMIC allocation fails, we could simply drop the spin lock, go back and allocate a new extent state with our given gfp mask, and begin again from where we left off. For the remaining callers that do not use GFP_NOFS, they are generally using GFP_NOWAIT, which still allows for some reclaim. So allow these allocations to attempt to happen outside of the spin lock so we don't need to rely on GFP_ATOMIC allocations. This in essence creates an infinite loop for anything that isn't GFP_NOFS. To address this we may want to migrate to using mempools for extent states so that we will always have emergency reserves in order to make our allocations. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: remove unused unlock_extent_atomicJosef Bacik1-7/+0
As of "btrfs: do not use GFP_ATOMIC in the read endio" we no longer have any users of unlock_extent_atomic, remove it. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: do not use GFP_ATOMIC in the read endioJosef Bacik1-4/+4
We have done read endio in an async thread for a very, very long time, which makes the use of GFP_ATOMIC and unlock_extent_atomic() unneeded in our read endio path. We've noticed under heavy memory pressure in our fleet that we can fail these allocations, and then often trip a BUG_ON(!allocation), which isn't an ideal outcome. Begin to address this by simply not using GFP_ATOMIC, which will allow us to do things like actually allocate a extent state when doing set_extent_bits(UPTODATE) in the endio handler. End io handlers are not called in atomic context, besides we have been allocating failrec with GFP_NOFS so we'd notice there's a problem. Signed-off-by: Josef Bacik <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-12-05btrfs: skip update of block group item if used bytes are the sameQu Wenruo2-1/+33
[BACKGROUND] When committing a transaction, we will update block group items for all dirty block groups. But in fact, dirty block groups don't always need to update their block group items. It's pretty common to have a metadata block group which experienced several COW operations, but still have the same amount of used bytes. In that case, we may unnecessarily COW a tree block doing nothing. [ENHANCEMENT] This patch will introduce btrfs_block_group::commit_used member to remember the last used bytes, and use that new member to skip unnecessary block group item update. This would be more common for large filesystems, where metadata block group can be as large as 1GiB, containing at most 64K metadata items. In that case, if COW added and then deleted one metadata item near the end of the block group, then it's completely possible we don't need to touch the block group item at all. [BENCHMARK] The change itself can have quite a high chance (20~80%) to skip block group item updates in lot of workloads. As a result, it would result shorter time spent on btrfs_write_dirty_block_groups(), and overall reduce the execution time of the critical section of btrfs_commit_transaction(). Here comes a fio command, which will do random writes in 4K block size, causing a very heavy metadata updates. fio --filename=$mnt/file --size=512M --rw=randwrite --direct=1 --bs=4k \ --ioengine=libaio --iodepth=64 --runtime=300 --numjobs=4 \ --name=random_write --fallocate=none --time_based --fsync_on_close=1 The file size (512M) and number of threads (4) means 2GiB file size in total, but during the full 300s run time, my dedicated SATA SSD is able to write around 20~25GiB, which is over 10 times the file size. Thus after we fill the initial 2G, we should not cause much block group item updates. Please note, the fio numbers by themselves don't have much change, but if we look deeper, there is some reduced execution time, especially for the critical section of btrfs_commit_transaction(). I added extra trace_printk() to measure the following per-transaction execution time: - Critical section of btrfs_commit_transaction() By re-using the existing update_commit_stats() function, which has already calculated the interval correctly. - The while() loop for btrfs_write_dirty_block_groups() Although this includes the execution time of btrfs_run_delayed_refs(), it should still be representative overall. Both result involves transid 7~30, the same amount of transaction committed. The result looks like this: | Before | After | Diff ----------------------+-------------------+----------------+-------- Transaction interval | 229247198.5 | 215016933.6 | -6.2% Block group interval | 23133.33333 | 18970.83333 | -18.0% The change in block group item updates is more obvious, as skipped block group item updates also mean less delayed refs. And the overall execution time for that block group update loop is pretty small, thus we can assume the extent tree is already mostly cached. If we can skip an uncached tree block, it would cause more obvious change. Unfortunately the overall reduction in commit transaction critical section is much smaller, as the block group item updates loop is not really the major part, at least not for the above fio script. But still we have a observable reduction in the critical section. Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]>