aboutsummaryrefslogtreecommitdiff
path: root/fs/bcachefs/buckets.h
AgeCommit message (Collapse)AuthorFilesLines
2024-11-07bcachefs: Fix null ptr deref in bucket_gen_get()Kent Overstreet1-8/+11
bucket_gen() checks if we're lookup up a valid bucket and returns NULL otherwise, but bucket_gen_get() was failing to check; other callers were correct. Also do a bit of cleanup on callers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18bcachefs: bch2_folio_reservation_get_partial() is now better behavedKent Overstreet1-5/+7
bch2_folio_reservation_get_partial(), on partial success, will now return a reservation that's aligned to the filesystem blocksize. This is a partial fix for fstests generic/299 - fio verify is badly behaved in the presence of short writes that aren't aligned to its blocksize. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09bcachefs: Switch gc bucket array to a genradixKent Overstreet1-14/+1
A user with a 30 tb device is overflowing the INT_MAX limit on vmalloc allocations... Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-09bcachefs: improve bch2_dev_usage_to_text()Kent Overstreet1-1/+1
Add a line for capacity Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Reduce the scope of gc_lockKent Overstreet1-2/+2
gc_lock is now only for synchronization between check_alloc_info and interior btree updates - nothing else Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Convert gc to new accountingKent Overstreet1-16/+0
Rewrite fsck/gc for the new accounting scheme. This adds a second set of in-memory accounting counters for gc to use; like with other parts of gc we run all trigger in TRIGGER_GC mode, then compare what we calculated to existing in-memory accounting at the end. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Kill fs_usage_onlineKent Overstreet1-12/+0
More dead code deletion. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Kill bch2_fs_usage_to_text()Kent Overstreet1-3/+0
Dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Delete journal-buf-sharded old style accountingKent Overstreet1-4/+0
More deletion of dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: kill bch2_fs_usage_read()Kent Overstreet1-2/+0
With bch2_ioctl_fs_usage(), this is now dead code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Kill bch2_fs_usage_initialize()Kent Overstreet1-2/+0
Deleting code for the old disk accounting scheme. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Disk space accounting rewriteKent Overstreet1-25/+1
Main part of the disk accounting rewrite. This is a wholesale rewrite of the existing disk space accounting, which relies on percepu counters that are sharded by journal buffer, and rolled up and added to each journal write. With the new scheme, every set of counters is a distinct key in the accounting btree; this fixes scaling limitations of the old scheme, where counters took up space in each journal entry and required multiple percpu counters. Now, in memory accounting requires a single set of percpu counters - not multiple for each in flight journal buffer - and in the future we'll probably also have counters that don't use in memory percpu counters, they're not strictly required. An accounting update is now a normal btree update, using the btree write buffer path. At transaction commit time, we apply accounting updates to the in memory counters, which are percpu counters indexed in an eytzinger tree by the accounting key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14bcachefs: Use try_cmpxchg() family of functions instead of cmpxchg()Uros Bizjak1-2/+2
Use try_cmpxchg() family of functions instead of cmpxchg (*ptr, old, new) == old. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg() implicitly assigns old *ptr value to "old" when cmpxchg fails. There is no need to re-read the value in the loop. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Brian Foster <bfoster@redhat.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-10bcachefs: Fix RCU splatKent Overstreet1-0/+8
Reported-by: syzbot+e74fea078710bbca6f4b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-10bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()Kent Overstreet1-4/+7
Turn more asserts into proper recoverable error paths. Reported-by: syzbot+246b47da27f8e7e7d6fb@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-10bcachefs: Replace bucket_valid() asserts in bucket lookup with proper checksKent Overstreet1-2/+4
The bucket_gens array and gc_buckets array known their own size; we should be using those members, and returning an error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: ptr_stale() -> dev_ptr_stale()Kent Overstreet1-6/+8
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: PTR_BUCKET_POS() now takes bch_devKent Overstreet1-7/+3
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_bucket_ref_update() now takes bch_devKent Overstreet1-2/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_check_alloc_key() -> bch2_dev_tryget_noerror()Kent Overstreet1-5/+0
More elimination of bch2_dev_bkey_exists() usage. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Kill opts.buckets_nouseKent Overstreet1-0/+3
Now explicitly allocate and free the buckets_nouse bitmap - this is going to be used for online fsck. To go RW when we haven't check allocations, we'll do a much slimmed down version that just initializes the buckets_nouse bitmaps. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: kill bch2_dev_usage_update_m()Kent Overstreet1-2/+0
by using bucket_m_to_alloc() more, we can get some nice code cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: Run bch2_check_fix_ptrs() via triggersKent Overstreet1-0/+4
Currently, the reflink_p gc trigger does repair as well - turning a reflink_p key into an error key if the reflink_v it points to doesn't exist. This won't work with online check/repair, because the repair path once online will be subject to transaction restarts, but BTREE_TRIGGER_gc is not idempotant - we can't run it multiple times if we get a transaction restart. So we need to split these paths; to do so this patch calls check_fix_ptrs() by a new general path - a new trigger type, BTREE_TRIGGER_check_repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bch2_bucket_ref_update()Kent Overstreet1-3/+3
If we hit an inconsistency when updating allocation information, we don't want to fail the update if it's for a deletion - only if it's for a new key. Rename check_bucket_ref() -> bucket_ref_update() so we can centralize the logic to do this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: member helper cleanupsKent Overstreet1-2/+2
Some renaming for better consistency bch2_member_exists -> bch2_member_alive bch2_dev_exists -> bch2_member_exists bch2_dev_exsits2 -> bch2_dev_exists bch_dev_locked -> bch2_dev_locked bch_dev_bkey_exists -> bch2_dev_bkey_exists new helper - bch2_dev_safe Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: bucket_valid()Kent Overstreet1-5/+9
cut out a branch from doing it the obvious way Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: iter/update/trigger/str_hash flag cleanupKent Overstreet1-7/+12
Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08bcachefs: mark_superblock cleanupKent Overstreet1-7/+4
Consolidate mark_superblock() and trans_mark_superblock(), like we did with the other trigger paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-13bcachefs: Standardize helpers for printing enum strs with bounds checksKent Overstreet1-8/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-01bcachefs: BCH_WATERMARK_interior_updatesKent Overstreet1-0/+1
This adds a new watermark, higher priority than BCH_WATERMARK_reclaim, for interior btree updates. We've seen a deadlock where journal replay triggers a ton of btree node merges, and these use up all available open buckets and then interior updates get stuck. One cause of this is that we're currently lacking btree node merging on write buffer btrees - that needs to be fixed as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: bch2_trans_account_disk_usage_change()Kent Overstreet1-0/+2
The disk space accounting rewrite is splitting out accounting for each replicas set - those are moving to btree keys, instead of percpu counters. This breaks bch2_trans_fs_usage_apply() up, splitting out the part we will still need. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21bcachefs: helpers for printing data typesKent Overstreet1-0/+15
We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: unify extent triggerKent Overstreet1-3/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: move stripe triggers to ec.cKent Overstreet1-3/+9
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: move bch2_mark_alloc() to alloc_background.cKent Overstreet1-2/+4
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: unify reservation triggerKent Overstreet1-2/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: kill mem_trigger_run_overwrite_then_insert()Kent Overstreet1-4/+1
now that type signatures are unified, redundant Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: mark now takes bkey_sKent Overstreet1-6/+6
Prep work for disk space accounting rewrite: we're going to want to use a single callback for both of our current triggers, so we need to change them to have the same type signature first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05bcachefs: trans_mark now takes bkey_sKent Overstreet1-4/+4
Prep work for disk space accounting rewrite: we're going to want to use a single callback for both of our current triggers, so we need to change them to have the same type signature first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Move reflink_p triggers into reflink.cKent Overstreet1-4/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: bch2_dev_usage_to_text()Kent Overstreet1-0/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01bcachefs: Don't use update_cached_sectors() in bch2_mark_alloc()Kent Overstreet1-0/+3
bch2_update_cached_sectors_list() is closer to how the new disk space accounting works, called from trans_mark(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: All triggers are BTREE_TRIGGER_WANTS_OLD_AND_NEWKent Overstreet1-0/+14
Upcoming rebalance_work btree will require extent triggers to be BTREE_TRIGGER_WANTS_OLD_AND_NEW - so to reduce potential confusion, let's just make all triggers BTREE_TRIGGER_WANTS_OLD_AND_NEW. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31bcachefs: Ensure devices are always correctly initializedKent Overstreet1-0/+1
We can't mark device superblocks or allocate journal on a device that isn't online. That means we may need to do this on every mount, because we may have formatted a new filesystem and then done the first mount (bch2_fs_initialize()) in degraded mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: bucket_lock() is now a sleepable lockKent Overstreet1-2/+5
fsck_err() may sleep - it takes a mutex and may allocate memory, so bucket_lock() needs to be a sleepable lock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Change bucket_lock() to use bit_spin_lock()Kent Overstreet1-3/+30
bucket_lock() previously open coded a spinlock, because we need to cram a spinlock into a single byte. But it turns out not all archs support xchg() on a single byte; since we need struct bucket to be small, this means we have to play fun games with casts and ifdefs for endianness. This fixes building on 32 bit arm, and likely other architectures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Cc: linux-bcachefs@vger.kernel.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: Remove undefined behavior in bch2_dev_buckets_reserved()Josh Poimboeuf1-1/+1
In general it's a good idea to avoid using bare unreachable() because it introduces undefined behavior in compiled code. In this case it even confuses GCC into emitting an empty unused bch2_dev_buckets_reserved.part.0() function. Use BUG() instead, which is nice and defined. While in theory it should never trigger, if something were to go awry and the BCH_WATERMARK_NR case were to actually hit, the failure mode is much more robust. Fixes the following warnings: vmlinux.o: warning: objtool: bch2_bucket_alloc_trans() falls through to next function bch2_reset_alloc_cursors() vmlinux.o: warning: objtool: bch2_dev_buckets_reserved.part.0() is missing an ELF size annotation Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: sb-members.cKent Overstreet1-1/+46
Split out a new file for bch_sb_field_members - we'll likely want to move more code here in the future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: move inode triggers to inode.cKent Overstreet1-3/+14
bit of reorg Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22bcachefs: BCH_WATERMARK_reclaimKent Overstreet1-0/+1
Add another watermark for journal reclaim - this is needed for the next patches, that unify BCH_WATERMARK with JOURNAL_WATERMARK. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>