Age | Commit message | Author | Files | Lines
2024-03-13 | bcachefs: bch2_print_opts() | Kent Overstreet | 3 | -6/+27
Make sure early error messages get redirected, for kernel-fsck-from-userland. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Improve error messages in device remove path | Kent Overstreet | 1 | -5/+5
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Use kvzalloc() when dynamically allocating btree paths | Kent Overstreet | 1 | -2/+2
This silences a mm/page_alloc.c warning about allocating more than a page with GFP_NOFAIL - and there's no reason for this to not have a vmalloc fallback anyways. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Track iter->ip_allocated at bch2_trans_copy_iter() | Kent Overstreet | 1 | -0/+3
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Save key_cache_path in peek_slot() | Kent Overstreet | 1 | -0/+1
When bch2_btree_iter_peek_slot() clones the iterator to search for the next key, and then discovers that the key from the cloned iterator is the key we want to return, we also want to save iter->key_cache_path for the update path. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Pin btree cache in ram for random access in fsck | Kent Overstreet | 5 | -91/+72
Various phases of fsck involve checking references from one btree to another: this means doing a sequential scan of one btree, and then mostly random access into the second. This is particularly painful for checking extents <-> backpointers; we can prefetch btree node access on the sequential scan, but not on the random access portion, and this is particularly painful on spinning rust, where we'd like to keep the pipeline fairly full of btree node reads so that the elevator can reduce seeking. This patch implements prefetching and pinning of the portion of the btree that we'll be doing random access to. We already calculate how much of the random access btree will fit in memory so it's a fairly straightforward change. This will put more pressure on system memory usage, so we introduce a new option, fsck_memory_usage_percent, which is the percentage of total system ram that fsck is allowed to pin. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Check for subvolume children when deleting subvolumes | Kent Overstreet | 5 | -9/+32
Recursively destroying subvolumes isn't allowed yet. Fixes: https://github.com/koverstreet/bcachefs/issues/634 Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: BTREE_ID_subvolume_children | Kent Overstreet | 7 | -3/+114
Add a btree to record parent -> child subvolume relationships, according to the filesystem hierarchy. The subvolume_children btree is a bitset btree: if a bit is set at pos p, that means p.offset is a child of subvolume p.inode. This will be used for efficiently listing subvolumes, as well as recursive deletion. Signed-off-by: Kent Overstreet <[email protected]>
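The bitset encoding described above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the bcachefs implementation: `struct pos`, `struct bitset`, `bit_mod()`, `bit_test()` and `nr_children()` are hypothetical names, and a small sorted array stands in for the btree. It only shows why keying on (parent, child) makes "does this subvolume have children?" a short range scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a btree position: only the two fields the
 * commit message mentions. */
struct pos {
	uint64_t inode;		/* parent subvolume ID */
	uint64_t offset;	/* child subvolume ID */
};

#define MAX_BITS 64

/* Toy "bitset btree": a sorted array holding every position whose bit is set. */
struct bitset {
	struct pos set[MAX_BITS];
	size_t nr;
};

static int pos_cmp(struct pos a, struct pos b)
{
	if (a.inode != b.inode)
		return a.inode < b.inode ? -1 : 1;
	if (a.offset != b.offset)
		return a.offset < b.offset ? -1 : 1;
	return 0;
}

/* Index of the first entry that is >= p (binary search). */
static size_t lower_bound(const struct bitset *b, struct pos p)
{
	size_t lo = 0, hi = b->nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (pos_cmp(b->set[mid], p) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

static bool bit_test(const struct bitset *b, struct pos p)
{
	size_t i = lower_bound(b, p);

	return i < b->nr && pos_cmp(b->set[i], p) == 0;
}

/* Set or clear the bit at p, keeping the array sorted. */
static void bit_mod(struct bitset *b, struct pos p, bool set)
{
	size_t i = lower_bound(b, p);
	bool present = i < b->nr && pos_cmp(b->set[i], p) == 0;

	if (set && !present) {
		assert(b->nr < MAX_BITS);
		memmove(&b->set[i + 1], &b->set[i],
			(b->nr - i) * sizeof(b->set[0]));
		b->set[i] = p;
		b->nr++;
	} else if (!set && present) {
		memmove(&b->set[i], &b->set[i + 1],
			(b->nr - i - 1) * sizeof(b->set[0]));
		b->nr--;
	}
}

/* Count the children of one parent: a range scan starting at (parent, 0). */
static size_t nr_children(const struct bitset *b, uint64_t parent)
{
	size_t i = lower_bound(b, (struct pos) { parent, 0 });
	size_t n = 0;

	while (i < b->nr && b->set[i].inode == parent) {
		i++;
		n++;
	}
	return n;
}
```

In this sketch, a deletion check of the kind the later "Check for subvolume children when deleting subvolumes" commit describes would reduce to testing `nr_children(&b, id) != 0`.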
2024-03-13 | bcachefs: bch_subvolume::fs_path_parent | Kent Overstreet | 7 | -13/+88
Record the filesystem path hierarchy for subvolumes in bch_subvolume. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: bch2_btree_bit_mod() | Kent Overstreet | 2 | -0/+22
Provide a non-write buffer version of bch2_btree_bit_mod_buffered(), for the subvolume children btree. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: bch2_btree_bit_mod -> bch2_btree_bit_mod_buffered | Kent Overstreet | 5 | -9/+11
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Correctly reattach subvolumes | Kent Overstreet | 3 | -10/+28
Subvolumes need special handling to reattach - we always reattach them in the root subvolume's lost+found, and they need a slightly different kind of dirent. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: check_path() now prints full inode when reattaching | Kent Overstreet | 1 | -8/+18
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Pass inode bkey to check_path() | Kent Overstreet | 2 | -29/+40
prep work for improving logging/error messages Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Fix path where dirent -> subvol missing and we don't fix | Kent Overstreet | 1 | -4/+9
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: bch_subvolume::parent -> creation_parent | Kent Overstreet | 2 | -13/+13
bit of renaming, prep for adding a fs path parent Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Repair subvol dirents that point to non subvols | Kent Overstreet | 1 | -0/+6
when repair switches d_type to or from DT_SUBVOL, we need to update the target accordingly Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: check dirent->d_parent_subvol | Kent Overstreet | 1 | -4/+57
Check that d_parent_subvol makes sense - the dirent's snapshot must be visible in d_parent_subvol (i.e. an ancestor of d_parent_subvol's snapshot) in order to be visible. Signed-off-by: Kent Overstreet <[email protected]>
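The ancestor relationship mentioned above can be pictured as a walk up a snapshot tree. This is only a sketch under stated assumptions, not the kernel's code: the parent-array layout, the ID 0 meaning "no parent", the `NR_SNAPSHOTS` bound and the function name `snapshot_is_ancestor_sketch()` are all made up for illustration, and the sketch treats a snapshot as its own ancestor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SNAPSHOTS 8

/*
 * parent[id] is the parent snapshot of id; 0 means "no parent".
 * Returns true if 'ancestor' is 'id' itself or lies on the path from
 * 'id' up to the root of the snapshot tree.
 */
static bool snapshot_is_ancestor_sketch(const uint32_t *parent,
					uint32_t id, uint32_t ancestor)
{
	while (id && id != ancestor) {
		assert(id < NR_SNAPSHOTS);
		id = parent[id];
	}
	return ancestor != 0 && id == ancestor;
}
```

Under these assumptions, the check in the commit message would correspond to asking whether the dirent's snapshot is an ancestor of the snapshot belonging to d_parent_subvol.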
2024-03-13 | bcachefs: check inode->bi_parent_subvol against dirent | Kent Overstreet | 2 | -23/+14
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: delete duplicated checks in check_dirent_to_subvol() | Kent Overstreet | 1 | -23/+4
these were already checked in check_subvol() Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: simplify check_dirent_inode_dirent() | Kent Overstreet | 1 | -58/+56
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: check bi_parent_subvol in check_inode() | Kent Overstreet | 1 | -0/+10
check for inodes with a nonzero bi_parent_subvol field that aren't actually subvolume roots Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: better log message in lookup_inode_for_snapshot() | Kent Overstreet | 1 | -21/+24
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: check_inode_dirent_inode() | Kent Overstreet | 1 | -36/+89
check that if an inode has a backpointer, the dirent it points to points back to it. We do this in check_dirent_inode_dirent(), but only for inodes that have dirents that point to them - we also have to do the check starting from the inode to catch inodes that don't have dirents that point to them. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Check subvol <-> inode pointers in check_inode() | Kent Overstreet | 1 | -0/+25
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Check subvol <-> inode pointers in check_subvol() | Kent Overstreet | 3 | -1/+34
Subvolumes and subvolume root inodes point to each other: this verifies the subvolume -> inode -> subvolume path. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: Kill more -EIO error codes | Kent Overstreet | 11 | -20/+27
This converts -EIOs related to btree node errors to private error codes, which will help with some ongoing debugging by giving us better error messages. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_file: add f_ops.flush | Kent Overstreet | 3 | -7/+32
Add a flush op, to return the exit code via close(). Also update bcachefs usage to use this to return fsck exit codes. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_file: Fix missing va_end() | Kent Overstreet | 1 | -0/+2
Fixes: https://lore.kernel.org/linux-bcachefs/202402131603.E953E2CF@keescook/T/#u Reported-by: coverity scan Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_file: allow ioctls against these files | Darrick J. Wong | 2 | -0/+13
Make it so that a thread_with_stdio user can handle ioctls against the file descriptor. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_file: create ops structure for thread_with_stdio | Darrick J. Wong | 3 | -22/+28
Create an ops structure so we can add more file-based functionality in the next few patches. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_file: fix various printf problems | Darrick J. Wong | 2 | -20/+37
Experimentally fix some problems with stdio_redirect_vprintf by creating a MOO variant with which we can experiment. We can't do a GFP_KERNEL allocation while holding the spinlock, and I don't like how the printf function can silently truncate the output if memory allocation fails. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_file: allow creation of readonly files | Darrick J. Wong | 2 | -0/+39
Create a new run_thread_with_stdout function that opens a file in O_RDONLY mode so that the kernel can write things to userspace but userspace cannot write to the kernel. This will be used to convey xfs health event information to userspace. Signed-off-by: Darrick J. Wong <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_stdio: suppress hung task warning | Kent Overstreet | 1 | -2/+15
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | kernel/hung_task.c: export sysctl_hung_task_timeout_secs | Kent Overstreet | 1 | -0/+1
needed for thread_with_file; also rare but not unheard of to need this in module code, when blocking on user input. one workaround used by some code is wait_event_interruptible() - but that can be buggy if the outer context isn't expecting unwinding. Signed-off-by: Kent Overstreet <[email protected]> Cc: Andrew Morton <[email protected]> Cc: fuyuanli <[email protected]>
2024-03-13 | bcachefs: thread_with_stdio: Mark completed in ->release() | Kent Overstreet | 1 | -4/+10
This fixes stdio_redirect_read() getting stuck, not noticing that the pipe has been closed. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | of: unittest: Use for_each_child_of_node_scoped() | Jonathan Cameron | 1 | -8/+3
A simple example of the utility of this autocleanup approach to handling of_node_put(). In this particular case some of the nodes needed for the test are not available and the _available_ version would cause them to be skipped resulting in a test failure. Signed-off-by: Jonathan Cameron <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
2024-03-13 | of: Introduce for_each_*_child_of_node_scoped() to automate of_node_put() handling | Jonathan Cameron | 1 | -0/+13
To avoid issues with out of order cleanup, or ambiguity about when the auto freed data is first instantiated, do it within the for loop definition. The disadvantage is that the struct device_node *child variable creation is not immediately obvious where this is used. However, in many cases, if there is another definition of struct device_node *child; the compiler / static analysers will notify us that it is unused, or uninitialized. Note that, in the vast majority of cases, the _available_ form should be used and as code is converted to these scoped handlers, we should confirm that any cases that do not check for available have a good reason not to. Signed-off-by: Jonathan Cameron <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
2024-03-13 | of: Add cleanup.h based auto release via __free(device_node) markings | Jonathan Cameron | 1 | -0/+2
The recent addition of scope based cleanup support to the kernel provides a convenient tool to reduce the chances of leaking reference counts where of_node_put() should have been called in an error path. This enables

    struct device_node *child __free(device_node) = NULL;

    for_each_child_of_node(np, child) {
        if (test)
            return test;
    }

with no need for a manual call of of_node_put(). A following patch will reduce the scope of the child variable to the for loop, to avoid issues with ordering of autocleanup, and make it obvious when this is assigned a non NULL value. In this simple example the gains are small but there are some very complex error handling cases buried in these loops that will be greatly simplified by enabling early returns without the need for this manual of_node_put() call. Note that there are coccinelle checks in scripts/coccinelle/iterators/for_each_child.cocci to detect a failure to call of_node_put(). This new approach does not cause false positives. Longer term we may want to add scripting to check this new approach is done correctly with no double of_node_put() calls being introduced due to the auto cleanup. It may also be useful to script finding places this new approach is useful. Signed-off-by: Jonathan Cameron <[email protected]> Reviewed-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
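The early-return pattern this message describes can be tried in userspace with the GCC/Clang `cleanup` variable attribute, which is the compiler feature that scope based cleanup relies on. The sketch below is an illustration only: `struct node`, `node_get()`, `node_put()`, the `auto_put` macro and `first_marked()` are hypothetical stand-ins, not the kernel's device_node API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for struct device_node. */
struct node {
	int refcount;
	int marked;
};

static struct node *node_get(struct node *n)
{
	n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/* Cleanup handlers receive a pointer to the annotated variable. */
static void node_put_cleanup(struct node **np)
{
	if (*np)
		node_put(*np);
}

/* Hypothetical shorthand, loosely analogous to __free(device_node). */
#define auto_put __attribute__((cleanup(node_put_cleanup)))

/* Return the index of the first marked node, or -1 if there is none. */
static int first_marked(struct node *nodes, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		/* Declared inside the loop body, so the reference is dropped
		 * at the end of every iteration and on the early return. */
		struct node *child auto_put = node_get(&nodes[i]);

		if (child->marked)
			return (int) i;	/* no manual node_put() needed */
	}
	return -1;
}
```

Declaring the variable inside the loop mirrors the motivation given for the follow-up scoped iterators: the lifetime of the reference is exactly one iteration, so there is no question about cleanup ordering.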
2024-03-13 | bcachefs: Thread with file documentation | Kent Overstreet | 2 | -7/+40
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_stdio: fix bch2_stdio_redirect_readline() | Kent Overstreet | 1 | -11/+22
This fixes a bug where we'd return data without waiting for a newline, if data was present but a newline was not. Signed-off-by: Kent Overstreet <[email protected]>
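The behaviour this fix describes, returning nothing until a full line has arrived, can be sketched in userspace as follows. This is not the bcachefs code: `struct linebuf`, `complete_line_len()` and `linebuf_readline()` are hypothetical names, and a return value of 0 stands in for "the caller has to keep waiting".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical input buffer: bytes received so far, not yet consumed. */
struct linebuf {
	char data[256];
	size_t len;
};

/* Length of the first complete line including its '\n', or 0 if the
 * buffered data does not contain a newline yet. */
static size_t complete_line_len(const char *buf, size_t len)
{
	const char *nl = memchr(buf, '\n', len);

	return nl ? (size_t) (nl - buf) + 1 : 0;
}

/*
 * Copy one line into 'out' and remove it from the buffer.
 * Returns the number of bytes copied, or 0 when no complete line is
 * available. Data without a newline is deliberately left in place,
 * which is the case the buggy behaviour got wrong.
 */
static size_t linebuf_readline(struct linebuf *b, char *out, size_t out_size)
{
	size_t n;

	assert(b->len <= sizeof(b->data));

	n = complete_line_len(b->data, b->len);
	if (!n)
		return 0;
	if (n > out_size)
		n = out_size;

	memcpy(out, b->data, n);
	memmove(b->data, b->data + n, b->len - n);
	b->len -= n;
	return n;
}
```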
2024-03-13 | bcachefs: thread_with_stdio: kill thread_with_stdio_done() | Kent Overstreet | 3 | -22/+23
Move the cleanup code to a wrapper function, where we can call it after the thread_with_stdio fn exits. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_stdio: convert to darray | Kent Overstreet | 4 | -100/+160
- eliminate the dependency on printbufs, so that we can lift thread_with_file for use in xfs
- add a nonblocking parameter to stdio_redirect_printf(), and either block if the buffer is full or drop it on the floor
- don't buffer infinitely
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: thread_with_stdio: eliminate double buffering | Kent Overstreet | 2 | -41/+18
The output buffer lock has to be a spinlock so that we can write to it from interrupt context, so we can't use a direct copy_to_user; this switches thread_with_file_read() to use fault_in_writeable() and copy_to_user_nofault(), similar to how thread_with_file_write() works. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | bcachefs: kill kvpmalloc() | Kent Overstreet | 14 | -115/+49
Signed-off-by: Kent Overstreet <[email protected]>
2024-03-13 | mempool: kvmalloc pool | Kent Overstreet | 2 | -0/+26
Add mempool_init_kvmalloc_pool() and mempool_create_kvmalloc_pool(), which wrap kvmalloc() instead of kmalloc() - kmalloc() with a vmalloc() fallback. This is part of a bcachefs cleanup - dropping an internal kvpmalloc() helper (which predates kvmalloc()) along with mempool helpers; this replaces the bcachefs-private kvpmalloc_pool. Signed-off-by: Kent Overstreet <[email protected]> Cc: [email protected]
2024-03-13 | mtd: spi-nor: core: correct type of i | Muhammad Usama Anjum | 1 | -1/+1
i must be signed for the loop's end condition to work. Otherwise, i >= 0 is always true and the loop becomes infinite. Make its type int. Fixes: 6a9eda34418f ("mtd: spi-nor: core: set mtd->eraseregions for non-uniform erase map") Signed-off-by: Muhammad Usama Anjum <[email protected]> Reviewed-by: Tudor Ambarus <[email protected]> Reviewed-by: Michael Walle <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Link: https://lore.kernel.org/linux-mtd/[email protected]
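The class of bug fixed here is easy to reproduce in isolation. The sketch below is not the spi-nor code; `last_nonzero()` is a hypothetical helper that walks an array backwards, which is the loop shape where an unsigned counter never fails the `i >= 0` test.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Return the index of the last nonzero element, or -1 if there is none. */
static int last_nonzero(const unsigned char *v, size_t nr)
{
	assert(nr <= INT_MAX);

	/*
	 * i has to be signed. With an unsigned type, "i >= 0" is always
	 * true: decrementing past 0 wraps around to a huge value instead
	 * of going negative, so the loop would never end.
	 */
	for (int i = (int) nr - 1; i >= 0; i--)
		if (v[i])
			return i;

	return -1;
}
```

Compilers can often flag the unsigned variant with a warning that the comparison is always true, for example GCC's `-Wtype-limits`.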
2024-03-13 | Merge tag 'spi-nor/for-6.9' into mtd/next | Miquel Raynal | 5 | -165/+128
SPI NOR gets the non uniform erase code cleaned. We stopped using bitmasks for erase types and flags, and instead introduced dedicated members. We then passed the SPI NOR erase map to MTD. Users can now determine the erase regions and make informed decisions on partitions size.
2024-03-13 | Revert "block/mq-deadline: use correct way to throttling write requests" | Bart Van Assche | 1 | -2/+1
The code "max(1U, 3 * (1U << shift) / 4)" comes from the Kyber I/O scheduler. The Kyber I/O scheduler maintains one internal queue per hwq and hence derives its async_depth from the number of hwq tags. Using this approach for the mq-deadline scheduler is wrong since the mq-deadline scheduler maintains one internal queue for all hwqs combined. Hence this revert. Cc: [email protected] Cc: Damien Le Moal <[email protected]> Cc: Harshit Mogalapalli <[email protected]> Cc: Zhiguo Niu <[email protected]> Fixes: d47f9717e5cf ("block/mq-deadline: use correct way to throttling write requests") Signed-off-by: Bart Van Assche <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
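For reference, the quoted expression evaluates to three quarters of a power of two, clamped to at least 1. The sketch below simply wraps it in a function so the values can be checked; the function name `three_quarters_depth()` is hypothetical, `shift` is taken from the quoted expression, and a 32-bit `unsigned int` is assumed.

```c
#include <assert.h>

/* max(1U, 3 * (1U << shift) / 4), written out without the max() macro. */
static unsigned int three_quarters_depth(unsigned int shift)
{
	unsigned int depth;

	/* 3 * (1U << shift) must not overflow a 32-bit unsigned int. */
	assert(shift <= 30);

	depth = 3 * (1U << shift) / 4;
	return depth > 1U ? depth : 1U;
}
```

With a shift of 6, for instance, this gives 3 * 64 / 4 = 48; for shifts of 0 and 1 the integer division yields 0 and 1, so the result is 1 in both cases.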
2024-03-13 | Merge tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs | Linus Torvalds | 32 | -423/+608
Pull ext2, isofs, udf, and quota updates from Jan Kara:
"A lot of material this time:
- removal of a lot of GFP_NOFS usage from ext2, udf, quota (either it was legacy or replaced with scoped memalloc_nofs_*() API)
- removal of BUG_ONs in quota code
- conversion of UDF to the new mount API
- tightening quota on disk format verification
- fix some potentially unsafe use of RCU pointers in quota code and annotate everything properly to make sparse happy
- a few other small quota, ext2, udf, and isofs fixes"
* tag 'fs_for_v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (26 commits)
  udf: remove SLAB_MEM_SPREAD flag usage
  quota: remove SLAB_MEM_SPREAD flag usage
  isofs: remove SLAB_MEM_SPREAD flag usage
  ext2: remove SLAB_MEM_SPREAD flag usage
  ext2: mark as deprecated
  udf: convert to new mount API
  udf: convert novrs to an option flag
  MAINTAINERS: add missing git address for ext2 entry
  quota: Detect loops in quota tree
  quota: Properly annotate i_dquot arrays with __rcu
  quota: Fix rcu annotations of inode dquot pointers
  isofs: handle CDs with bad root inode but good Joliet root directory
  udf: Avoid invalid LVID used on mount
  quota: Fix potential NULL pointer dereference
  quota: Drop GFP_NOFS instances under dquot->dq_lock and dqio_sem
  quota: Set nofs allocation context when acquiring dqio_sem
  ext2: Remove GFP_NOFS use in ext2_xattr_cache_insert()
  ext2: Drop GFP_NOFS use in ext2_get_blocks()
  ext2: Drop GFP_NOFS allocation from ext2_init_block_alloc_info()
  udf: Remove GFP_NOFS allocation in udf_expand_file_adinicb()
  ...