blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message	Author	Files	Lines
2024-03-13	bcachefs: kill kvpmalloc()	Kent Overstreet	14	-115/+49
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: bch2_lookup() gives better error message on inode not found	Kent Overstreet	1	-9/+64
	When a dirent points to a missing inode, we really should print out the dirent. This requires quite a bit of refactoring, but there are some other benefits: we now do the entire lookup (dirent and inode) in a single btree transaction, and copy to the VFS inode with btree locks still held, like the create path. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: bch2_inode_insert()	Kent Overstreet	1	-62/+76
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: factor out check_inode_backpointer()	Kent Overstreet	1	-9/+29
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Factor out check_subvol_dirent()	Kent Overstreet	1	-48/+57
	Going to be adding more code here for checking subvol structure. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Kill some -EINVALs	Kent Overstreet	2	-5/+5
	Repurposing standard error codes in bcachefs code is banned in new code, and we need to get rid of the remaining ones - private error codes give us much better error messages. Signed-off-by: Kent Overstreet <[email protected]>
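	A minimal sketch of why private error codes give better messages: each failure site gets its own named code, and the code is collapsed to a standard errno class only at the boundary. Every name below (demo_err, demo_err_str, demo_err_class) is hypothetical and not the bcachefs API; errors are passed as negative values, kernel style.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical private error codes, numbered above the standard errno range. */
enum demo_err {
	DEMO_ERR_START = 2048,
	DEMO_ERR_invalid_sb_magic,
	DEMO_ERR_invalid_sb_block_size,
	DEMO_ERR_MAX,
};

/* Name of a private code; anything else falls back to strerror(). */
const char *demo_err_str(int err)
{
	assert(err <= 0);	/* errors are negative, 0 is success */

	switch (-err) {
	case DEMO_ERR_invalid_sb_magic:
		return "invalid_sb_magic";
	case DEMO_ERR_invalid_sb_block_size:
		return "invalid_sb_block_size";
	default:
		return strerror(-err);
	}
}

/* Collapse a private code to the standard errno class it belongs to. */
int demo_err_class(int err)
{
	assert(err <= 0);

	if (-err > DEMO_ERR_START && -err < DEMO_ERR_MAX)
		return -EINVAL;	/* all private codes in this sketch are EINVAL-class */
	return err;		/* already a standard errno */
}
```

	In this sketch a log message can print demo_err_str(ret) to say which check failed, while callers that only understand errno values still see -EINVAL via demo_err_class(ret).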
2024-03-10	bcachefs: bump max_active on btree_interior_update_worker	Kent Overstreet	1	-1/+1
	WQ_UNBOUND with max_active 1 means ordered workqueue, but we don't actually need or want ordered semantics - and probably want a higher concurrency limit anyways. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: move fsck_write_inode() to inode.c	Kent Overstreet	3	-40/+44
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Initialize super_block->s_uuid	Kent Overstreet	1	-0/+1
	Need to fix this oversight for the new FS_IOC_(GET\|SET)UUID ioctls. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Switch to uuid_to_fsid()	Kent Overstreet	1	-5/+1
	Switch the statfs code from something horrible and open coded to the more standard uuid_to_fsid(). Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Subvolumes may now be renamed	Kent Overstreet	2	-26/+55
	Files within a subvolume cannot be renamed into another subvolume, but subvolumes themselves were intended to be. This implements subvolume renaming - we need to ensure that there's only a single dirent that points to a subvolume key (not multiple versions in different snapshots), and we need to ensure that dirent.d_parent_subvol and inode.bi_parent_subvol are updated. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: btree node prefetching in check_topology	Kent Overstreet	4	-3/+42
	btree_and_journal_iter is old code that we want to get rid of, but we're not ready to yet. lack of btree node prefetching is, it turns out, a real performance issue for fsck on spinning rust, so - add it. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: btree_and_journal_iter.trans	Kent Overstreet	4	-17/+21
	we now always have a btree_trans when using a btree_and_journal_iter; prep work for adding prefetching to btree_and_journal_iter Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: better journal pipelining	Kent Overstreet	4	-59/+98
	Recently a severe performance regression was discovered, which bisected to a6548c8b5eb5 bcachefs: Avoid flushing the journal in the discard path It turns out the old behaviour, which issued excessive journal flushes, worked around a performance issue where queueing delays would cause the journal to not be able to write quickly enough and stall. The journal flushes masked the issue because they periodically flushed the device write cache, reducing write latency for non flushes. This patch reworks the journalling code to allow more than one (non-flush) write to be in flight at a time. With this patch, doing 4k random writes and an iodepth of 128, we are now able to hit 560k iops to a Samsung 970 EVO Plus - previously, we were stuck in the ~200k range. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: closure per journal buf	Kent Overstreet	3	-23/+41
	Prep work for having multiple journal writes in flight. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: bio per journal buf	Kent Overstreet	3	-29/+34
	Prep work for having multiple journal writes in flight. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: jset_entry_datetime	Kent Overstreet	4	-17/+67
	This gives us a way to record the date and time every journal entry was written - useful for debugging. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: improve journal entry read fsck error messages	Kent Overstreet	1	-41/+55
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: convert journal replay ptrs to darray	Kent Overstreet	3	-58/+36
	Eliminates some error paths - no longer have a hardcoded BCH_REPLICAS_MAX limit. Signed-off-by: Kent Overstreet <[email protected]>
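	A rough userspace sketch of the idea behind moving from a fixed-size pointer array to a growable one: the "too many entries" check disappears because the array grows on demand. The type and function names (ptr_array, ptr_array_push, ptr_array_exit) are made up for illustration and are not the kernel's darray interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array of device indices. */
struct ptr_array {
	unsigned	*data;
	size_t		nr;	/* entries in use */
	size_t		size;	/* entries allocated */
};

/* Append v, doubling the allocation when full. Returns 0, or -1 on allocation failure. */
int ptr_array_push(struct ptr_array *a, unsigned v)
{
	assert(a->nr <= a->size);

	if (a->nr == a->size) {
		size_t new_size = a->size ? a->size * 2 : 4;
		unsigned *d = realloc(a->data, new_size * sizeof(*d));

		if (!d)
			return -1;	/* old buffer is still valid */
		a->data = d;
		a->size = new_size;
	}

	a->data[a->nr++] = v;
	return 0;
}

/* Free the backing storage and reset to the empty state. */
void ptr_array_exit(struct ptr_array *a)
{
	free(a->data);
	a->data = NULL;
	a->nr = a->size = 0;
}
```

	A zero-initialized struct ptr_array is a valid empty array in this sketch, so no separate init call is needed.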
2024-03-10	bcachefs: Cleanup bch2_dirent_lookup_trans()	Kent Overstreet	3	-26/+14
	Drop an unnecessary bch2_subvolume_get_snapshot() call, and drop the __ from the name - this is a normal interface. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: bch2_hash_set_snapshot() -> bch2_hash_set_in_snapshot()	Kent Overstreet	3	-18/+12
	Minor renaming for clarity, bit of refactoring. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Workqueues should be WQ_HIGHPRI	Kent Overstreet	1	-4/+4
	Most bcachefs workqueues are used for completions, and should be WQ_HIGHPRI - this helps reduce queuing delays, we want to complete quickly once we can no longer signal backpressure by blocking. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Improve bch2_dirent_to_text()	Kent Overstreet	1	-9/+11
	For DT_SUBVOL, we now print both parent and child subvol IDs. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: fixup for building in userspace	Kent Overstreet	1	-1/+1
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Avoid taking journal lock unnecessarily	Kent Overstreet	2	-53/+55
	Previously, any time we failed to get a journal reservation we'd retry, with the journal lock held; but this isn't necessary given wait_event()/wake_up() ordering. This avoids performance cliffs when the journal starts to get backed up and lock contention shoots up. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Journal writes should be REQ_SYNC\|REQ_META	Kent Overstreet	1	-1/+1
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Avoid setting j->write_work unnecessarily	Kent Overstreet	1	-13/+11
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Split out journal workqueue	Kent Overstreet	3	-16/+19
	We don't want journal write completions to be blocked behind btree transactions - io_complete_wq is used for btree updates after data and metadata writes. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Kill unnecessary wakeups in journal reclaim	Kent Overstreet	1	-11/+9
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: skip invisible entries in empty subvolume checking	Guoyu Ou	3	-5/+9
	When we are checking whether a subvolume is empty in the specified snapshot, entries that do not belong to this subvolume should be skipped. This fixes the following case: $ bcachefs subvolume create ./sub $ cd sub $ bcachefs subvolume create ./sub2 $ bcachefs subvolume snapshot . ./snap $ ls -a snap . .. $ rmdir snap rmdir: failed to remove 'snap': Directory not empty As Kent suggested, we pass 0 in may_delete_deleted_inode() to ignore subvols in the subvol we are checking, because inode.bi_subvol is only set on subvolume roots, and we can't go through every inode in the subvolume and change bi_subvol when taking a snapshot. It makes the check less strict, but that's ok, the rest of fsck will still catch it. Signed-off-by: Guoyu Ou <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: fix split brain message	Kent Overstreet	1	-1/+1
	Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Set path->uptodate when no node at level	Kent Overstreet	1	-2/+2
	We were failing to set path->uptodate when reaching the end of a btree node iterator, causing the new prefetch code for backpointers gc to go into an infinite loop. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Correctly validate k->u64s in btree node read path	Kent Overstreet	3	-6/+27
	validate_bset_keys() never properly validated k->u64s; it checked if it was 0, but not if it was smaller than keys for the given packed format; this fixes that small oversight. This patch was backported, so it's adding quite a few error enums so that they don't get renumbered and we don't have confusing gaps. Signed-off-by: Kent Overstreet <[email protected]>
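	The check described above can be sketched as follows. This is a simplified stand-in, not the actual validate_bset_keys() code: the structs, the DEMO_UNPACKED_KEY_U64S constant and the returned strings are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: size of an unpacked key header, in u64s. */
#define DEMO_UNPACKED_KEY_U64S	5

struct demo_format {
	uint8_t key_u64s;	/* size of a packed key header in this node, in u64s */
};

struct demo_key {
	uint8_t	u64s;		/* total size of key + value, in u64s */
	bool	packed;		/* true if the key uses the node's packed format */
};

/* Returns NULL if k->u64s is plausible, else a short description of the problem. */
const char *demo_validate_u64s(const struct demo_key *k,
			       const struct demo_format *f)
{
	assert(k && f);

	unsigned min_u64s = k->packed ? f->key_u64s : DEMO_UNPACKED_KEY_U64S;

	if (!k->u64s)
		return "u64s is zero";
	/* The check that was missing: the key must at least hold its own header. */
	if (k->u64s < min_u64s)
		return "u64s smaller than key for this format";
	return NULL;
}
```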
2024-03-10	bcachefs: Fix degraded mode fsck	Kent Overstreet	1	-18/+18
	We don't know where the superblock and journal lives on offline devices; that means if a device is offline fsck can't check those buckets. Previously, fsck would incorrectly clear bucket data types for those buckets on offline devices; now we just use the previous state. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: Fix journal replay with unreadable btree roots	Kent Overstreet	4	-6/+70
	When a btree root is unreadable, we still might be able to get some data back by replaying what's in the journal. Previously though, we got confused when journal replay would attempt to replay a key for a level that didn't exist. This adds bch2_btree_increase_depth(), so that journal replay can handle this. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: fix check_inode_deleted_list()	Kent Overstreet	1	-6/+3
	check_inode_deleted_list() returns true if the inode is on the deleted list; check_inode() was checking the return code incorrectly. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: no_splitbrain_check option	Kent Overstreet	2	-8/+22
	This adds an option to disable kicking out devices when splitbrain is detected - it seems there are some issues with splitbrain detection and we're kicking out devices erroneously. Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: extent_entry_next_safe()	Kent Overstreet	1	-3/+8
	We need to be able to iterate over extent ptrs that may be corrupted in order to print them - this fixes a bug where we'd pop an assert in bch2_bkey_durability_safe(). Signed-off-by: Kent Overstreet <[email protected]>
2024-03-10	bcachefs: journal_seq_blacklist_add() now handles entries being added out of order	Kent Overstreet	2	-46/+22
	bch2_journal_seq_blacklist_add() was bugged when the new entry overlapped with multiple existing entries, and it also assumed new entries are being added in increasing order. This is true on any sane filesystem, but when trying to recover from very badly mangled filesystems we might end up with the journal sequence number rewinding vs. what the blacklist list knows about - easiest to just handle that here. Signed-off-by: Kent Overstreet <[email protected]>
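	One way to handle both cases from the description (a new entry overlapping several existing ones, and entries arriving out of order) is a single pass over a sorted table that absorbs every overlapping range. The sketch below only illustrates that technique; the types, the fixed MAX_RANGES table and the half-open range convention are assumptions, and it does not reproduce the actual bch2_journal_seq_blacklist_add() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANGES 16

struct seq_range {
	uint64_t start, end;	/* half-open: [start, end) */
};

/* Sorted by start, with no two ranges overlapping or touching. */
struct seq_blacklist {
	size_t			nr;
	struct seq_range	r[MAX_RANGES];
};

/*
 * Insert [start, end), merging it with every existing range it overlaps
 * or abuts. Works for any insertion order. Returns 0, or -1 if the table
 * would overflow (in which case the table is left unchanged).
 */
int seq_blacklist_add(struct seq_blacklist *bl, uint64_t start, uint64_t end)
{
	struct seq_range out[MAX_RANGES + 1];
	size_t i, n = 0;
	int placed = 0;

	assert(start < end);
	assert(bl->nr <= MAX_RANGES);

	for (i = 0; i < bl->nr; i++) {
		struct seq_range e = bl->r[i];

		if (e.end < start) {
			out[n++] = e;		/* entirely before the new range */
		} else if (e.start > end) {
			if (!placed) {		/* new range goes before the first later entry */
				out[n++] = (struct seq_range) { start, end };
				placed = 1;
			}
			out[n++] = e;		/* entirely after the new range */
		} else {
			/* overlaps or abuts: absorb it into the new range */
			if (e.start < start)
				start = e.start;
			if (e.end > end)
				end = e.end;
		}
	}

	if (!placed)
		out[n++] = (struct seq_range) { start, end };
	if (n > MAX_RANGES)
		return -1;

	for (i = 0; i < n; i++)
		bl->r[i] = out[i];
	bl->nr = n;
	return 0;
}
```

	Because the new range is widened as it absorbs neighbours, one call can collapse any number of existing entries into a single entry, and an entry with a lower sequence number than everything already present simply lands at the front of the table.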
2024-03-10	bcachefs: Fix null-ptr-deref in bch2_fs_alloc()	Li Zetao	1	-3/+3
	There is a null-ptr-deref issue reported by kasan: KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] Call Trace: <TASK> bch2_fs_alloc+0x1092/0x2170 [bcachefs] bch2_fs_open+0x683/0xe10 [bcachefs] ... When initializing the name of bch_fs, it needs to dynamically alloc memory to meet the length of the name. However, when name allocation failed, it will cause a null-ptr-deref access exception in subsequent string copy. Fix this issue by checking if name allocation is successful. Fixes: 401ec4db6308 ("bcachefs: Printbuf rework") Signed-off-by: Li Zetao <[email protected]> Signed-off-by: Kent Overstreet <[email protected]>
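	The bug class described here, copying into a buffer whose allocation was never checked, looks like this in a userspace sketch. struct demo_fs and demo_fs_set_name() are hypothetical stand-ins for illustration, not the bch2_fs_alloc() code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical filesystem object with a dynamically sized name. */
struct demo_fs {
	char *name;
};

/* Replace fs->name with a copy of name. Returns 0 or -ENOMEM. */
int demo_fs_set_name(struct demo_fs *fs, const char *name)
{
	assert(fs && name);

	char *buf = malloc(strlen(name) + 1);

	/* The fix: bail out before the copy if the allocation failed. */
	if (!buf)
		return -ENOMEM;

	strcpy(buf, name);	/* safe: buf holds strlen(name) + 1 bytes */
	free(fs->name);
	fs->name = buf;
	return 0;
}
```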
2024-02-25	Merge tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs	Linus Torvalds	7	-22/+25
	Pull bcachefs fixes from Kent Overstreet: "Some more mostly boring fixes, but some not User reported ones: - the BTREE_ITER_FILTER_SNAPSHOTS one fixes a really nasty performance bug; user reported an untar initially taking two seconds and then ~2 minutes - kill a __GFP_NOFAIL in the buffered read path; this was a leftover from the trickier fix to kill __GFP_NOFAIL in readahead, where we can't return errors (and have to silently truncate the read ourselves). bcachefs can't use GFP_NOFAIL for folio state unlike iomap based filesystems because our folio state is just barely too big, 2MB hugepages cause us to exceed the 2 page threshold for GFP_NOFAIL. additionally, the flags argument was just buggy, we weren't supplying GFP_KERNEL previously (!)" * tag 'bcachefs-2024-02-25' of https://evilpiepirate.org/git/bcachefs: bcachefs: fix bch2_save_backtrace() bcachefs: Fix check_snapshot() memcpy bcachefs: Fix bch2_journal_flush_device_pins() bcachefs: fix iov_iter count underflow on sub-block dio read bcachefs: Fix BTREE_ITER_FILTER_SNAPSHOTS on inodes btree bcachefs: Kill __GFP_NOFAIL in buffered read path bcachefs: fix backpointer_to_text() when dev does not exist
2024-02-25	bcachefs: fix bch2_save_backtrace()	Kent Overstreet	1	-1/+1
	Missed a call in the previous fix. Signed-off-by: Kent Overstreet <[email protected]>
2024-02-25	Merge tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs	Linus Torvalds	1	-14/+14
	Pull erofs fix from Gao Xiang: - Fix page refcount leak when looking up specific inodes introduced by metabuf reworking * tag 'erofs-for-6.8-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix refcount on the metabuf used for inode lookup
2024-02-25	Merge tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds	20	-63/+85
	Pull RCU pathwalk fixes from Al Viro: "We still have some races in filesystem methods when exposed to RCU pathwalk. This series is a result of code audit (the second round of it) and it should deal with most of that stuff. Still pending: ntfs3 ->d_hash()/->d_compare() and ceph_d_revalidate(). Up to maintainers (a note for NTFS folks - when documentation says that a method may not block, it does imply that blocking allocations are to be avoided. Really)" [ More explanations for people who aren't familiar with the vagaries of RCU path walking: most of it is hidden from filesystems, but if a filesystem actively participates in the low-level path walking it needs to make sure the fields involved in that walk are RCU-safe. That "actively participate in low-level path walking" includes things like having its own ->d_hash()/->d_compare() routines, or by having its own directory permission function that doesn't just use the common helpers. Having a ->d_revalidate() function will also have this issue. Note that instead of making everything RCU safe you can also choose to abort the RCU pathwalk if your operation cannot be done safely under RCU, but that obviously comes with a performance penalty. One common pattern is to allow the simple cases under RCU, and abort only if you need to do something more complicated. So not everything needs to be RCU-safe, and things like the inode etc that the VFS itself maintains obviously already are. But these fixes tend to be about properly RCU-delaying things like ->s_fs_info that are maintained by the filesystem and that got potentially released too early. - Linus ] * tag 'pull-fixes.pathwalk-rcu-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ext4_get_link(): fix breakage in RCU mode cifs_get_link(): bail out in unsafe case fuse: fix UAF in rcu pathwalks procfs: make freeing proc_fs_info rcu-delayed procfs: move dropping pde and pid from ->evict_inode() to ->free_inode() nfs: fix UAF on pathwalk running into umount nfs: make nfs_set_verifier() safe for use in RCU pathwalk afs: fix __afs_break_callback() / afs_drop_open_mmap() race hfsplus: switch to rcu-delayed unloading of nls and freeing ->s_fs_info exfat: move freeing sbi, upcase table and dropping nls into rcu-delayed helper affs: free affs_sb_info with kfree_rcu() rcu pathwalk: prevent bogus hard errors from may_lookup() fs/super.c: don't drop ->s_user_ns until we free struct super_block itself
2024-02-25	Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds	2	-4/+8
	Pull vfs fixes from Al Viro: "A couple of fixes - revert of regression from this cycle and a fix for erofs failure exit breakage (had been there since way back)" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: erofs: fix handling kern_mount() failure Revert "get rid of DCACHE_GENOCIDE"
2024-02-25	ext4_get_link(): fix breakage in RCU mode	Al Viro	1	-3/+5
	1) errors from ext4_getblk() should not be propagated to caller unless we are really sure that we would've gotten the same error in non-RCU pathwalk. 2) we leak buffer_heads if ext4_getblk() is successful, but bh is not uptodate. Signed-off-by: Al Viro <[email protected]>
2024-02-25	cifs_get_link(): bail out in unsafe case	Al Viro	1	-0/+3
	->d_revalidate() bails out there, anyway. It's not enough to prevent getting into ->get_link() in RCU mode, but that could happen only in a very contrived setup. Not worth trying to do anything fancy here unless ->d_revalidate() stops kicking out of RCU mode at least in some cases. Reviewed-by: Christian Brauner <[email protected]> Acked-by: Miklos Szeredi <[email protected]> Signed-off-by: Al Viro <[email protected]>
2024-02-25	fuse: fix UAF in rcu pathwalks	Al Viro	3	-6/+13
	->permission(), ->get_link() and ->inode_get_acl() might dereference ->s_fs_info (and, in case of ->permission(), ->s_fs_info->fc->user_ns as well) when called from rcu pathwalk. Freeing ->s_fs_info->fc is rcu-delayed; we need to make freeing ->s_fs_info and dropping ->user_ns rcu-delayed too. Signed-off-by: Al Viro <[email protected]>
2024-02-25	procfs: make freeing proc_fs_info rcu-delayed	Al Viro	1	-1/+1
	makes proc_pid_ns() safe from rcu pathwalk (put_pid_ns() is still synchronous, but that's not a problem - it does rcu-delay everything that needs to be) Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Al Viro <[email protected]>
2024-02-25	procfs: move dropping pde and pid from ->evict_inode() to ->free_inode()	Al Viro	2	-13/+8
	that keeps both around until struct inode is freed, making access to them safe from rcu-pathwalk Acked-by: Christian Brauner <[email protected]> Signed-off-by: Al Viro <[email protected]>