aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2022-11-28NFSD: Finish converting the NFSv3 GETACL result encoderChuck Lever1-24/+6
For some reason, the NFSv2 GETACL result encoder was fully converted to use the new nfs_stream_encode_acl(), but the NFSv3 equivalent was not similarly converted. Fixes: 20798dfe249a ("NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream") Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2022-11-28NFSD: Finish converting the NFSv2 GETACL result encoderChuck Lever1-10/+0
The xdr_stream conversion inadvertently left some code that set the page_len of the send buffer. The XDR stream encoders should handle this automatically now. This oversight adds garbage past the end of the Reply message. Clients typically ignore the garbage, but NFSD does not need to send it, as it leaks stale memory contents onto the wire. Fixes: f8cba47344f7 ("NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream") Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2022-11-28NFSD: Remove redundant assignment to variable host_errColin Ian King1-1/+0
Variable host_err is assigned a value that is never read, it is being re-assigned a value in every different execution path in the following switch statement. The assignment is redundant and can be removed. Cleans up clang-scan warning: warning: Value stored to 'host_err' is never read [deadcode.DeadStores] Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2022-11-28NFSD: Simplify READ_PLUSAnna Schumaker1-107/+32
Chuck had suggested reverting READ_PLUS so it returns a single DATA segment covering the requested read range. This prepares the server for a future "sparse read" function so support can easily be added without needing to rip out the old READ_PLUS code at the same time. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2022-11-28fs/ext2: Fix code indentationRong Tao2-7/+7
ts=4 can cause misunderstanding in code reading. It is better to replace 8 spaces with one tab. Signed-off-by: Rong Tao <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-11-28ovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flagsAl Viro1-0/+1
ovl_change_flags() is an open-coded variant of fs/fcntl.c:setfl() and it got missed by commit 164f4064ca81 ("keep iocb_flags() result cached in struct file"); the same change applies there. Reported-by: Pierre Labastie <[email protected]> Fixes: 164f4064ca81 ("keep iocb_flags() result cached in struct file") Cc: <[email protected]> # v6.0 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216738 Signed-off-by: Al Viro <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-28ext2: unbugger ext2_empty_dir()Al Viro1-1/+1
In 27cfa258951a "ext2: fix fs corruption when trying to remove a non-empty directory with IO error" a funny thing has happened: - page = ext2_get_page(inode, i, dir_has_error, &page_addr); + page = ext2_get_page(inode, i, 0, &page_addr); - if (IS_ERR(page)) { - dir_has_error = 1; - continue; - } + if (IS_ERR(page)) + goto not_empty; And at not_empty: we hit ext2_put_page(page, page_addr), which does put_page(page). Which, unless I'm very mistaken, should oops immediately when given ERR_PTR(-E...) as page. OK, shit happens, insufficiently tested patches included. But when commit in question describes the fault-injection test that exercised that particular failure exit... Ow. CC: [email protected] Fixes: 27cfa258951a ("ext2: fix fs corruption when trying to remove a non-empty directory with IO error") Signed-off-by: Al Viro <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2022-11-28ovl: fix use inode directly in rcu-walk modeChen Zhongjin1-1/+6
ovl_dentry_revalidate_common() can be called in rcu-walk mode. As document said, "in rcu-walk mode, d_parent and d_inode should not be used without care". Check inode here to protect access under rcu-walk mode. Fixes: bccece1ead36 ("ovl: allow remote upper") Reported-and-tested-by: [email protected] Signed-off-by: Chen Zhongjin <[email protected]> Cc: <[email protected]> # v5.7 Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-27NFS: Fix a race in nfs_call_unlink()Trond Myklebust1-0/+1
We should check that the filehandles match before transferring the sillyrename data to the newly looked-up dentry in case the name was reused on the server. Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFS: Fix an Oops in nfs_d_automount()Trond Myklebust1-1/+1
When mounting from a NFSv4 referral, path->dentry can end up being a negative dentry, so derive the struct nfs_server from the dentry itself instead. Fixes: 2b0143b5c986 ("VFS: normal filesystems (and lustre): d_inode() annotations") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFSv4: Fix a deadlock between nfs4_open_recover_helper() and delegreturnTrond Myklebust1-7/+12
If we're asked to recover open state while a delegation return is outstanding, then the state manager thread cannot use a cached open, so if the server returns a delegation, we can end up deadlocked behind the pending delegreturn. To avoid this problem, let's just ask the server not to give us a delegation unless we're explicitly reclaiming one. Fixes: be36e185bd26 ("NFSv4: nfs4_open_recover_helper() must set share access") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFSv4: Fix a credential leak in _nfs4_discover_trunking()Trond Myklebust1-1/+3
Fixes: 4f40a5b55446 ("NFSv4: Add an fattr allocation to _nfs4_discover_trunking()") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFS: Trigger the "ls -l" readdir heuristic soonerBenjamin Coddington1-2/+7
Since commit 1a34c8c9a49e ("NFS: Support larger readdir buffers") has updated dtsize, and with recent improvements to the READDIRPLUS helper heuristic, the heuristic may not trigger until many dentries are emitted to userspace. This will cause many thousands of GETATTR calls for "ls -l" when the directory's pagecache has already been populated. This manifests as poor performance for long directory listings after an initially fast "ls -l". Fix this by emitting only 17 entries for any first pass through the NFS directory's ->iterate_shared(), which allows userpace to prime the counters for the heuristic. Signed-off-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFSv4.2: Fix initialisation of struct nfs4_labelTrond Myklebust1-5/+10
The call to nfs4_label_init_security() should return a fully initialised label. Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFSv4.2: Fix a memory stomp in decode_attr_security_labelTrond Myklebust1-6/+4
We must not change the value of label->len if it is zero, since that indicates we stored a label. Fixes: b4487b935452 ("nfs: Fix getxattr kernel panic and memory overflow") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFSv4.2: Always decode the security labelTrond Myklebust1-6/+4
If the server returns a reply that includes a security label, then we must decode it whether or not we can store the results. Fixes: 1e2f67da8931 ("NFS: Remove the nfs4_label argument from decode_getattr_*() functions") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFSv4.2: Clear FATTR4_WORD2_SECURITY_LABEL when done decodingTrond Myklebust1-1/+1
We need to clear the FATTR4_WORD2_SECURITY_LABEL bitmap flag irrespective of whether or not the label is too long. Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27NFS: Clear the file access cache upon loginTrond Myklebust1-0/+23
POSIX typically only refreshes the user's supplementary group information upon login. Since NFS servers may often refresh their concept of the user supplementary group membership at their own cadence, it is possible for the NFS client's access cache to become stale due to the user's group membership changing on the server after the user has already logged in on the client. While it is reasonable to expect that such group membership changes are rare, and that we do not want to optimise the cache to accommodate them, it is also not unreasonable for the user to expect that if they log out and log back in again, that the staleness would clear up. Reviewed-by: Benjamin Coddington <[email protected]> Tested-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2022-11-27Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds3-9/+20
Pull vfs fix from Al Viro: "Amir's copy_file_range() fix" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: fix copy_file_range() averts filesystem freeze protection
2022-11-27Merge tag '6.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2-3/+5
Pull cifs fixes from Steve French: "Two small cifs/smb3 client fixes: - an unlock missing in an error path in copychunk_range found by xfstest 476 - a fix for a use after free in a debug code path" * tag '6.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix missing unlock in cifs_file_copychunk_range() cifs: Use after free in debug code
2022-11-26Merge tag 'nfsd-6.1-6' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix rare data corruption on READ operations * tag 'nfsd-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Fix reads with a non-zero offset that don't end on a page boundary
2022-11-25Merge tag 'zonefs-6.1-rc7' of ↵Linus Torvalds2-8/+21
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fixes from Damien Le Moal: - Fix a race between zonefs module initialization of sysfs attribute directory and mounting a drive (from Xiaoxu). - Fix active zone accounting in the rare case of an IO error due to a zone transition to offline or read-only state (from me). * tag 'zonefs-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: Fix active zone accounting zonefs: Fix race between modprobe and mount
2022-11-25Merge tag 'for-6.1-rc6-tag' of ↵Linus Torvalds7-35/+132
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix a regression in nowait + buffered write - in zoned mode fix endianness when comparing super block generation - locking and lockdep fixes: - fix potential sleeping under spinlock when setting qgroup limit - lockdep warning fixes when btrfs_path is freed after copy_to_user - do not modify log tree while holding a leaf from fs tree locked - fix freeing of sysfs files of static features on error - use kv.alloc for zone map allocation as a fallback to avoid warnings due to high order allocation - send, avoid unaligned encoded writes when attempting to clone range * tag 'for-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: sysfs: normalize the error handling branch in btrfs_init_sysfs() btrfs: do not modify log tree while holding a leaf from fs tree locked btrfs: use kvcalloc in btrfs_get_dev_zone_info btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit() btrfs: send: avoid unaligned encoded writes when attempting to clone range btrfs: zoned: fix missing endianness conversion in sb_write_pointer btrfs: free btrfs_path before copying subvol info to userspace btrfs: free btrfs_path before copying fspath to userspace btrfs: free btrfs_path before copying inodes to userspace btrfs: free btrfs_path before copying root refs to userspace btrfs: fix assertion failure and blocking during nowait buffered write
2022-11-25btrfs: replace INT_LIMIT(loff_t) with OFFSET_MAXZhen Lei1-3/+3
OFFSET_MAX is self-annotated and more readable. Signed-off-by: Zhen Lei <[email protected]> Acked-by: David Sterba <[email protected]> Reviewed-by: Eric Biggers <[email protected]> Signed-off-by: Al Viro <[email protected]>
2022-11-25fscrypt: add comment for fscrypt_valid_enc_modes_v1()Eric Biggers1-0/+7
Make it clear that nothing new should be added to this function. Signed-off-by: Eric Biggers <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-25Merge tag 'mm-hotfixes-stable-2022-11-24' of ↵Linus Torvalds2-1/+9
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "24 MM and non-MM hotfixes. 8 marked cc:stable and 16 for post-6.0 issues. There have been a lot of hotfixes this cycle, and this is quite a large batch given how far we are into the -rc cycle. Presumably a reflection of the unusually large amount of MM material which went into 6.1-rc1" * tag 'mm-hotfixes-stable-2022-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (24 commits) test_kprobes: fix implicit declaration error of test_kprobes nilfs2: fix nilfs_sufile_mark_dirty() not set segment usage as dirty mm/cgroup/reclaim: fix dirty pages throttling on cgroup v1 mm: fix unexpected changes to {failslab|fail_page_alloc}.attr swapfile: fix soft lockup in scan_swap_map_slots hugetlb: fix __prep_compound_gigantic_page page flag setting kfence: fix stack trace pruning proc/meminfo: fix spacing in SecPageTables mm: multi-gen LRU: retry folios written back while isolated mailmap: update email address for Satya Priya mm/migrate_device: return number of migrating pages in args->cpages kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible MAINTAINERS: update Alex Hung's email address mailmap: update Alex Hung's email address mm: mmap: fix documentation for vma_mas_szero mm/damon/sysfs-schemes: skip stats update if the scheme directory is removed mm/memory: return vm_fault_t result from migrate_to_ram() callback mm: correctly charge compressed memory to its memcg ipc/shm: call underlying open/close vm_ops gcov: clang: fix the buffer overflow issue ...
2022-11-25Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds2-2/+12
Pull vfs fixes from Al Viro: "A couple of fixes, one of them for this cycle regression..." * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: vfs_tmpfile: ensure O_EXCL flag is enforced fs: use acquire ordering in __fget_light()
2022-11-25use less confusing names for iov_iter direction initializersAl Viro31-72/+72
READ/WRITE proved to be actively confusing - the meanings are "data destination, as used with read(2)" and "data source, as used with write(2)", but people keep interpreting those as "we read data from it" and "we write data to it", i.e. exactly the wrong way. Call them ITER_DEST and ITER_SOURCE - at least that is harder to misinterpret... Signed-off-by: Al Viro <[email protected]>
2022-11-25zonefs: Fix active zone accountingDamien Le Moal2-2/+15
If a file zone transitions to the offline or readonly state from an active state, we must clear the zone active flag and decrement the active seq file counter. Do so in zonefs_account_active() using the new zonefs inode flags ZONEFS_ZONE_OFFLINE and ZONEFS_ZONE_READONLY. These flags are set if necessary in zonefs_check_zone_condition() based on the result of report zones operation after an IO error. Fixes: 87c9ce3ffec9 ("zonefs: Add active seq file accounting") Cc: [email protected] Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]>
2022-11-25vfs: fix copy_file_range() averts filesystem freeze protectionAmir Goldstein3-9/+20
Commit 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") removed fallback to generic_copy_file_range() for cross-fs cases inside vfs_copy_file_range(). To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to generic_copy_file_range() was added in nfsd and ksmbd code, but that call is missing sb_start_write(), fsnotify hooks and more. Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that will take care of the fallback, but that code would be subtle and we got vfs_copy_file_range() logic wrong too many times already. Instead, add a flag to explicitly request vfs_copy_file_range() to perform only generic_copy_file_range() and let nfsd and ksmbd use this flag only in the fallback path. This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code paths to reduce the risk of further regressions. Fixes: 868f9f2f8e00 ("vfs: fix copy_file_range() regression in cross-fs copies") Tested-by: Namjae Jeon <[email protected]> Tested-by: Luis Henriques <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Al Viro <[email protected]>
2022-11-25fs: simplify vfs_get_superChristoph Hellwig1-51/+9
Remove the pointless keying argument and associated enum and pass the fill_super callback and a "bool reconf" instead. Also mark the function static given that there are no users outside of super.c. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Al Viro <[email protected]>
2022-11-24[elf] get rid of get_note_info_size()Al Viro1-6/+1
it's trivial now... Signed-off-by: Al Viro <[email protected]>
2022-11-24[elf] unify regset and non-regset casesAl Viro1-192/+38
The only real difference is in filling per-thread notes - getting the values of registers. And this is the only part that is worth an ifdef - we don't need to duplicate the logics regarding gathering threads, filling other notes, etc. It would've been hard to do back when regset-based variant had been introduced, mostly due to sharing bits and pieces of helpers with aout coredumps. As the result, too much had been duplicated and the copies had drifted away since then. Now it can be done cleanly... Signed-off-by: Al Viro <[email protected]>
2022-11-24[elf][non-regset] use elf_core_copy_task_regs() for dumper as wellAl Viro1-1/+1
elf_core_copy_regs() is equivalent to elf_core_copy_task_regs() of current on all architectures. Signed-off-by: Al Viro <[email protected]>
2022-11-24[elf][non-regset] uninline elf_core_copy_task_fpregs() (and lose pt_regs ↵Al Viro1-3/+2
argument) Don't bother with pointless macros - we are not sharing it with aout coredumps anymore. Just convert the underlying functions to the same arguments (nobody uses regs, actually) and call them elf_core_copy_task_fpregs(). And unexport the entire bunch, while we are at it. [added missing includes in arch/{csky,m68k,um}/kernel/process.c to avoid extra warnings about the lack of externs getting added to huge piles for those files. Pointless, but...] Signed-off-by: Al Viro <[email protected]>
2022-11-24copy_mnt_ns(): handle a corner case (overmounted mntns bindings) sanerAl Viro1-1/+2
copy_mnt_ns() has the old tree copied, with mntns binding *and* anything bound on top of them skipped. Then it proceeds to walk both trees in parallel. Unfortunately, it doesn't get the "skip the stuff we'd skipped when copying" quite right. Consequences are minor (the ->mnt_root comparison will return the situation to sanity pretty soon and the worst we get is the unexpected subset of opened non-directories being switched to new namespace), but it's confusing enough and it's not hard to get the expected behaviour... Signed-off-by: Al Viro <[email protected]>
2022-11-24Merge tag 'ext4_for_linus_stable2' of ↵Linus Torvalds2-16/+32
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Fix a regression in the lazytime code that was introduced in v6.1-rc1, and a use-after-free that can be triggered by a maliciously corrupted file system" * tag 'ext4_for_linus_stable2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: fs: do not update freeing inode i_io_list ext4: fix use-after-free in ext4_ext_shift_extents
2022-11-24driver core: make struct class.devnode() take a const *Greg Kroah-Hartman1-1/+1
The devnode() in struct class should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Cc: Fenghua Yu <[email protected]> Cc: Reinette Chatre <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: [email protected] Cc: "H. Peter Anvin" <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Justin Sanders <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Benjamin Gaignard <[email protected]> Cc: Liam Mark <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Brian Starkey <[email protected]> Cc: John Stultz <[email protected]> Cc: "Christian König" <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Leon Romanovsky <[email protected]> Cc: Dennis Dalessandro <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Sean Young <[email protected]> Cc: Frank Haverkamp <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: "Michael S. Tsirkin" <[email protected]> Cc: Jason Wang <[email protected]> Cc: Alex Williamson <[email protected]> Cc: Cornelia Huck <[email protected]> Cc: Kees Cook <[email protected]> Cc: Anton Vorontsov <[email protected]> Cc: Colin Cross <[email protected]> Cc: Tony Luck <[email protected]> Cc: Jaroslav Kysela <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: Hans Verkuil <[email protected]> Cc: Christophe JAILLET <[email protected]> Cc: Xie Yongji <[email protected]> Cc: Gautam Dawar <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Eli Cohen <[email protected]> Cc: Parav Pandit <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-11-23NFSD: Fix reads with a non-zero offset that don't end on a page boundaryChuck Lever1-3/+4
This was found when virtual machines with nfs-mounted qcow2 disks failed to boot properly. Reported-by: Anders Blomdell <[email protected]> Suggested-by: Al Viro <[email protected]> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2142132 Fixes: bfbfb6182ad1 ("nfsd_splice_actor(): handle compound pages") Signed-off-by: Chuck Lever <[email protected]>
2022-11-23fscache: fix OOB Read in __fscache_acquire_volumeDavid Howells1-2/+5
The type of a->key[0] is char in fscache_volume_same(). If the length of cache volume key is greater than 127, the value of a->key[0] is less than 0. In this case, klen becomes much larger than 255 after type conversion, because the type of klen is size_t. As a result, memcmp() is read out of bounds. This causes a slab-out-of-bounds Read in __fscache_acquire_volume(), as reported by Syzbot. Fix this by changing the type of the stored key to "u8 *" rather than "char *" (it isn't a simple string anyway). Also put in a check that the volume name doesn't exceed NAME_MAX. BUG: KASAN: slab-out-of-bounds in memcmp+0x16f/0x1c0 lib/string.c:757 Read of size 8 at addr ffff888016f3aa90 by task syz-executor344/3613 Call Trace: memcmp+0x16f/0x1c0 lib/string.c:757 memcmp include/linux/fortify-string.h:420 [inline] fscache_volume_same fs/fscache/volume.c:133 [inline] fscache_hash_volume fs/fscache/volume.c:171 [inline] __fscache_acquire_volume+0x76c/0x1080 fs/fscache/volume.c:328 fscache_acquire_volume include/linux/fscache.h:204 [inline] v9fs_cache_session_get_cookie+0x143/0x240 fs/9p/cache.c:34 v9fs_session_init+0x1166/0x1810 fs/9p/v9fs.c:473 v9fs_mount+0xba/0xc90 fs/9p/vfs_super.c:126 legacy_get_tree+0x105/0x220 fs/fs_context.c:610 vfs_get_tree+0x89/0x2f0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x1326/0x1e20 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568 Fixes: 62ab63352350 ("fscache: Implement volume registration") Reported-by: [email protected] Signed-off-by: David Howells <[email protected]> Reviewed-by: Zhang Peng <[email protected]> Reviewed-by: Jingbo Xu <[email protected]> cc: Dominique Martinet <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] cc: [email protected] Link: https://lore.kernel.org/r/[email protected]/ # Zhang Peng's v1 fix Link: https://lore.kernel.org/r/[email protected]/ # Zhang Peng's v2 fix Link: https://lore.kernel.org/r/166869954095.3793579.8500020902371015443.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <[email protected]>
2022-11-23kernfs: fix all kernel-doc warnings and multiple typosRandy Dunlap6-48/+74
Fix kernel-doc warnings. Many of these are about a function's return value, so use the kernel-doc Return: format to fix those Use % prefix on numeric constant values. dir.c: fix typos/spellos file.c fix typo: s/taret/target/ Fix all of these kernel-doc warnings: dir.c:305: warning: missing initial short description on line: * kernfs_name_hash dir.c:137: warning: No description found for return value of 'kernfs_path_from_node_locked' dir.c:196: warning: No description found for return value of 'kernfs_name' dir.c:224: warning: No description found for return value of 'kernfs_path_from_node' dir.c:292: warning: No description found for return value of 'kernfs_get_parent' dir.c:312: warning: No description found for return value of 'kernfs_name_hash' dir.c:404: warning: No description found for return value of 'kernfs_unlink_sibling' dir.c:588: warning: No description found for return value of 'kernfs_node_from_dentry' dir.c:806: warning: No description found for return value of 'kernfs_find_ns' dir.c:879: warning: No description found for return value of 'kernfs_find_and_get_ns' dir.c:904: warning: No description found for return value of 'kernfs_walk_and_get_ns' dir.c:927: warning: No description found for return value of 'kernfs_create_root' dir.c:996: warning: No description found for return value of 'kernfs_root_to_node' dir.c:1016: warning: No description found for return value of 'kernfs_create_dir_ns' dir.c:1048: warning: No description found for return value of 'kernfs_create_empty_dir' dir.c:1306: warning: No description found for return value of 'kernfs_next_descendant_post' dir.c:1568: warning: No description found for return value of 'kernfs_remove_self' dir.c:1630: warning: No description found for return value of 'kernfs_remove_by_name_ns' dir.c:1667: warning: No description found for return value of 'kernfs_rename_ns' file.c:66: warning: No description found for return value of 'of_on' file.c:88: warning: No description found for return value of 'kernfs_deref_open_node_locked' file.c:1036: warning: No description found for return value of '__kernfs_create_file' inode.c:100: warning: No description found for return value of 'kernfs_setattr' mount.c:160: warning: No description found for return value of 'kernfs_root_from_sb' mount.c:198: warning: No description found for return value of 'kernfs_node_dentry' mount.c:302: warning: No description found for return value of 'kernfs_super_ns' mount.c:318: warning: No description found for return value of 'kernfs_get_tree' symlink.c:28: warning: No description found for return value of 'kernfs_create_link' Signed-off-by: Randy Dunlap <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Tejun Heo <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-11-23btrfs: sysfs: normalize the error handling branch in btrfs_init_sysfs()Zhen Lei1-2/+5
Although kset_unregister() can eventually remove all attribute files, explicitly rolling back with the matching function makes the code logic look clearer. CC: [email protected] # 5.4+ Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Zhen Lei <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-11-23btrfs: do not modify log tree while holding a leaf from fs tree lockedFilipe Manana1-4/+55
When logging an inode in full mode, or when logging xattrs or when logging the dir index items of a directory, we are modifying the log tree while holding a read lock on a leaf from the fs/subvolume tree. This can lead to a deadlock in rare circumstances, but it is a real possibility, and it was recently reported by syzbot with the following trace from lockdep: WARNING: possible circular locking dependency detected 6.1.0-rc5-next-20221116-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.1/16154 is trying to acquire lock: ffff88807e3084a0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0xa1/0xf30 fs/btrfs/delayed-inode.c:256 but task is already holding lock: ffff88807df33078 (btrfs-log-00){++++}-{3:3}, at: __btrfs_tree_lock+0x32/0x3d0 fs/btrfs/locking.c:197 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (btrfs-log-00){++++}-{3:3}: down_read_nested+0x9e/0x450 kernel/locking/rwsem.c:1634 __btrfs_tree_read_lock+0x32/0x350 fs/btrfs/locking.c:135 btrfs_tree_read_lock fs/btrfs/locking.c:141 [inline] btrfs_read_lock_root_node+0x82/0x3a0 fs/btrfs/locking.c:280 btrfs_search_slot_get_root fs/btrfs/ctree.c:1678 [inline] btrfs_search_slot+0x3ca/0x2c70 fs/btrfs/ctree.c:1998 btrfs_lookup_csum+0x116/0x3f0 fs/btrfs/file-item.c:209 btrfs_csum_file_blocks+0x40e/0x1370 fs/btrfs/file-item.c:1021 log_csums.isra.0+0x244/0x2d0 fs/btrfs/tree-log.c:4258 copy_items.isra.0+0xbfb/0xed0 fs/btrfs/tree-log.c:4403 copy_inode_items_to_log+0x13d6/0x1d90 fs/btrfs/tree-log.c:5873 btrfs_log_inode+0xb19/0x4680 fs/btrfs/tree-log.c:6495 btrfs_log_inode_parent+0x890/0x2a20 fs/btrfs/tree-log.c:6982 btrfs_log_dentry_safe+0x59/0x80 fs/btrfs/tree-log.c:7083 btrfs_sync_file+0xa41/0x13c0 fs/btrfs/file.c:1921 vfs_fsync_range+0x13e/0x230 fs/sync.c:188 generic_write_sync include/linux/fs.h:2856 [inline] iomap_dio_complete+0x73a/0x920 fs/iomap/direct-io.c:128 btrfs_direct_write fs/btrfs/file.c:1536 [inline] btrfs_do_write_iter+0xba2/0x1470 fs/btrfs/file.c:1668 call_write_iter include/linux/fs.h:2160 [inline] do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735 do_iter_write+0x182/0x700 fs/read_write.c:861 vfs_iter_write+0x74/0xa0 fs/read_write.c:902 iter_file_splice_write+0x745/0xc90 fs/splice.c:686 do_splice_from fs/splice.c:764 [inline] direct_splice_actor+0x114/0x180 fs/splice.c:931 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886 do_splice_direct+0x1ab/0x280 fs/splice.c:974 do_sendfile+0xb19/0x1270 fs/read_write.c:1255 __do_sys_sendfile64 fs/read_write.c:1323 [inline] __se_sys_sendfile64 fs/read_write.c:1309 [inline] __x64_sys_sendfile64+0x259/0x2c0 fs/read_write.c:1309 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #1 (btrfs-tree-00){++++}-{3:3}: __lock_release kernel/locking/lockdep.c:5382 [inline] lock_release+0x371/0x810 kernel/locking/lockdep.c:5688 up_write+0x2a/0x520 kernel/locking/rwsem.c:1614 btrfs_tree_unlock_rw fs/btrfs/locking.h:189 [inline] btrfs_unlock_up_safe+0x1e3/0x290 fs/btrfs/locking.c:238 search_leaf fs/btrfs/ctree.c:1832 [inline] btrfs_search_slot+0x265e/0x2c70 fs/btrfs/ctree.c:2074 btrfs_insert_empty_items+0xbd/0x1c0 fs/btrfs/ctree.c:4133 btrfs_insert_delayed_item+0x826/0xfa0 fs/btrfs/delayed-inode.c:746 btrfs_insert_delayed_items fs/btrfs/delayed-inode.c:824 [inline] __btrfs_commit_inode_delayed_items fs/btrfs/delayed-inode.c:1111 [inline] __btrfs_run_delayed_items+0x280/0x590 fs/btrfs/delayed-inode.c:1153 flush_space+0x147/0xe90 fs/btrfs/space-info.c:728 btrfs_async_reclaim_metadata_space+0x541/0xc10 fs/btrfs/space-info.c:1086 process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289 worker_thread+0x669/0x1090 kernel/workqueue.c:2436 kthread+0x2e8/0x3a0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308 -> #0 (&delayed_node->mutex){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:3097 [inline] check_prevs_add kernel/locking/lockdep.c:3216 [inline] validate_chain kernel/locking/lockdep.c:3831 [inline] __lock_acquire+0x2a43/0x56d0 kernel/locking/lockdep.c:5055 lock_acquire kernel/locking/lockdep.c:5668 [inline] lock_acquire+0x1e3/0x630 kernel/locking/lockdep.c:5633 __mutex_lock_common kernel/locking/mutex.c:603 [inline] __mutex_lock+0x12f/0x1360 kernel/locking/mutex.c:747 __btrfs_release_delayed_node.part.0+0xa1/0xf30 fs/btrfs/delayed-inode.c:256 __btrfs_release_delayed_node fs/btrfs/delayed-inode.c:251 [inline] btrfs_release_delayed_node fs/btrfs/delayed-inode.c:281 [inline] btrfs_remove_delayed_node+0x52/0x60 fs/btrfs/delayed-inode.c:1285 btrfs_evict_inode+0x511/0xf30 fs/btrfs/inode.c:5554 evict+0x2ed/0x6b0 fs/inode.c:664 dispose_list+0x117/0x1e0 fs/inode.c:697 prune_icache_sb+0xeb/0x150 fs/inode.c:896 super_cache_scan+0x391/0x590 fs/super.c:106 do_shrink_slab+0x464/0xce0 mm/vmscan.c:843 shrink_slab_memcg mm/vmscan.c:912 [inline] shrink_slab+0x388/0x660 mm/vmscan.c:991 shrink_node_memcgs mm/vmscan.c:6088 [inline] shrink_node+0x93d/0x1f30 mm/vmscan.c:6117 shrink_zones mm/vmscan.c:6355 [inline] do_try_to_free_pages+0x3b4/0x17a0 mm/vmscan.c:6417 try_to_free_mem_cgroup_pages+0x3a4/0xa70 mm/vmscan.c:6732 reclaim_high.constprop.0+0x182/0x230 mm/memcontrol.c:2393 mem_cgroup_handle_over_high+0x190/0x520 mm/memcontrol.c:2578 try_charge_memcg+0xe0c/0x12f0 mm/memcontrol.c:2816 try_charge mm/memcontrol.c:2827 [inline] charge_memcg+0x90/0x3b0 mm/memcontrol.c:6889 __mem_cgroup_charge+0x2b/0x90 mm/memcontrol.c:6910 mem_cgroup_charge include/linux/memcontrol.h:667 [inline] __filemap_add_folio+0x615/0xf80 mm/filemap.c:852 filemap_add_folio+0xaf/0x1e0 mm/filemap.c:934 __filemap_get_folio+0x389/0xd80 mm/filemap.c:1976 pagecache_get_page+0x2e/0x280 mm/folio-compat.c:104 find_or_create_page include/linux/pagemap.h:612 [inline] alloc_extent_buffer+0x2b9/0x1580 fs/btrfs/extent_io.c:4588 btrfs_init_new_buffer fs/btrfs/extent-tree.c:4869 [inline] btrfs_alloc_tree_block+0x2e1/0x1320 fs/btrfs/extent-tree.c:4988 __btrfs_cow_block+0x3b2/0x1420 fs/btrfs/ctree.c:440 btrfs_cow_block+0x2fa/0x950 fs/btrfs/ctree.c:595 btrfs_search_slot+0x11b0/0x2c70 fs/btrfs/ctree.c:2038 btrfs_update_root+0xdb/0x630 fs/btrfs/root-tree.c:137 update_log_root fs/btrfs/tree-log.c:2841 [inline] btrfs_sync_log+0xbfb/0x2870 fs/btrfs/tree-log.c:3064 btrfs_sync_file+0xdb9/0x13c0 fs/btrfs/file.c:1947 vfs_fsync_range+0x13e/0x230 fs/sync.c:188 generic_write_sync include/linux/fs.h:2856 [inline] iomap_dio_complete+0x73a/0x920 fs/iomap/direct-io.c:128 btrfs_direct_write fs/btrfs/file.c:1536 [inline] btrfs_do_write_iter+0xba2/0x1470 fs/btrfs/file.c:1668 call_write_iter include/linux/fs.h:2160 [inline] do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735 do_iter_write+0x182/0x700 fs/read_write.c:861 vfs_iter_write+0x74/0xa0 fs/read_write.c:902 iter_file_splice_write+0x745/0xc90 fs/splice.c:686 do_splice_from fs/splice.c:764 [inline] direct_splice_actor+0x114/0x180 fs/splice.c:931 splice_direct_to_actor+0x335/0x8a0 fs/splice.c:886 do_splice_direct+0x1ab/0x280 fs/splice.c:974 do_sendfile+0xb19/0x1270 fs/read_write.c:1255 __do_sys_sendfile64 fs/read_write.c:1323 [inline] __se_sys_sendfile64 fs/read_write.c:1309 [inline] __x64_sys_sendfile64+0x259/0x2c0 fs/read_write.c:1309 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Chain exists of: &delayed_node->mutex --> btrfs-tree-00 --> btrfs-log-00 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(btrfs-log-00); lock(btrfs-tree-00); lock(btrfs-log-00); lock(&delayed_node->mutex); Holding a read lock on a leaf from a fs/subvolume tree creates a nasty lock dependency when we are COWing extent buffers for the log tree and we have two tasks modifying the log tree, with each one in one of the following 2 scenarios: 1) Modifying the log tree triggers an extent buffer allocation while holding a write lock on a parent extent buffer from the log tree. Allocating the pages for an extent buffer, or the extent buffer struct, can trigger inode eviction and finally the inode eviction will trigger a release/remove of a delayed node, which requires taking the delayed node's mutex; 2) Allocating a metadata extent for a log tree can trigger the async reclaim thread and make us wait for it to release enough space and unblock our reservation ticket. The reclaim thread can start flushing delayed items, and that in turn results in the need to lock delayed node mutexes and in the need to write lock extent buffers of a subvolume tree - all this while holding a write lock on the parent extent buffer in the log tree. So one task in scenario 1) running in parallel with another task in scenario 2) could lead to a deadlock, one wanting to lock a delayed node mutex while having a read lock on a leaf from the subvolume, while the other is holding the delayed node's mutex and wants to write lock the same subvolume leaf for flushing delayed items. Fix this by cloning the leaf of the fs/subvolume tree, release/unlock the fs/subvolume leaf and use the clone leaf instead. Reported-by: [email protected] Link: https://lore.kernel.org/linux-btrfs/[email protected]/ CC: [email protected] # 6.0+ Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-11-23btrfs: use kvcalloc in btrfs_get_dev_zone_infoChristoph Hellwig1-3/+3
Otherwise the kernel memory allocator seems to be unhappy about failing order 6 allocations for the zones array, that cause 100% reproducible mount failures in my qemu setup: [26.078981] mount: page allocation failure: order:6, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null) [26.079741] CPU: 0 PID: 2965 Comm: mount Not tainted 6.1.0-rc5+ #185 [26.080181] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [26.080950] Call Trace: [26.081132] <TASK> [26.081291] dump_stack_lvl+0x56/0x6f [26.081554] warn_alloc+0x117/0x140 [26.081808] ? __alloc_pages_direct_compact+0x1b5/0x300 [26.082174] __alloc_pages_slowpath.constprop.0+0xd0e/0xde0 [26.082569] __alloc_pages+0x32a/0x340 [26.082836] __kmalloc_large_node+0x4d/0xa0 [26.083133] ? trace_kmalloc+0x29/0xd0 [26.083399] kmalloc_large+0x14/0x60 [26.083654] btrfs_get_dev_zone_info+0x1b9/0xc00 [26.083980] ? _raw_spin_unlock_irqrestore+0x28/0x50 [26.084328] btrfs_get_dev_zone_info_all_devices+0x54/0x80 [26.084708] open_ctree+0xed4/0x1654 [26.084974] btrfs_mount_root.cold+0x12/0xde [26.085288] ? lock_is_held_type+0xe2/0x140 [26.085603] legacy_get_tree+0x28/0x50 [26.085876] vfs_get_tree+0x1d/0xb0 [26.086139] vfs_kern_mount.part.0+0x6c/0xb0 [26.086456] btrfs_mount+0x118/0x3a0 [26.086728] ? lock_is_held_type+0xe2/0x140 [26.087043] legacy_get_tree+0x28/0x50 [26.087323] vfs_get_tree+0x1d/0xb0 [26.087587] path_mount+0x2ba/0xbe0 [26.087850] ? _raw_spin_unlock_irqrestore+0x38/0x50 [26.088217] __x64_sys_mount+0xfe/0x140 [26.088506] do_syscall_64+0x35/0x80 [26.088776] entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 5b316468983d ("btrfs: get zone information of zoned block devices") CC: [email protected] # 5.15+ Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2022-11-23fuse: Rearrange fuse_allow_current_process checksDave Marchevsky2-16/+21
This is a followup to a previous commit of mine [0], which added the allow_sys_admin_access && capable(CAP_SYS_ADMIN) check. This patch rearranges the order of checks in fuse_allow_current_process without changing functionality. Commit 9ccf47b26b73 ("fuse: Add module param for CAP_SYS_ADMIN access bypassing allow_other") added allow_sys_admin_access && capable(CAP_SYS_ADMIN) check to the beginning of the function, with the reasoning that allow_sys_admin_access should be an 'escape hatch' for users with CAP_SYS_ADMIN, allowing them to skip any subsequent checks. However, placing this new check first results in many capable() calls when allow_sys_admin_access is set, where another check would've also returned 1. This can be problematic when a BPF program is tracing capable() calls. At Meta we ran into such a scenario recently. On a host where allow_sys_admin_access is set but most of the FUSE access is from processes which would pass other checks - i.e. they don't need CAP_SYS_ADMIN 'escape hatch' - this results in an unnecessary capable() call for each fs op. We also have a daemon tracing capable() with BPF and doing some data collection, so tracing these extraneous capable() calls has the potential to regress performance for an application doing many FUSE ops. So rearrange the order of these checks such that CAP_SYS_ADMIN 'escape hatch' is checked last. Add a small helper, fuse_permissible_uidgid, to make the logic easier to understand. Previously, if allow_other is set on the fuse_conn, uid/git checking doesn't happen as current_in_userns result is returned. These semantics are maintained here: fuse_permissible_uidgid check only happens if allow_other is not set. Signed-off-by: Dave Marchevsky <[email protected]> Suggested-by: Andrii Nakryiko <[email protected]> Reviewed-by: Christian Brauner (Microsoft) <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-23fuse: allow non-extending parallel direct writes on the same fileDharmendra Singh1-3/+40
In general, as of now, in FUSE, direct writes on the same file are serialized over inode lock i.e we hold inode lock for the full duration of the write request. I could not find in fuse code and git history a comment which clearly explains why this exclusive lock is taken for direct writes. Following might be the reasons for acquiring an exclusive lock but not be limited to 1) Our guess is some USER space fuse implementations might be relying on this lock for serialization. 2) The lock protects against file read/write size races. 3) Ruling out any issues arising from partial write failures. This patch relaxes the exclusive lock for direct non-extending writes only. File size extending writes might not need the lock either, but we are not entirely sure if there is a risk to introduce any kind of regression. Furthermore, benchmarking with fio does not show a difference between patch versions that take on file size extension a) an exclusive lock and b) a shared lock. A possible example of an issue with i_size extending writes are write error cases. Some writes might succeed and others might fail for file system internal reasons - for example ENOSPACE. With parallel file size extending writes it _might_ be difficult to revert the action of the failing write, especially to restore the right i_size. With these changes, we allow non-extending parallel direct writes on the same file with the help of a flag called FOPEN_PARALLEL_DIRECT_WRITES. If this flag is set on the file (flag is passed from libfuse to fuse kernel as part of file open/create), we do not take exclusive lock anymore, but instead use a shared lock that allows non-extending writes to run in parallel. FUSE implementations which rely on this inode lock for serialization can continue to do so and serialized direct writes are still the default. Implementations that do not do write serialization need to be updated and need to set the FOPEN_PARALLEL_DIRECT_WRITES flag in their file open/create reply. On patch review there were concerns that network file systems (or vfs multiple mounts of the same file system) might have issues with parallel writes. We believe this is not the case, as this is just a local lock, which network file systems could not rely on anyway. I.e. this lock is just for local consistency. Signed-off-by: Dharmendra Singh <[email protected]> Signed-off-by: Bernd Schubert <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-23fuse: remove the unneeded result variableye xingchen1-4/+1
Return the value fuse_dev_release() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <[email protected]> Signed-off-by: ye xingchen <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-23fuse: port to vfs{g,u}id_t and associated helpersChristian Brauner1-1/+1
A while ago we introduced a dedicated vfs{g,u}id_t type in commit 1e5267cd0895 ("mnt_idmapping: add vfs{g,u}id_t"). We already switched over a good part of the VFS. Ultimately we will remove all legacy idmapped mount helpers that operate only on k{g,u}id_t in favor of the new type safe helpers that operate on vfs{g,u}id_t. Cc: Seth Forshee (Digital Ocean) <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Christian Brauner (Microsoft) <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-23fuse: Remove user_ns check for FUSE_DEV_IOC_CLONEJann Horn1-2/+1
Commit 8ed1f0e22f49e ("fs/fuse: fix ioctl type confusion") fixed a type confusion bug by adding an ->f_op comparison. Based on some off-list discussion back then, another check was added to compare the f_cred->user_ns. This is not for security reasons, but was based on the idea that a FUSE device FD should be using the UID/GID mappings of its f_cred->user_ns, and those translations are done using fc->user_ns, which matches the f_cred->user_ns of the initial FUSE device FD thanks to the check in fuse_fill_super(). See also commit 8cb08329b0809 ("fuse: Support fuse filesystems outside of init_user_ns"). But FUSE_DEV_IOC_CLONE is, at a higher level, a *cloning* operation that copies an existing context (with a weird API that involves first opening /dev/fuse, then tying the resulting new FUSE device FD to an existing FUSE instance). So if an application is already passing FUSE FDs across userns boundaries and dealing with the resulting ID mapping complications somehow, it doesn't make much sense to block this cloning operation. I've heard that this check is an obstacle for some folks, and I don't see a good reason to keep it, so remove it. Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2022-11-23fuse: always revalidate rename target dentryJiachen Zhang1-1/+1
The previous commit df8629af2934 ("fuse: always revalidate if exclusive create") ensures that the dentries are revalidated on O_EXCL creates. This commit complements it by also performing revalidation for rename target dentries. Otherwise, a rename target file that only exists in kernel dentry cache but not in the filesystem will result in EEXIST if RENAME_NOREPLACE flag is used. Signed-off-by: Jiachen Zhang <[email protected]> Signed-off-by: Zhang Tianci <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>