Blaster4385/linux-IllusionX

Author	SHA1	Message	Date
Mauricio Faria de Oliveira	45680967ee	ocfs2: ratelimit the 'max lookup times reached' notice Running stress-ng on ocfs2 completely fills the kernel log with 'max lookup times reached, filesystem may have nested directories.' Let's ratelimit this message as done with others in the code. Test-case: # mkfs.ocfs2 --mount local $DEV # mount $DEV $MNT # cd $MNT # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 # dmesg \| grep -c 'max lookup times reached' Before: # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 ... stress-ng: info: [11116] successful run completed in 3.03s # dmesg \| grep -c 'max lookup times reached' 967 After: # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 ... stress-ng: info: [739] successful run completed in 0.96s # dmesg \| grep -c 'max lookup times reached' 10 # dmesg [ 259.086086] ocfs2_check_if_ancestor: 1990 callbacks suppressed [ 259.086092] (stress-ng-dirde,740,1):ocfs2_check_if_ancestor:1091 max lookup times reached, filesystem may have nested directories, src inode: 18007, dest inode: 17940. ... Link: https://lkml.kernel.org/r/20201001224417.478263-1-mfo@canonical.com Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-15 12:13:37 -08:00
Tom Rix	a0823b5e44	fs/ocfs2/cluster/tcp.c: remove unneeded break A break is not needed if it is preceded by a goto Link: https://lkml.kernel.org/r/20201019175216.2329-1-trix@redhat.com Signed-off-by: Tom Rix <trix@redhat.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-15 12:13:37 -08:00
Alex Shi	4dad18f477	fs/ntfs: remove unused variable attr_len This variable isn't used anymore, remove it to skip W=1 warning: fs/ntfs/inode.c:2350:6: warning: variable `attr_len' set but not used [-Wunused-but-set-variable] Link: https://lkml.kernel.org/r/4194376f-898b-b602-81c3-210567712092@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-15 12:13:37 -08:00
Alex Shi	3f10c2fa40	fs/ntfs: remove unused varibles We actually don't use these varibles, so remove them to avoid gcc warning: fs/ntfs/file.c:326:14: warning: variable `base_ni' set but not used [-Wunused-but-set-variable] fs/ntfs/logfile.c:481:21: warning: variable `log_page_mask' set but not used [-Wunused-but-set-variable] Link: https://lkml.kernel.org/r/1604821092-54631-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Acked-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-15 12:13:37 -08:00
Linus Torvalds	31d00f6eb1	io_uring-5.10-2020-12-11 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/UMOwQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpmI3EAC22QohUftfWAJeks142J9wFKQgYfNZpZHs iJR07NTYUSghp9rpfGVNGY/ZonUoKds0VSbDZpRU8TTjglGf9GguzYE4iOMedtG3 MA9diNOnAAK0gwZrBx1uB882X4pf1qYGChMe9fXXJ4mQsHpG84q00GnJI0cPjkKn PaYRVugr8AxCzG4dHHL0ckQtsEHWaPwe7i5u4UkoOC3IsrVjRri6wYAa98xDuznv fqssoMp2cSOcVHsfypQAVZbg0heR3Zo2OuXE4uw6yjB7w5j8tuLIV9XKtqwiRn2n 57rdnGfY6vSfJb9EPm6fwBRnve2cI2Xbr30x/XWSIx+QcUOFOsBfdblaJikX8cSJ FG5c0AgBeRo1heWC4FRNIFStAubm0RPJAPqNm9p6T0DzO4hC3xTTdkl04B5bA4Ms 4tNWhk2fP8YRKhC0gRcw7zRHAl/qjuMjrUor5L68jxGlmFR7h+uMky0AsS4KZKNZ o1+HFmc/m+3ZBNzL7NoXkJzWFaKzZVf0iuEqJP3Xqy91hCA85pvh7DGAlQpZBppM C0eNeJ59rWvfSuUS9bLlLfoMO3Ab5BsE4IEji7K2lBNZsCXQpXHVv4xJxgKJRja9 t+SuBgYpNOfHrIPSY0LaJh93MNzoP22TP8qF8U5gWQ0Bus/mVKE9S1EXSUAvfHPm IyOfTB9M6g== =wjOp -----END PGP SIGNATURE----- Merge tag 'io_uring-5.10-2020-12-11' of git://git.kernel.dk/linux-block Pull io_uring fixes from Jens Axboe: "Two fixes in here, fixing issues introduced in this merge window" * tag 'io_uring-5.10-2020-12-11' of git://git.kernel.dk/linux-block: io_uring: fix file leak on error path of io ctx creation io_uring: fix mis-seting personality's creds	2020-12-12 09:45:01 -08:00
Linus Torvalds	782598ecea	zonefs fixes for 5.10-rc7 A single patch in this pull request to fix a BIO and page reference leak when writing sequential zone files. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCX9NY0gAKCRDdoc3SxdoY dga9AQCq+hAOyA1FeCkWwBG0gywIIVhPscNphe1bH576JoaIZAEA86DsGB9BED7H yaezfh5W1lr63ctxtSVdyo+x5172fgY= =Te7s -----END PGP SIGNATURE----- Merge tag 'zonefs-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs Pull zonefs fix from Damien Le Moal: "A single patch in this pull request to fix a BIO and page reference leak when writing sequential zone files" * tag 'zonefs-5.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs: zonefs: fix page reference and BIO leak	2020-12-11 14:22:42 -08:00
Miles Chen	40d6366e9d	proc: use untagged_addr() for pagemap_read addresses When we try to visit the pagemap of a tagged userspace pointer, we find that the start_vaddr is not correct because of the tag. To fix it, we should untag the userspace pointers in pagemap_read(). I tested with 5.10-rc4 and the issue remains. Explanation from Catalin in [1]: "Arguably, that's a user-space bug since tagged file offsets were never supported. In this case it's not even a tag at bit 56 as per the arm64 tagged address ABI but rather down to bit 47. You could say that the problem is caused by the C library (malloc()) or whoever created the tagged vaddr and passed it to this function. It's not a kernel regression as we've never supported it. Now, pagemap is a special case where the offset is usually not generated as a classic file offset but rather derived by shifting a user virtual address. I guess we can make a concession for pagemap (only) and allow such offset with the tag at bit (56 - PAGE_SHIFT + 3)" My test code is based on [2]: A userspace pointer which has been tagged by 0xb4: 0xb400007662f541c8 userspace program: uint64 OsLayer::VirtualToPhysical(void vaddr) { uint64 frame, paddr, pfnmask, pagemask; int pagesize = sysconf(_SC_PAGESIZE); off64_t off = ((uintptr_t)vaddr) / pagesize 8; // off = 0xb400007662f541c8 / pagesize * 8 = 0x5a00003b317aa0 int fd = open(kPagemapPath, O_RDONLY); ... if (lseek64(fd, off, SEEK_SET) != off \|\| read(fd, &frame, 8) != 8) { int err = errno; string errtxt = ErrorString(err); if (fd >= 0) close(fd); return 0; } ... } kernel fs/proc/task_mmu.c: static ssize_t pagemap_read(struct file file, char __user buf, size_t count, loff_t ppos) { ... src = ppos; svpfn = src / PM_ENTRY_BYTES; // svpfn == 0xb400007662f54 start_vaddr = svpfn << PAGE_SHIFT; // start_vaddr == 0xb400007662f54000 end_vaddr = mm->task_size; /* watch out for wraparound */ // svpfn == 0xb400007662f54 // (mm->task_size >> PAGE) == 0x8000000 if (svpfn > mm->task_size >> PAGE_SHIFT) // the condition is true because of the tag 0xb4 start_vaddr = end_vaddr; ret = 0; while (count && (start_vaddr < end_vaddr)) { // we cannot visit correct entry because start_vaddr is set to end_vaddr int len; unsigned long end; ... } ... } [1] https://lore.kernel.org/patchwork/patch/1343258/ [2] https://github.com/stressapptest/stressapptest/blob/master/src/os.cc#L158 Link: https://lkml.kernel.org/r/20201204024347.8295-1-miles.chen@mediatek.com Signed-off-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Will Deacon <will@kernel.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com> Cc: <stable@vger.kernel.org> [5.4-] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-11 14:02:14 -08:00
Linus Torvalds	6840a3dcc2	More NFS Client Bugfixes for Linux 5.10 - Bugfixes: - Fix array overflow when flexfiles mirroring is enabled - Fix rpcrdma_inline_fixup() crash with new LISTXATTRS - Fix 5 second delay when doing inter-server copy - Disable READ_PLUS by default -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAl/Sl70ACgkQ18tUv7Cl QOuGdw/9GMmrLF26brO3q3lSTD9xdljJGQi0q1DEF2amNUL9cPgxB0xZB9lyUPqs AbpLxsCgfJ+rN3sbOnDdz8Ae/wiWS1PkeJWw4cu0/kIIjyPD2Uu4jWuIwxpFvP1v lFfBHeEdsNgqS5t9t9cF9u30PjPKt9SHG/64/8GY92zlrKDRUaJUyvW1zUKo4ne4 m3JxZ1rCxNkBYV+odrQIYE9gqFI6OBWpVsUeIXjBWWDZ/+zK9rGiO5fxZ8oe+yT2 cmrR/vOznoPgUGaH380HrCbXcIKwim2mn76t1MH9fScZQc1ocSWckrZRRdOVWlJ+ 228ImDPACLFaTqpgi8ZhFmuD3Mwbz4AtPLce52lvZLuMAPVMoFhR8xrm/VpJFSim pNjwsl/j9qlGSAe5XdGW+X/DzmHPVd6uhfzOFAd7ZErt7rU0gjyJTWzrmayoB3N1 GF5g4LaK1dYYj7SG8d8DrEV2CGyYle2UP4aDm8qJ/ZLQhd10RHeVt4BHfSKwY05l 03Gim7wlH/ercYfgU6wrgmEG2SWjuFJr+j1v//aqcmJVhA7FvTKygGpGyNK16pNL 9dZ81wUHKHzVt5C/ruG+3tARvnobvmB4ngQAQmtu0aXe9c+Kr2XlaXUAusHPK+Me pVqtmaKBgddUtJwFms8JF6dePiSBRQgpwyvluqNi1D7ntRhS+Ik= =EXrS -----END PGP SIGNATURE----- Merge tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client fixes from Anna Schumaker: "Here are a handful more bugfixes for 5.10. Unfortunately, we found some problems with the new READ_PLUS operation that aren't easy to fix. We've decided to disable this codepath through a Kconfig option for now, but a series of patches going into 5.11 will clean up the code and fix the issues at the same time. This seemed like the best way to go about it. Summary: - Fix array overflow when flexfiles mirroring is enabled - Fix rpcrdma_inline_fixup() crash with new LISTXATTRS - Fix 5 second delay when doing inter-server copy - Disable READ_PLUS by default" * tag 'nfs-for-5.10-3' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: Disable READ_PLUS by default NFSv4.2: Fix 5 seconds delay when doing inter server copy NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation pNFS/flexfiles: Fix array overflow when flexfiles mirroring is enabled	2020-12-10 15:36:09 -08:00
Anna Schumaker	21e31401fc	NFS: Disable READ_PLUS by default We've been seeing failures with xfstests generic/091 and generic/263 when using READ_PLUS. I've made some progress on these issues, and the tests fail later on but still don't pass. Let's disable READ_PLUS by default until we can work out what is going on. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-12-10 16:48:03 -05:00
Dai Ngo	fe8eb820e3	NFSv4.2: Fix 5 seconds delay when doing inter server copy Since commit `b4868b44c5` ("NFSv4: Wait for stateid updates after CLOSE/OPEN_DOWNGRADE"), every inter server copy operation suffers 5 seconds delay regardless of the size of the copy. The delay is from nfs_set_open_stateid_locked when the check by nfs_stateid_is_sequential fails because the seqid in both nfs4_state and nfs4_stateid are 0. Fix __nfs42_ssc_open to delay setting of NFS_OPEN_STATE in nfs4_state, until after the call to update_open_stateid, to indicate this is the 1st open. This fix is part of a 2 patches, the other patch is the fix in the source server to return the stateid for COPY_NOTIFY request with seqid 1 instead of 0. Fixes: `ce0887ac96` ("NFSD add nfs4 inter ssc to nfsd4_copy") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-12-10 16:48:03 -05:00
Chuck Lever	1c87b85162	NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS operation By switching to an XFS-backed export, I am able to reproduce the ibcomp worker crash on my client with xfstests generic/013. For the failing LISTXATTRS operation, xdr_inline_pages() is called with page_len=12 and buflen=128. - When ->send_request() is called, rpcrdma_marshal_req() does not set up a Reply chunk because buflen is smaller than the inline threshold. Thus rpcrdma_convert_iovs() does not get invoked at all and the transport's XDRBUF_SPARSE_PAGES logic is not invoked on the receive buffer. - During reply processing, rpcrdma_inline_fixup() tries to copy received data into rq_rcv_buf->pages because page_len is positive. But there are no receive pages because rpcrdma_marshal_req() never allocated them. The result is that the ibcomp worker faults and dies. Sometimes that causes a visible crash, and sometimes it results in a transport hang without other symptoms. RPC/RDMA's XDRBUF_SPARSE_PAGES support is not entirely correct, and should eventually be fixed or replaced. However, my preference is that upper-layer operations should explicitly allocate their receive buffers (using GFP_KERNEL) when possible, rather than relying on XDRBUF_SPARSE_PAGES. Reported-by: Olga kornievskaia <kolga@netapp.com> Suggested-by: Olga kornievskaia <kolga@netapp.com> Fixes: `c10a75145f` ("NFSv4.2: add the extended attribute proc functions.") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Olga kornievskaia <kolga@netapp.com> Reviewed-by: Frank van der Linden <fllinden@amazon.com> Tested-by: Olga kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-12-10 16:48:03 -05:00
Damien Le Moal	6bea0225a4	zonefs: fix page reference and BIO leak In zonefs_file_dio_append(), the pages obtained using bio_iov_iter_get_pages() are not released on completion of the REQ_OP_APPEND BIO, nor when bio_iov_iter_get_pages() fails. Furthermore, a call to bio_put() is missing when bio_iov_iter_get_pages() fails. Fix these resource leaks by adding BIO resource release code (bio_put()i and bio_release_pages()) at the end of the function after the BIO execution and add a jump to this resource cleanup code in case of bio_iov_iter_get_pages() failure. While at it, also fix the call to task_io_account_write() to be passed the correct BIO size instead of bio_iov_iter_get_pages() return value. Reported-by: Christoph Hellwig <hch@lst.de> Fixes: `02ef12a663` ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-12-10 15:14:19 +09:00
David Howells	4cb6829647	afs: Fix memory leak when mounting with multiple source parameters There's a memory leak in afs_parse_source() whereby multiple source= parameters overwrite fc->source in the fs_context struct without freeing the previously recorded source. Fix this by only permitting a single source parameter and rejecting with an error all subsequent ones. This was caught by syzbot with the kernel memory leak detector, showing something like the following trace: unreferenced object 0xffff888114375440 (size 32): comm "repro", pid 5168, jiffies 4294923723 (age 569.948s) backtrace: slab_post_alloc_hook+0x42/0x79 __kmalloc_track_caller+0x125/0x16a kmemdup_nul+0x24/0x3c vfs_parse_fs_string+0x5a/0xa1 generic_parse_monolithic+0x9d/0xc5 do_new_mount+0x10d/0x15a do_mount+0x5f/0x8e __do_sys_mount+0xff/0x127 do_syscall_64+0x2d/0x3a entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `13fcc68370` ("afs: Add fs_context support") Reported-by: syzbot+86dc6632faaca40133ab@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-08 15:59:25 -08:00
Linus Torvalds	7d8761ba27	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull seq_file fix from Al Viro: "This fixes a regression introduced in this cycle wrt iov_iter based variant for reading a seq_file" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix return values of seq_read_iter()	2020-12-08 12:20:34 -08:00
Hillf Danton	f26c08b444	io_uring: fix file leak on error path of io ctx creation Put file as part of error handling when setting up io ctx to fix memory leaks like the following one. BUG: memory leak unreferenced object 0xffff888101ea2200 (size 256): comm "syz-executor355", pid 8470, jiffies 4294953658 (age 32.400s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 20 59 03 01 81 88 ff ff 80 87 a8 10 81 88 ff ff Y.............. backtrace: [<000000002e0a7c5f>] kmem_cache_zalloc include/linux/slab.h:654 [inline] [<000000002e0a7c5f>] __alloc_file+0x1f/0x130 fs/file_table.c:101 [<000000001a55b73a>] alloc_empty_file+0x69/0x120 fs/file_table.c:151 [<00000000fb22349e>] alloc_file+0x33/0x1b0 fs/file_table.c:193 [<000000006e1465bb>] alloc_file_pseudo+0xb2/0x140 fs/file_table.c:233 [<000000007118092a>] anon_inode_getfile fs/anon_inodes.c:91 [inline] [<000000007118092a>] anon_inode_getfile+0xaa/0x120 fs/anon_inodes.c:74 [<000000002ae99012>] io_uring_get_fd fs/io_uring.c:9198 [inline] [<000000002ae99012>] io_uring_create fs/io_uring.c:9377 [inline] [<000000002ae99012>] io_uring_setup+0x1125/0x1630 fs/io_uring.c:9411 [<000000008280baad>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 [<00000000685d8cf0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported-by: syzbot+71c4697e27c99fddcf17@syzkaller.appspotmail.com Fixes: `0f2122045b` ("io_uring: don't rely on weak ->files references") Cc: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-12-08 08:54:26 -07:00
Pavel Begunkov	e8c954df23	io_uring: fix mis-seting personality's creds After io_identity_cow() copies an work.identity it wants to copy creds to the new just allocated id, not the old one. Otherwise it's akin to req->work.identity->creds = req->work.identity->creds. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-12-07 08:43:44 -07:00
Menglong Dong	2bf509d96d	coredump: fix core_pattern parse error 'format_corename()' will splite 'core_pattern' on spaces when it is in pipe mode, and take helper_argv[0] as the path to usermode executable. It works fine in most cases. However, if there is a space between '\|' and '/file/path', such as '\| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will be parsed as '', and users will get a 'Core dump to \| disabled'. It is not friendly to users, as the pattern above was valid previously. Fix this by ignoring the spaces between '\|' and '/file/path'. Fixes: `315c69261d` ("coredump: split pipe command whitespace before expanding template") Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Wise <pabs3@bonedaddy.net> Cc: Jakub Wilk <jwilk@jwilk.net> [https://bugs.debian.org/924398] Cc: Neil Horman <nhorman@tuxdriver.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/5fb62870.1c69fb81.8ef5d.af76@mx.google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-12-06 10:19:07 -08:00
Linus Torvalds	619ca2664c	io_uring-5.10-2020-12-05 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/L/o4QHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpifFD/9tqDSwqcRIKRhSHqsD6OJRagZ0UB37cPvH ZXmtHblK9BkeXc83OPy+NmlcPq1mpDmbMgy0rls8UOGnZs0RINN9En5hHGOeRVD4 Ma6Abh9ypzXhNKqY4HO+lcsN+elGqoaYNSnu/Nbcib7Z5yYzUSNDnpIeTQy7Onbw dTGNTJalZYqrQ+6ytglfVeQuXSSKYPJfjFdcYd1It6Ks0aSKDgcQxJZgPeuF6xE+ mxEWADs+zpZG8UqZYyLnS6+9dxuJwhx+Cd47Ntky0VdaWmxCogQiI3X/Q3vspnVq gz9VqnUIA3Qgabm0UJkyjCi87W2KNDf4r5OOaQw4dLc0kDhIVS6ii67iGVs9Dl7F xCZMCE4pkEvs0pPYCyITUUNPh/fBONHgDHAWxxhztkE8ldvsAUmqNyV/LWMMvtSl SUQli6OZMVyyUX9sroHOWKhi8rQYt1UZa0tm+UZE/+YrZLfp6qBCZHFC6hdNfQoS k7PiuxecsH4i/OweJMPtggChJ7VWHZUorTljfHnl14oDuFuVD22AJvc3q+OSStI1 Ua5N27b2/pWXnf9P3dPGzikprOU/vXPQnlifmBxXNj+/TD29+GPUHhV2+kibc5/F gMbrhX3TX4tstHIGxUCoeB8XepiBqOmn1eLkWNQQGi8RbqCKfE4j/5o07HgxST2P Ns2GaEKb4A== =CPrM -----END PGP SIGNATURE----- Merge tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block Pull io_uring fix from Jens Axboe: "Just a small fix this time, for an issue with 32-bit compat apps and buffer selection with recvmsg" * tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block: io_uring: fix recvmsg setup with compat buf-select	2020-12-05 14:39:59 -08:00
Linus Torvalds	d4e904198c	Three smb3 fixes, two for stable -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl/K3U0ACgkQiiy9cAdy T1Gewwv7Bjtq+c4tvc0LZj04oCfgsmcdu9s82Xff+e81lJlPyK+04zrp1kPiVQ8O iV5+bQr96BPgA96f2Et/OdnO4ei4T1vp9yTfXkTHQCWHQhd57N98TNudtJ+JNKCv ziaY+X6BFBBFnbCiPQxLZjvXsXenS3PFTq+wJB6ot4Qsj7z3kk88nXrvDDP7kiFL 3WqZzUWKPJaRHjJQIrUch9M77P1cGk1duKT8Ydj/XKQujocWFWYXEWVYMGKcFFpK cXBEbN6vBUTogn6fA8lQKy3KTarRGT8qpl8c91UCZ61sH8t6RlHQ53rtPrLPG/9k KH8xyN+Lu4+w9LDni5HqsIHAqi0+dnPvKRZWIkxMI/ZbZcV4Mwoi4kMZ/H85psp6 wEDgA6czaQSS9c3jZ0lvNgOHvpKkz1lz8sXwUuTYVehMaQrTcxNdEgh8TdlfZ9P6 St404ZXSsthXChac2N22G9JX4XcM/wocSWq/sfpoUvr9eKJMjFMdB7rP7ORPQofb S/u8v5pD =3b+C -----END PGP SIGNATURE----- Merge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Three smb3 fixes (two for stable) fixing - a null pointer issue in a DFS error path - a problem with excessive padding when mounted with "idsfromsid" causing owner fields to get corrupted - a more recent problem with compounded reparse point query found in testing to the Linux kernel server" * tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: refactor create_sd_buf() and and avoid corrupting the buffer cifs: add NULL check for ses->tcon_ipc smb3: set COMPOUND_FID to FileID field of subsequent compound request	2020-12-05 11:09:07 -08:00
Ronnie Sahlberg	ea64370bca	cifs: refactor create_sd_buf() and and avoid corrupting the buffer When mounting with "idsfromsid" mount option, Azure corrupted the owner SIDs due to excessive padding caused by placing the owner fields at the end of the security descriptor on create. Placing owners at the front of the security descriptor (rather than the end) is also safer, as the number of ACEs (that follow it) are variable. Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Suggested-by: Rohith Surabattula <rohiths@microsoft.com> CC: Stable <stable@vger.kernel.org> # v5.8 Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-03 17:12:14 -06:00
Aurelien Aptel	59463eb888	cifs: add NULL check for ses->tcon_ipc In some scenarios (DFS and BAD_NETWORK_NAME) set_root_set() can be called with a NULL ses->tcon_ipc. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-03 17:06:03 -06:00
Namjae Jeon	7963178485	smb3: set COMPOUND_FID to FileID field of subsequent compound request For an operation compounded with an SMB2 CREATE request, client must set COMPOUND_FID(0xFFFFFFFFFFFFFFFF) to FileID field of smb2 ioctl. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Fixes: `2e4564b31b` ("smb3: add support stat of WSL reparse points for special file types") Reviewed-by: Aurelien Aptel <aaptel@suse.com> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-12-03 17:02:52 -06:00
Linus Torvalds	c82a505c00	Restore splice functionality for 9p -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE/IPbcYBuWt0zoYhOq06b7GqY5nAFAl/IvNUACgkQq06b7GqY 5nBRAA//UQJ3ioEvTlIpdHR6wKeH9XV2bucqjbAO2PRU0r8qEBC6wRr8IDBv4J4D xUyQ+teWjBLoKekdhoXsz3xXWtwqPNchVXClFOtjsUqYn2F71pUYNhRYJL9Xy+yy pumUxG3vJpJ4QTWkuSeB7/Ha561waWHILIeo1knT6z0Q49jmYI1J9Bl84q7RiC/7 wH+bgyE0RSORoIfkfapIvgwdjamUsnzMVhSVKNOol3gwkjFJGfOtk16VVSKq/RlK 5uOnjmWiVAkfPdDOA93mu6KE3FA4FB/sQaESwBR2fawsd646WooG5hn8fhMa0kHP y6AZo6fdYglcOD/OE1S0GpcSOocAdglfcU51kfxX8Z7BwfAbl7WCPCS2Rl75PYFf PhTzidD+vQsroGEaCF2+w90yIZBGgLz0Lh6az3zmaNGo7lcrRyP6GzQMmKILp2O1 MuPWNFRzt5ZAmyq7roVVclMzXnXEQM2v5YS3ib+wuUpQHWwGAUzdV6u9uFsp6vXx QSnsD4Y9wwMSI03QkkzW7TaH1laImRt820emivo3rOoWk1f7sCG5r3WlNPRIh1cN 9n6oCCOvFgXpTmiE0btqSKeMMxCqjtPHGZ5dQTTnftf0euNXGIZDPMSL45ANDXh+ 5Ezhjggqa91ohUSBRJgwc9MGsZy0GxOneDWPMpXAIDe9OlRkSC0= =zCUA -----END PGP SIGNATURE----- Merge tag '9p-for-5.10-rc7' of git://github.com/martinetd/linux Pull 9p fixes from Dominique Martinet: "Restore splice functionality for 9p" * tag '9p-for-5.10-rc7' of git://github.com/martinetd/linux: fs: 9p: add generic splice_write file operation fs: 9p: add generic splice_read file operations	2020-12-03 11:47:13 -08:00
Linus Torvalds	34816d20f1	Various gfs2 fixes -----BEGIN PGP SIGNATURE----- iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAl/H/pUUHGFncnVlbmJh QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqxhRAAuTKHz1jMYHNAWm1NVVRiGVENhjlV g5kRcRmf8m/VAWmI+HM0J34CbHpWc1IgJnjID/vZ/VB8QQPmoDoEFrD/ple9V6A4 xQJzQcG9uvLir5h43R2SWR6OSSijtEDjvpQpRQSK3B2e1otRV+W7mLjPBkg410rY l9Cj01QhCwCv1dVO7VknSEB1Jpkhe0Vvr1z4DropeU5JEBSADTl5deTkFWbinDo1 9nygDw01AdWXhR1FNJTAy8XJYlcM3dx6xDuHD9xJH9HciWsocOr+EqtPfJEEsCSf X6CrWY3mYZBr1kJFcvp2HD5rKYhw0xC0gZiHEnvmZx5XKTKoDKq2+2s2wQOwWk3B QIvEfLoAzV6mKBHRC8mCDw2bImj8FDrvBkmmJnlyL6nJDrriBXimFAJPX7MryKTJ jdvm0JFRyY0ja3LxLxBVwiKnKiw6yY6fPjpMWzi042P/PhshRSvazxgUf+pc69CB Czo3W9MnNGWzP73ALIsuVxG81RW48CvAJdXOmoWIqjEeqnC0YZ3vgwInQcqBzton fCI38XN6CG1UbwaVxoZGoPS5FIfBbrTL5v62e3U8JIv3J4ol5fWCEruUhImsByGI w3UkiuNqWjOab/774m9ZS0eBXCEn6LAizvZSgf3pt+8VRoN20RxF3k1Jhvm7g7I3 IbOHK9kddrycH9Q= =EHzo -----END PGP SIGNATURE----- Merge tag 'gfs2-v5.10-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: "Various gfs2 fixes" * tag 'gfs2-v5.10-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix deadlock between gfs2_{create_inode,inode_lookup} and delete_work_func gfs2: Upgrade shared glocks for atime updates gfs2: Don't freeze the file system during unmount gfs2: check for empty rgrp tree in gfs2_ri_update gfs2: set lockdep subclass for iopen glocks gfs2: Fix deadlock dumping resource group glocks	2020-12-02 17:25:23 -08:00
Dominique Martinet	960f4f8a4e	fs: 9p: add generic splice_write file operation The default splice operations got removed recently, add it back to 9p with iter_file_splice_write like many other filesystems do. Link: http://lkml.kernel.org/r/1606837496-21717-1-git-send-email-asmadeus@codewreck.org Fixes: `36e2c7421f` ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2020-12-01 21:40:47 +01:00
Toke Høiland-Jørgensen	cf03f316ad	fs: 9p: add generic splice_read file operations The v9fs file operations were missing the splice_read operations, which breaks sendfile() of files on such a filesystem. I discovered this while trying to load an eBPF program using iproute2 inside a 'virtme' environment which uses 9pfs for the virtual file system. iproute2 relies on sendfile() with an AF_ALG socket to hash files, which was erroring out in the virtual environment. Since generic_file_splice_read() seems to just implement splice_read in terms of the read_iter operation, I simply added the generic implementation to the file operations, which fixed the error I was seeing. A quick grep indicates that this is what most other file systems do as well. Link: http://lkml.kernel.org/r/20201201135409.55510-1-toke@redhat.com Fixes: `36e2c7421f` ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>	2020-12-01 17:53:49 +01:00
Andreas Gruenbacher	dd0ecf5441	gfs2: Fix deadlock between gfs2_{create_inode,inode_lookup} and delete_work_func In gfs2_create_inode and gfs2_inode_lookup, make sure to cancel any pending delete work before taking the inode glock. Otherwise, gfs2_cancel_delete_work may block waiting for delete_work_func to complete, and delete_work_func may block trying to acquire the inode glock in gfs2_inode_lookup. Reported-by: Alexander Aring <aahringo@redhat.com> Fixes: `a0e3cc65fa` ("gfs2: Turn gl_delete into a delayed work") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-12-01 00:21:10 +01:00
Paulo Alcantara	212253367d	cifs: fix potential use-after-free in cifs_echo_request() This patch fixes a potential use-after-free bug in cifs_echo_request(). For instance, thread 1 -------- cifs_demultiplex_thread() clean_demultiplex_info() kfree(server) thread 2 (workqueue) -------- apic_timer_interrupt() smp_apic_timer_interrupt() irq_exit() __do_softirq() run_timer_softirq() call_timer_fn() cifs_echo_request() <- use-after-free in server ptr Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-11-30 15:23:45 -06:00
Paulo Alcantara	6988a619f5	cifs: allow syscalls to be restarted in __smb_send_rqst() A customer has reported that several files in their multi-threaded app were left with size of 0 because most of the read(2) calls returned -EINTR and they assumed no bytes were read. Obviously, they could have fixed it by simply retrying on -EINTR. We noticed that most of the -EINTR on read(2) were due to real-time signals sent by glibc to process wide credential changes (SIGRT_1), and its signal handler had been established with SA_RESTART, in which case those calls could have been automatically restarted by the kernel. Let the kernel decide to whether or not restart the syscalls when there is a signal pending in __smb_send_rqst() by returning -ERESTARTSYS. If it can't, it will return -EINTR anyway. Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz> CC: Stable <stable@vger.kernel.org> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>	2020-11-30 15:23:31 -06:00
Pavel Begunkov	2d280bc893	io_uring: fix recvmsg setup with compat buf-select __io_compat_recvmsg_copy_hdr() with REQ_F_BUFFER_SELECT reads out iov len but never assigns it to iov/fast_iov, leaving sr->len with garbage. Hopefully, following io_buffer_select() truncates it to the selected buffer size, but the value is still may be under what was specified. Cc: <stable@vger.kernel.org> # 5.7 Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-11-30 11:12:03 -07:00
Trond Myklebust	63e2fffa59	pNFS/flexfiles: Fix array overflow when flexfiles mirroring is enabled If the flexfiles mirroring is enabled, then the read code expects to be able to set pgio->pg_mirror_idx to point to the data server that is being used for this particular read. However it does not change the pg_mirror_count because we only need to send a single read. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2020-11-30 10:52:22 -05:00
Linus Torvalds	1214917e00	More EFI fixes for v5.10-rc: - revert efivarfs kmemleak fix again - it was a false positive; - make CONFIG_EFI_EARLYCON depend on CONFIG_EFI explicitly so it does not pull in other dependencies unnecessarily if CONFIG_EFI is not set - defer attempts to load SSDT overrides from EFI vars until after the efivar layer is up. -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAl+/i/UACgkQw08iOZLZ jyRb+Qv/RVQSvZW+u6MrYPZthmVxnNQ4pFHKAjibD3h9isbrWdq6AETtNghUGWQr nr1WeOj4Qa2aDe4z63Sra3QNsdLnSn0FsHuJPD8rozMd4N4jiChtpLukdShibXXz 25yfs7KpNyqwj3QnFd2LpJBXGqzdoKFrzWbnWSnFEMrSfkqptBhogslhVxzal8Uz 4hUyGhe/iBfgU720uoVCmofPpYxqV/cndEmsnA8rZnW5yTPXIn6f9c4KUOcFxgLS LchxZYRd9GZoQ4Yt40ih9JX1ILZNhrhXh96cfNuUiVwnf9Lg7xSOGIX3+e3PR9Lz 1VI4UuA7HM8qDZx+N5iiBRrBIgtdtMgKLEip3n+x84/7P/p26HXe3OJCPBWh0Q0q aWCcV8qiwju6VYK6dv4Gz+2/OYiSKXrXZCx57dnCr7tv5srYenwsvCCYa9wLTpMW 16/DRmkHcfbAdfS38Bhs3qY/zK+He7XE/sPJao/NuVwoJ3Nu0dA4LR7JFG7TVjKm bepO3A8W =gvBz -----END PGP SIGNATURE----- Merge tag 'efi-urgent-for-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Borislav Petkov: "More EFI fixes forwarded from Ard Biesheuvel: - revert efivarfs kmemleak fix again - it was a false positive - make CONFIG_EFI_EARLYCON depend on CONFIG_EFI explicitly so it does not pull in other dependencies unnecessarily if CONFIG_EFI is not set - defer attempts to load SSDT overrides from EFI vars until after the efivar layer is up" * tag 'efi-urgent-for-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: EFI_EARLYCON should depend on EFI efivarfs: revert "fix memory leak in efivarfs_create()" efi/efivars: Set generic ops before loading SSDT	2020-11-29 10:18:53 -08:00
Linus Torvalds	9223e74f99	io_uring-5.10-2020-11-27 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl/BZBwQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpku+EACqrIZ6wuA/mIWbKs9wWf280hRbIQL1q7zy Ejj2A6DfclZz8XuUUfrRQ2diDDPBNnfPVRfiIzAi/AO7f7jDn91+2ZYfoJJWLZbq kCDHScPI6+OjDho5r+3vocxdoXZfuYxBlKsIzdiLl32mGEWbYjd+rmlAVJ74FPYz mFddmQCZoTBuEArpW6h8bKoPLgoC4zzZ2MgcjjpHT7Me/ai8t7OoA+FBpUjRcu+q Bt/ZfjIiWOzss9+psk5BSrHo5yXY51TWiiuV8hVM3RwVsL2dabG4WPgaLzum3H9p UZ4NAvXdRX9gWfF85mo7PEo1/0REmRy4BOJQ8qtBnvq6eepttAXHBx0SFDfpcLzG oAB4HaxuixAdsutnIynBXad3MskhNtFzbz/4UqOcnKbpT5Zxi65YiBKAsNwXjIdU bc3rZ3hyP4FifzNFa86wFLQ9w9gKV8mEogAz122lxL44AyguZzwN1I3H/YSEt2iX tKqHNWxnrb9MP3ycAuMvwjF3aNruns3QVv5fZQ9CrAUnAVF6Q8o/E2aTt4rmiJUW kQAvQJQj6MhBi/GXJk6YmTTDYgEMZ5JxFV3IVW17YZiMlpvIFML4X2nKo12qhwnq 8eZ0qJHFRyycW9eBY3qznw8S5Z7Nh8r91MjJs5sullHg99xKJbvR+IoswImfJMme Gk857imwGw== =HJnn -----END PGP SIGNATURE----- Merge tag 'io_uring-5.10-2020-11-27' of git://git.kernel.dk/linux-block Pull io_uring fixes from Jens Axboe: - Out of bounds fix for the cq size cap from earlier this release (Joseph) - iov_iter type check fix (Pavel) - Files grab + cancelation fix (Pavel) * tag 'io_uring-5.10-2020-11-27' of git://git.kernel.dk/linux-block: io_uring: fix files grab/cancel race io_uring: fix ITER_BVEC check io_uring: fix shift-out-of-bounds when round up cq size	2020-11-27 12:56:04 -08:00
Linus Torvalds	a17a3ca55e	for-5.10-rc5-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl/BGBQACgkQxWXV+ddt WDtRYQ/+IUGjJ4l6MyL3PwLgTVabKsSpm2R3y3M/0tVJ0FIhDXYbkpjB2CkpIdcH jUvHnRL4H59hwG6rwlXU2oC298FNbbrLGIU+c9DR50RuyQCDGnT02XvxwfDIEFzp WLNZ/CAhRkm6boj//70lV26BpeXT59KMYwNixfCNTXq9Ir0qYHHCGg4cEBQLS++2 JUU8XVTLURIYiFOLbwmABI7V43OgDhdORIr+qnR8xjCUyhusZsjVVbvIdW3BDi/S wK7NJsfuqgsF0zD9URJjpwTFiJL9SvBLWR8JnM9NiLW3ZbkGBL+efL6mdWuH7534 gruJRS2zYPMO2/Kpjy/31CWLap3PUSD3i8cKF+uo3liojWuSUhN8kfvcNxJVd5se NkEK+4zOjsDIVbv7gcjThSv4KTnOUO/XfN9TWUMuduaMBmGQNaQut1FpGV98utiK yW6x8xqcR4SI+lqY6ILqCK+qUHf19BLSsuyzZdTIontKKRA9F9hY8a4XTZuzTWml BGYmFGP640vOo8C9GjrQfpAwa7CB/DnF/cg1AAmuZ8vrEm9zYjmauFKK8ZPcveA3 KGrnmIlYjhAIX16oRbfwOgj9D2xa1loBzJyHQHByvCMXGVFBnqRTRANRHFrdQWJB qh9+J4EJcUXPE9WGHxAW/g9vpFkV7IRABHs7aUB8zApxI9nGA0Q= =kcxn -----END PGP SIGNATURE----- Merge tag 'for-5.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few fixes for various warnings that accumulated over past two weeks: - tree-checker: add missing return values for some errors - lockdep fixes - when reading qgroup config and starting quota rescan - reverse order of quota ioctl lock and VFS freeze lock - avoid accessing potentially stale fs info during device scan, reported by syzbot - add scope NOFS protection around qgroup relation changes - check for running transaction before flushing qgroups - fix tracking of new delalloc ranges for some cases" * tag 'for-5.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix lockdep splat when enabling and disabling qgroups btrfs: do nofs allocations when adding and removing qgroup relations btrfs: fix lockdep splat when reading qgroup config on mount btrfs: tree-checker: add missing returns after data_ref alignment checks btrfs: don't access possibly stale fs_info data for printing duplicate device btrfs: tree-checker: add missing return after error in root_item btrfs: qgroup: don't commit transaction when we already hold the handle btrfs: fix missing delalloc new bit for new delalloc ranges	2020-11-27 12:42:13 -08:00
Andreas Gruenbacher	82e938bd53	gfs2: Upgrade shared glocks for atime updates Commit `20f829999c` ("gfs2: Rework read and page fault locking") lifted the glock lock taking from the low-level ->readpage and ->readahead address space operations to the higher-level ->read_iter file and ->fault vm operations. The glocks are still taken in LM_ST_SHARED mode only. On filesystems mounted without the noatime option, ->read_iter sometimes needs to update the atime as well, though. Right now, this leads to a failed locking mode assertion in gfs2_dirty_inode. Fix that by introducing a new update_time inode operation. There, if the glock is held non-exclusively, upgrade it to an exclusive lock. Reported-by: Alexander Aring <aahringo@redhat.com> Fixes: `20f829999c` ("gfs2: Rework read and page fault locking") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-26 19:58:25 +01:00
Pavel Begunkov	af60470347	io_uring: fix files grab/cancel race When one task is in io_uring_cancel_files() and another is doing io_prep_async_work() a race may happen. That's because after accounting a request inflight in first call to io_grab_identity() it still may fail and go to io_identity_cow(), which migh briefly keep dangling work.identity and not only. Grab files last, so io_prep_async_work() won't fail if it did get into ->inflight_list. note: the bug shouldn't exist after making io_uring_cancel_files() not poking into other tasks' requests. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-11-26 08:50:21 -07:00
Bob Peterson	f39e7d3aae	gfs2: Don't freeze the file system during unmount GFS2's freeze/thaw mechanism uses a special freeze glock to control its operation. It does this with a sync glock operation (glops.c) called freeze_go_sync. When the freeze glock is demoted (glock's do_xmote) the glops function causes the file system to be frozen. This is intended. However, GFS2's mount and unmount processes also hold the freeze glock to prevent other processes, perhaps on different cluster nodes, from mounting the frozen file system in read-write mode. Before this patch, there was no check in freeze_go_sync for whether a freeze in intended or whether the glock demote was caused by a normal unmount. So it was trying to freeze the file system it's trying to unmount, which ends up in a deadlock. This patch adds an additional check to freeze_go_sync so that demotes of the freeze glock are ignored if they come from the unmount process. Fixes: `20b3291290` ("gfs2: Fix regression in freeze_go_sync") Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-25 18:12:08 +01:00
Bob Peterson	778721510e	gfs2: check for empty rgrp tree in gfs2_ri_update If gfs2 tries to mount a (corrupt) file system that has no resource groups it still tries to set preferences on the first one, which causes a kernel null pointer dereference. This patch adds a check to function gfs2_ri_update so this condition is detected and reported back as an error. Reported-by: syzbot+e3f23ce40269a4c9053a@syzkaller.appspotmail.com Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-25 18:10:55 +01:00
Ard Biesheuvel	ff04f3b6f2	efivarfs: revert "fix memory leak in efivarfs_create()" The memory leak addressed by commit `fe5186cf12` is a false positive: all allocations are recorded in a linked list, and freed when the filesystem is unmounted. This leads to double frees, and as reported by David, leads to crashes if SLUB is configured to self destruct when double frees occur. So drop the redundant kfree() again, and instead, mark the offending pointer variable so the allocation is ignored by kmemleak. Cc: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com> Fixes: `fe5186cf12` ("efivarfs: fix memory leak in efivarfs_create()") Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2020-11-25 16:55:02 +01:00
Linus Torvalds	127c501a03	Four smb3 fixes for stable -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl+9eV4ACgkQiiy9cAdy T1EXOAv+JF6QZiwB6TELiusDLw+6UWpT7CcT1guL+eSrFVYEPIhqF4V4QXY0oA0Y F8jTpHxEx8wdECAPPNGHh/a4E+Y1vV/W8Nv5DkglAwjeXAD2Y84VAp8hH890jnn0 M8I9qdnbfSodRueshpKScRPHbfp4Smlz1BR9R0syk7T7TmCy8aKNwYN1lBy5Nf9f ICMn1F5e9z4nX43NJIwzO+NSPehtLm8ULFZER/pQ+tGDhwXTdFc9HPzfu0ZoYbEO zADjmY4PItVYRINnWBntEBLYcAFeAB0finPTP2kCfXfRDF5cPgElp84F3Uro7se5 bioboePO+bUS0jigIiP3qZ7zTHEdJoICsiJzVGmDZYsawK3MAwamp2EH3axAr4B/ h4LULgN7nCatPW5lMo3/3EPZXVbXVTOYIB2REtqJugK8USQ9+v9SMLNT/qWn0GE5 bzZoZ22wkHEOn4EIxYSCX4tgj9cJ2v9B/0NMpTQLTECKBQi3iV32GxLglvMwZru6 eWKL5tZj =lxdD -----END PGP SIGNATURE----- Merge tag '5.10-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull cifs fixes from Steve French: "Four smb3 fixes for stable: one fixes a memleak, the other three address a problem found with decryption offload that can cause a use after free" * tag '5.10-rc5-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: Handle error case during offload read path smb3: Avoid Mid pending list corruption smb3: Call cifs reconnect from demultiplex thread cifs: fix a memleak with modefromsid	2020-11-24 15:33:18 -08:00
Alexander Aring	515b269d5b	gfs2: set lockdep subclass for iopen glocks This patch introduce a new globs attribute to define the subclass of the glock lockref spinlock. This avoid the following lockdep warning, which occurs when we lock an inode lock while an iopen lock is held: ============================================ WARNING: possible recursive locking detected 5.10.0-rc3+ #4990 Not tainted -------------------------------------------- kworker/0:1/12 is trying to acquire lock: ffff9067d45672d8 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: lockref_get+0x9/0x20 but task is already holding lock: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&gl->gl_lockref.lock); lock(&gl->gl_lockref.lock); * DEADLOCK * May be due to missing lock nesting notation 3 locks held by kworker/0:1/12: #0: ffff9067c1bfdd38 ((wq_completion)delete_workqueue){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540 #1: ffffac594006be70 ((work_completion)(&(&gl->gl_delete)->work)){+.+.}-{0:0}, at: process_one_work+0x1b7/0x540 #2: ffff9067da308588 (&gl->gl_lockref.lock){+.+.}-{3:3}, at: delete_work_func+0x164/0x260 stack backtrace: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.10.0-rc3+ #4990 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Workqueue: delete_workqueue delete_work_func Call Trace: dump_stack+0x8b/0xb0 __lock_acquire.cold+0x19e/0x2e3 lock_acquire+0x150/0x410 ? lockref_get+0x9/0x20 _raw_spin_lock+0x27/0x40 ? lockref_get+0x9/0x20 lockref_get+0x9/0x20 delete_work_func+0x188/0x260 process_one_work+0x237/0x540 worker_thread+0x4d/0x3b0 ? process_one_work+0x540/0x540 kthread+0x127/0x140 ? __kthread_bind_mask+0x60/0x60 ret_from_fork+0x22/0x30 Suggested-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-24 23:45:58 +01:00
Alexander Aring	16e6281b6b	gfs2: Fix deadlock dumping resource group glocks Commit `0e539ca1bb` ("gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump") introduced additional locking in gfs2_rgrp_go_dump, which is also used for dumping resource group glocks via debugfs. However, on that code path, the glock spin lock is already taken in dump_glock, and taking it again in gfs2_glock2rgrp leads to deadlock. This can be reproduced with: $ mkfs.gfs2 -O -p lock_nolock /dev/FOO $ mount /dev/FOO /mnt/foo $ touch /mnt/foo/bar $ cat /sys/kernel/debug/gfs2/FOO/glocks Fix that by not taking the glock spin lock inside the go_dump callback. Fixes: `0e539ca1bb` ("gfs2: Fix NULL pointer dereference in gfs2_rgrp_dump") Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-11-24 23:45:58 +01:00
Pavel Begunkov	9c3a205c5f	io_uring: fix ITER_BVEC check iov_iter::type is a bitmask that also keeps direction etc., so it shouldn't be directly compared against ITER_*. Use proper helper. Fixes: `ff6165b2d7` ("io_uring: retain iov_iter state over io_read/io_write calls") Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Cc: <stable@vger.kernel.org> # 5.9 Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-11-24 07:54:30 -07:00
Joseph Qi	eb2667b343	io_uring: fix shift-out-of-bounds when round up cq size Abaci Fuzz reported a shift-out-of-bounds BUG in io_uring_create(): [ 59.598207] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13 [ 59.599665] shift exponent 64 is too large for 64-bit type 'long unsigned int' [ 59.601230] CPU: 0 PID: 963 Comm: a.out Not tainted 5.10.0-rc4+ #3 [ 59.602502] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 59.603673] Call Trace: [ 59.604286] dump_stack+0x107/0x163 [ 59.605237] ubsan_epilogue+0xb/0x5a [ 59.606094] __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e [ 59.607335] ? lock_downgrade+0x6c0/0x6c0 [ 59.608182] ? rcu_read_lock_sched_held+0xaf/0xe0 [ 59.609166] io_uring_create.cold+0x99/0x149 [ 59.610114] io_uring_setup+0xd6/0x140 [ 59.610975] ? io_uring_create+0x2510/0x2510 [ 59.611945] ? lockdep_hardirqs_on_prepare+0x286/0x400 [ 59.613007] ? syscall_enter_from_user_mode+0x27/0x80 [ 59.614038] ? trace_hardirqs_on+0x5b/0x180 [ 59.615056] do_syscall_64+0x2d/0x40 [ 59.615940] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 59.617007] RIP: 0033:0x7f2bb8a0b239 This is caused by roundup_pow_of_two() if the input entries larger enough, e.g. 2^32-1. For sq_entries, it will check first and we allow at most IORING_MAX_ENTRIES, so it is okay. But for cq_entries, we do round up first, that may overflow and truncate it to 0, which is not the expected behavior. So check the cq size first and then do round up. Fixes: `88ec3211e4` ("io_uring: round-up cq size before comparing with rounded sq size") Reported-by: Abaci Fuzz <abaci@linux.alibaba.com> Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-11-24 07:54:30 -07:00
Filipe Manana	a855fbe692	btrfs: fix lockdep splat when enabling and disabling qgroups When running test case btrfs/017 from fstests, lockdep reported the following splat: [ 1297.067385] ====================================================== [ 1297.067708] WARNING: possible circular locking dependency detected [ 1297.068022] 5.10.0-rc4-btrfs-next-73 #1 Not tainted [ 1297.068322] ------------------------------------------------------ [ 1297.068629] btrfs/189080 is trying to acquire lock: [ 1297.068929] ffff9f2725731690 (sb_internal#2){.+.+}-{0:0}, at: btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.069274] but task is already holding lock: [ 1297.069868] ffff9f2702b61a08 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_enable+0x3b/0xa70 [btrfs] [ 1297.070219] which lock already depends on the new lock. [ 1297.071131] the existing dependency chain (in reverse order) is: [ 1297.071721] -> #1 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}: [ 1297.072375] lock_acquire+0xd8/0x490 [ 1297.072710] __mutex_lock+0xa3/0xb30 [ 1297.073061] btrfs_qgroup_inherit+0x59/0x6a0 [btrfs] [ 1297.073421] create_subvol+0x194/0x990 [btrfs] [ 1297.073780] btrfs_mksubvol+0x3fb/0x4a0 [btrfs] [ 1297.074133] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs] [ 1297.074498] btrfs_ioctl_snap_create+0x58/0x80 [btrfs] [ 1297.074872] btrfs_ioctl+0x1a90/0x36f0 [btrfs] [ 1297.075245] __x64_sys_ioctl+0x83/0xb0 [ 1297.075617] do_syscall_64+0x33/0x80 [ 1297.075993] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1297.076380] -> #0 (sb_internal#2){.+.+}-{0:0}: [ 1297.077166] check_prev_add+0x91/0xc60 [ 1297.077572] __lock_acquire+0x1740/0x3110 [ 1297.077984] lock_acquire+0xd8/0x490 [ 1297.078411] start_transaction+0x3c5/0x760 [btrfs] [ 1297.078853] btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.079323] btrfs_ioctl+0x2c60/0x36f0 [btrfs] [ 1297.079789] __x64_sys_ioctl+0x83/0xb0 [ 1297.080232] do_syscall_64+0x33/0x80 [ 1297.080680] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1297.081139] other info that might help us debug this: [ 1297.082536] Possible unsafe locking scenario: [ 1297.083510] CPU0 CPU1 [ 1297.084005] ---- ---- [ 1297.084500] lock(&fs_info->qgroup_ioctl_lock); [ 1297.084994] lock(sb_internal#2); [ 1297.085485] lock(&fs_info->qgroup_ioctl_lock); [ 1297.085974] lock(sb_internal#2); [ 1297.086454] * DEADLOCK * [ 1297.087880] 3 locks held by btrfs/189080: [ 1297.088324] #0: ffff9f2725731470 (sb_writers#14){.+.+}-{0:0}, at: btrfs_ioctl+0xa73/0x36f0 [btrfs] [ 1297.088799] #1: ffff9f2702b60cc0 (&fs_info->subvol_sem){++++}-{3:3}, at: btrfs_ioctl+0x1f4d/0x36f0 [btrfs] [ 1297.089284] #2: ffff9f2702b61a08 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_enable+0x3b/0xa70 [btrfs] [ 1297.089771] stack backtrace: [ 1297.090662] CPU: 5 PID: 189080 Comm: btrfs Not tainted 5.10.0-rc4-btrfs-next-73 #1 [ 1297.091132] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 1297.092123] Call Trace: [ 1297.092629] dump_stack+0x8d/0xb5 [ 1297.093115] check_noncircular+0xff/0x110 [ 1297.093596] check_prev_add+0x91/0xc60 [ 1297.094076] ? kvm_clock_read+0x14/0x30 [ 1297.094553] ? kvm_sched_clock_read+0x5/0x10 [ 1297.095029] __lock_acquire+0x1740/0x3110 [ 1297.095510] lock_acquire+0xd8/0x490 [ 1297.095993] ? btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.096476] start_transaction+0x3c5/0x760 [btrfs] [ 1297.096962] ? btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.097451] btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.097941] ? btrfs_ioctl+0x1f4d/0x36f0 [btrfs] [ 1297.098429] btrfs_ioctl+0x2c60/0x36f0 [btrfs] [ 1297.098904] ? do_user_addr_fault+0x20c/0x430 [ 1297.099382] ? kvm_clock_read+0x14/0x30 [ 1297.099854] ? kvm_sched_clock_read+0x5/0x10 [ 1297.100328] ? sched_clock+0x5/0x10 [ 1297.100801] ? sched_clock_cpu+0x12/0x180 [ 1297.101272] ? __x64_sys_ioctl+0x83/0xb0 [ 1297.101739] __x64_sys_ioctl+0x83/0xb0 [ 1297.102207] do_syscall_64+0x33/0x80 [ 1297.102673] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1297.103148] RIP: 0033:0x7f773ff65d87 This is because during the quota enable ioctl we lock first the mutex qgroup_ioctl_lock and then start a transaction, and starting a transaction acquires a fs freeze semaphore (at the VFS level). However, every other code path, except for the quota disable ioctl path, we do the opposite: we start a transaction and then lock the mutex. So fix this by making the quota enable and disable paths to start the transaction without having the mutex locked, and then, after starting the transaction, lock the mutex and check if some other task already enabled or disabled the quotas, bailing with success if that was the case. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-11-23 21:16:43 +01:00
Filipe Manana	7aa6d35984	btrfs: do nofs allocations when adding and removing qgroup relations When adding or removing a qgroup relation we are doing a GFP_KERNEL allocation which is not safe because we are holding a transaction handle open and that can make us deadlock if the allocator needs to recurse into the filesystem. So just surround those calls with a nofs context. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-11-23 21:16:40 +01:00
Filipe Manana	3d05cad3c3	btrfs: fix lockdep splat when reading qgroup config on mount Lockdep reported the following splat when running test btrfs/190 from fstests: [ 9482.126098] ====================================================== [ 9482.126184] WARNING: possible circular locking dependency detected [ 9482.126281] 5.10.0-rc4-btrfs-next-73 #1 Not tainted [ 9482.126365] ------------------------------------------------------ [ 9482.126456] mount/24187 is trying to acquire lock: [ 9482.126534] ffffa0c869a7dac0 (&fs_info->qgroup_rescan_lock){+.+.}-{3:3}, at: qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.126647] but task is already holding lock: [ 9482.126777] ffffa0c892ebd3a0 (btrfs-quota-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.126886] which lock already depends on the new lock. [ 9482.127078] the existing dependency chain (in reverse order) is: [ 9482.127213] -> #1 (btrfs-quota-00){++++}-{3:3}: [ 9482.127366] lock_acquire+0xd8/0x490 [ 9482.127436] down_read_nested+0x45/0x220 [ 9482.127528] __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.127613] btrfs_read_lock_root_node+0x41/0x130 [btrfs] [ 9482.127702] btrfs_search_slot+0x514/0xc30 [btrfs] [ 9482.127788] update_qgroup_status_item+0x72/0x140 [btrfs] [ 9482.127877] btrfs_qgroup_rescan_worker+0xde/0x680 [btrfs] [ 9482.127964] btrfs_work_helper+0xf1/0x600 [btrfs] [ 9482.128039] process_one_work+0x24e/0x5e0 [ 9482.128110] worker_thread+0x50/0x3b0 [ 9482.128181] kthread+0x153/0x170 [ 9482.128256] ret_from_fork+0x22/0x30 [ 9482.128327] -> #0 (&fs_info->qgroup_rescan_lock){+.+.}-{3:3}: [ 9482.128464] check_prev_add+0x91/0xc60 [ 9482.128551] __lock_acquire+0x1740/0x3110 [ 9482.128623] lock_acquire+0xd8/0x490 [ 9482.130029] __mutex_lock+0xa3/0xb30 [ 9482.130590] qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.131577] btrfs_read_qgroup_config+0x43a/0x550 [btrfs] [ 9482.132175] open_ctree+0x1228/0x18a0 [btrfs] [ 9482.132756] btrfs_mount_root.cold+0x13/0xed [btrfs] [ 9482.133325] legacy_get_tree+0x30/0x60 [ 9482.133866] vfs_get_tree+0x28/0xe0 [ 9482.134392] fc_mount+0xe/0x40 [ 9482.134908] vfs_kern_mount.part.0+0x71/0x90 [ 9482.135428] btrfs_mount+0x13b/0x3e0 [btrfs] [ 9482.135942] legacy_get_tree+0x30/0x60 [ 9482.136444] vfs_get_tree+0x28/0xe0 [ 9482.136949] path_mount+0x2d7/0xa70 [ 9482.137438] do_mount+0x75/0x90 [ 9482.137923] __x64_sys_mount+0x8e/0xd0 [ 9482.138400] do_syscall_64+0x33/0x80 [ 9482.138873] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 9482.139346] other info that might help us debug this: [ 9482.140735] Possible unsafe locking scenario: [ 9482.141594] CPU0 CPU1 [ 9482.142011] ---- ---- [ 9482.142411] lock(btrfs-quota-00); [ 9482.142806] lock(&fs_info->qgroup_rescan_lock); [ 9482.143216] lock(btrfs-quota-00); [ 9482.143629] lock(&fs_info->qgroup_rescan_lock); [ 9482.144056] * DEADLOCK * [ 9482.145242] 2 locks held by mount/24187: [ 9482.145637] #0: ffffa0c8411c40e8 (&type->s_umount_key#44/1){+.+.}-{3:3}, at: alloc_super+0xb9/0x400 [ 9482.146061] #1: ffffa0c892ebd3a0 (btrfs-quota-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.146509] stack backtrace: [ 9482.147350] CPU: 1 PID: 24187 Comm: mount Not tainted 5.10.0-rc4-btrfs-next-73 #1 [ 9482.147788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 9482.148709] Call Trace: [ 9482.149169] dump_stack+0x8d/0xb5 [ 9482.149628] check_noncircular+0xff/0x110 [ 9482.150090] check_prev_add+0x91/0xc60 [ 9482.150561] ? kvm_clock_read+0x14/0x30 [ 9482.151017] ? kvm_sched_clock_read+0x5/0x10 [ 9482.151470] __lock_acquire+0x1740/0x3110 [ 9482.151941] ? __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.152402] lock_acquire+0xd8/0x490 [ 9482.152887] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.153354] __mutex_lock+0xa3/0xb30 [ 9482.153826] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.154301] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.154768] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.155226] qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.155690] btrfs_read_qgroup_config+0x43a/0x550 [btrfs] [ 9482.156160] open_ctree+0x1228/0x18a0 [btrfs] [ 9482.156643] btrfs_mount_root.cold+0x13/0xed [btrfs] [ 9482.157108] ? rcu_read_lock_sched_held+0x5d/0x90 [ 9482.157567] ? kfree+0x31f/0x3e0 [ 9482.158030] legacy_get_tree+0x30/0x60 [ 9482.158489] vfs_get_tree+0x28/0xe0 [ 9482.158947] fc_mount+0xe/0x40 [ 9482.159403] vfs_kern_mount.part.0+0x71/0x90 [ 9482.159875] btrfs_mount+0x13b/0x3e0 [btrfs] [ 9482.160335] ? rcu_read_lock_sched_held+0x5d/0x90 [ 9482.160805] ? kfree+0x31f/0x3e0 [ 9482.161260] ? legacy_get_tree+0x30/0x60 [ 9482.161714] legacy_get_tree+0x30/0x60 [ 9482.162166] vfs_get_tree+0x28/0xe0 [ 9482.162616] path_mount+0x2d7/0xa70 [ 9482.163070] do_mount+0x75/0x90 [ 9482.163525] __x64_sys_mount+0x8e/0xd0 [ 9482.163986] do_syscall_64+0x33/0x80 [ 9482.164437] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 9482.164902] RIP: 0033:0x7f51e907caaa This happens because at btrfs_read_qgroup_config() we can call qgroup_rescan_init() while holding a read lock on a quota btree leaf, acquired by the previous call to btrfs_search_slot_for_read(), and qgroup_rescan_init() acquires the mutex qgroup_rescan_lock. A qgroup rescan worker does the opposite: it acquires the mutex qgroup_rescan_lock, at btrfs_qgroup_rescan_worker(), and then tries to update the qgroup status item in the quota btree through the call to update_qgroup_status_item(). This inversion of locking order between the qgroup_rescan_lock mutex and quota btree locks causes the splat. Fix this simply by releasing and freeing the path before calling qgroup_rescan_init() at btrfs_read_qgroup_config(). CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-11-23 21:16:30 +01:00
David Sterba	6d06b0ad94	btrfs: tree-checker: add missing returns after data_ref alignment checks There are sectorsize alignment checks that are reported but then check_extent_data_ref continues. This was not intended, wrong alignment is not a minor problem and we should return with error. CC: stable@vger.kernel.org # 5.4+ Fixes: `0785a9aacf` ("btrfs: tree-checker: Add EXTENT_DATA_REF check") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-11-23 21:16:21 +01:00
Johannes Thumshirn	0697d9a610	btrfs: don't access possibly stale fs_info data for printing duplicate device Syzbot reported a possible use-after-free when printing a duplicate device warning device_list_add(). At this point it can happen that a btrfs_device::fs_info is not correctly setup yet, so we're accessing stale data, when printing the warning message using the btrfs_printk() wrappers. ================================================================== BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245 Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068 CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1d6/0x29e lib/dump_stack.c:118 print_address_description+0x66/0x620 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report+0x132/0x1d0 mm/kasan/report.c:530 btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245 device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943 btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359 btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 fc_mount fs/namespace.c:978 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008 btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x44840a RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630 RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003 Allocated by task 6945: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461 kmalloc_node include/linux/slab.h:577 [inline] kvmalloc_node+0x81/0x110 mm/util.c:574 kvmalloc include/linux/mm.h:757 [inline] kvzalloc include/linux/mm.h:765 [inline] btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 fc_mount fs/namespace.c:978 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008 btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 6945: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x3d/0x70 mm/kasan/common.c:56 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422 __cache_free mm/slab.c:3418 [inline] kfree+0x113/0x200 mm/slab.c:3756 deactivate_locked_super+0xa7/0xf0 fs/super.c:335 btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 fc_mount fs/namespace.c:978 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008 btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff8880878e0000 which belongs to the cache kmalloc-16k of size 16384 The buggy address is located 1704 bytes inside of 16384-byte region [ffff8880878e0000, ffff8880878e4000) The buggy address belongs to the page: page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0 head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0 flags: 0xfffe0000010200(slab\|head) raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00 raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The syzkaller reproducer for this use-after-free crafts a filesystem image and loop mounts it twice in a loop. The mount will fail as the crafted image has an invalid chunk tree. When this happens btrfs_mount_root() will call deactivate_locked_super(), which then cleans up fs_info and fs_info::sb. If a second thread now adds the same block-device to the filesystem, it will get detected as a duplicate device and device_list_add() will reject the duplicate and print a warning. But as the fs_info pointer passed in is non-NULL this will result in a use-after-free. Instead of printing possibly uninitialized or already freed memory in btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the device name will be skipped altogether. There was a slightly different approach discussed in https://lore.kernel.org/linux-btrfs/20200114060920.4527-1-anand.jain@oracle.com/t/#u Link: https://lore.kernel.org/linux-btrfs/000000000000c9e14b05afcc41ba@google.com Reported-by: syzbot+582e66e5edf36a22c7b0@syzkaller.appspotmail.com CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2020-11-23 21:16:12 +01:00
Linus Torvalds	68d3fa235f	Couple of EFI fixes for v5.10: - fix memory leak in efivarfs driver - fix HYP mode issue in 32-bit ARM version of the EFI stub when built in Thumb2 mode - avoid leaking EFI pgd pages on allocation failure -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEnNKg2mrY9zMBdeK7wjcgfpV0+n0FAl+tRcoACgkQwjcgfpV0 +n1LZQf/af1p9A0zT1nC3IrcaABceJDnD3dJuV9SD6QhFuD2Dw/Mshr+MVzDsO+3 btvuu0r4CzQ5ajfpfcGcvBFFWbbPTwKvWQe++9Unwoz5acw7hpV5yxNwMivdaJEh 3o4pkgpCmwtliTwiroDficO7Vlqefqf4LZd7/iQYQTnuPK3waYQBjwp9t2D9tlx7 kXiEQDP2BDRCUrKEjlR7AHTZ156mw+UsiquAuxMCGTKBqwiELEEV6aPseqa5MmNV RDV1IXWdhOQyQfzg0s6vTzwGeN0JubSxHng6O9UbE+tctz4EqaaHIEsRuMBq8oLD Y8JeGp1ovypTJxeLE5t6eEzsfvTRsg== =GnmM -----END PGP SIGNATURE----- Merge tag 'efi-urgent-for-v5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fixes from Borislav Petkov: "Forwarded EFI fixes from Ard Biesheuvel: - fix memory leak in efivarfs driver - fix HYP mode issue in 32-bit ARM version of the EFI stub when built in Thumb2 mode - avoid leaking EFI pgd pages on allocation failure" * tag 'efi-urgent-for-v5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/x86: Free efi_pgd with free_pages() efivarfs: fix memory leak in efivarfs_create() efi/arm: set HSCTLR Thumb2 bit correctly for HVC calls from HYP	2020-11-22 13:05:48 -08:00

1 2 3 4 5 ...

67441 commits