blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message	Author	Files	Lines
2022-08-21	asm goto: eradicate CC_HAS_ASM_GOTO	Nick Desaulniers	10	-89/+7
	GCC has supported asm goto since 4.5, and Clang has since version 9.0.0. The minimum supported versions of these tools for the build according to Documentation/process/changes.rst are 5.1 and 11.0.0 respectively. Remove the feature detection script, Kconfig option, and clean up some fallback code that is no longer supported. The removed script was also testing for a GCC specific bug that was fixed in the 4.7 release. Also remove workarounds for bpftrace using clang older than 9.0.0, since other BPF backend fixes are required at this point. Link: https://lore.kernel.org/lkml/CAK7LNATSr=BXKfkdW8f-H5VT_w=xBpT2ZQcZ7rm6JfkdE+QnmA@mail.gmail.com/ Link: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48637 Acked-by: Borislav Petkov <[email protected]> Suggested-by: Masahiro Yamada <[email protected]> Suggested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Reviewed-by: Alexandre Belloni <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-08-21	i2c: imx: Make sure to unregister adapter on remove()	Uwe Kleine-König	1	-9/+11
	If for whatever reasons pm_runtime_resume_and_get() fails and .remove() is exited early, the i2c adapter stays around and the irq still calls its handler, while the driver data and the register mapping go away. So if later the i2c adapter is accessed or the irq triggers this results in havoc accessing freed memory and unmapped registers. So unregister the software resources even if resume failed, and only skip the hardware access in that case. Fixes: 588eb93ea49f ("i2c: imx: add runtime pm support to improve the performance") Signed-off-by: Uwe Kleine-König <[email protected]> Acked-by: Oleksij Rempel <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
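The teardown ordering described in this message can be modeled with a small sketch. It is a toy model only: `toy_i2c_dev`, its fields, and `toy_remove()` are hypothetical names and do not come from the i2c-imx driver. The point it illustrates is that software resources are always released, and only the hardware access is skipped when resume failed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real driver struct. */
struct toy_i2c_dev {
	bool adapter_registered;   /* software resource: the i2c adapter */
	bool irq_requested;        /* software resource: the irq handler */
	bool hw_quiesced;          /* set only when registers were touched */
};

/*
 * Toy remove(): resume_ret models the return value of a runtime-PM
 * resume call (negative means the hardware could not be resumed).
 */
static void toy_remove(struct toy_i2c_dev *dev, int resume_ret)
{
	/* Always unregister the adapter, even if resume failed. */
	dev->adapter_registered = false;

	/* Touch the hardware only when it is known to be powered. */
	if (resume_ret >= 0)
		dev->hw_quiesced = true;

	/* Always release the irq so the handler cannot run afterwards. */
	dev->irq_requested = false;
}
```

With a failed resume (`resume_ret < 0`), the adapter and irq are still released, so nothing can reach freed driver data later.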
2022-08-21	Revert "i2c: scmi: Replace open coded device_get_match_data()"	Wolfram Sang	1	-2/+7
	This reverts commit 9ae551ded5ba55f96a83cd0811f7ef8c2f329d0c. We got a regression report, so ensure this machine boots again. We will come back with a better version hopefully. Reported-by: Josef Johansson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wolfram Sang <[email protected]>
2022-08-21	parisc: Fix exception handler for fldw and fstw instructions	Helge Deller	1	-1/+1
	The exception handler is broken for unaligned memory accesses with fldw and fstw instructions, because it trashes or uses randomly some other floating point register than the one specified in the instruction word on loads and stores. The instruction "fldw 0(addr),%fr22L" (and the other fldw/fstw instructions) encode the target register (%fr22) in the rightmost 5 bits of the instruction word. The 7th rightmost bit of the instruction word defines if the left or right half of %fr22 should be used. While processing unaligned address accesses, the FR3() define is used to extract the offset into the local floating-point register set. But the calculation in FR3() was buggy, so that for example instead of %fr22, register %fr12 [((22 * 2) & 0x1f) = 12] was used. This bug has been in the parisc kernel since forever and I wonder why it wasn't detected earlier. Interestingly I noticed this bug just because the libime debian package failed to build on native hardware, while it successfully built in qemu. This patch corrects the bitshift and masking calculation in FR3(). Signed-off-by: Helge Deller <[email protected]> Cc: <[email protected]>
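The arithmetic in this message can be checked with a small sketch. The macro names `FR3_BUGGY` and `FR3_FIXED` and the helper `fake_insn()` are illustrative assumptions, not copied from the kernel source. The sketch relies only on what the message states: the register number sits in the rightmost 5 bits and the half-select flag is the 7th rightmost bit (bit 6).

```c
#include <stdint.h>

/*
 * Hypothetical macros. Layout assumed from the commit message:
 * bits 0-4 hold the FP register number, bit 6 selects the half.
 */

/* Shift first, then mask: the top bit of the doubled number is lost. */
#define FR3_BUGGY(x) ((((x) << 1) & 0x1f) | (((x) >> 6) & 1))

/* Mask first, then shift: the full 5-bit register number survives. */
#define FR3_FIXED(x) ((((x) & 0x1f) << 1) | (((x) >> 6) & 1))

/* Build only the relevant low bits of an instruction word. */
static inline uint32_t fake_insn(uint32_t reg, uint32_t half)
{
	return (reg & 0x1f) | ((half & 1) << 6);
}
```

For register 22 the shift-then-mask variant yields 12, matching the `((22 * 2) & 0x1f) = 12` example in the message, while the mask-then-shift variant yields 44. Registers below 16 give the same result in both variants, which may be one reason the problem went unnoticed for so long.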
2022-08-20	kprobes: don't call disarm_kprobe() for disabled kprobes	Kuniyuki Iwashima	1	-4/+5
	The assumption in __disable_kprobe() is wrong, and it could try to disarm an already disarmed kprobe and fire the WARN_ONCE() below. [0] We can easily reproduce this issue.

	1. Write 0 to /sys/kernel/debug/kprobes/enabled.

	  # echo 0 > /sys/kernel/debug/kprobes/enabled

	2. Run execsnoop. At this time, one kprobe is disabled.

	  # /usr/share/bcc/tools/execsnoop &
	  [1] 2460
	  PCOMM PID PPID RET ARGS

	  # cat /sys/kernel/debug/kprobes/list
	  ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE]
	  ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE]

	3. Write 1 to /sys/kernel/debug/kprobes/enabled, which changes kprobes_all_disarmed to false but does not arm the disabled kprobe.

	  # echo 1 > /sys/kernel/debug/kprobes/enabled

	  # cat /sys/kernel/debug/kprobes/list
	  ffffffff91345650 r __x64_sys_execve+0x0 [FTRACE]
	  ffffffff91345650 k __x64_sys_execve+0x0 [DISABLED][FTRACE]

	4. Kill execsnoop, when __disable_kprobe() calls disarm_kprobe() for the disabled kprobe and hits the WARN_ONCE() in __disarm_kprobe_ftrace().

	  # fg
	  /usr/share/bcc/tools/execsnoop
	  ^C

	Actually, WARN_ONCE() is fired twice, and __unregister_kprobe_top() misses some cleanups and leaves the aggregated kprobe in the hash table. Then, __unregister_trace_kprobe() initialises tk->rp.kp.list and creates an infinite loop like this.

	  aggregated kprobe.list -> kprobe.list -.
	                                     ^    |
	                                     '.__.'

	In this situation, these commands fall into the infinite loop and result in RCU stall or soft lockup.

	  cat /sys/kernel/debug/kprobes/list : show_kprobe_addr() enters into the infinite loop with RCU.

	  /usr/share/bcc/tools/execsnoop : warn_kprobe_rereg() holds kprobe_mutex, and __get_valid_kprobe() is stuck in the loop.

	To avoid the issue, make sure we don't call disarm_kprobe() for disabled kprobes.
	[0]
	Failed to disarm kprobe-ftrace at __x64_sys_execve+0x0/0x40 (error -2)
	WARNING: CPU: 6 PID: 2460 at kernel/kprobes.c:1130 __disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129)
	Modules linked in: ena
	CPU: 6 PID: 2460 Comm: execsnoop Not tainted 5.19.0+ #28
	Hardware name: Amazon EC2 c5.2xlarge/, BIOS 1.0 10/16/2017
	RIP: 0010:__disarm_kprobe_ftrace.isra.19 (kernel/kprobes.c:1129)
	Code: 24 8b 02 eb c1 80 3d c4 83 f2 01 00 75 d4 48 8b 75 00 89 c2 48 c7 c7 90 fa 0f 92 89 04 24 c6 05 ab 83 01 e8 e4 94 f0 ff <0f> 0b 8b 04 24 eb b1 89 c6 48 c7 c7 60 fa 0f 92 89 04 24 e8 cc 94
	RSP: 0018:ffff9e6ec154bd98 EFLAGS: 00010282
	RAX: 0000000000000000 RBX: ffffffff930f7b00 RCX: 0000000000000001
	RDX: 0000000080000001 RSI: ffffffff921461c5 RDI: 00000000ffffffff
	RBP: ffff89c504286da8 R08: 0000000000000000 R09: c0000000fffeffff
	R10: 0000000000000000 R11: ffff9e6ec154bc28 R12: ffff89c502394e40
	R13: ffff89c502394c00 R14: ffff9e6ec154bc00 R15: 0000000000000000
	FS: 00007fe800398740(0000) GS:ffff89c812d80000(0000) knlGS:0000000000000000
	CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 000000c00057f010 CR3: 0000000103b54006 CR4: 00000000007706e0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
	PKRU: 55555554
	Call Trace:
	<TASK>
	 __disable_kprobe (kernel/kprobes.c:1716)
	 disable_kprobe (kernel/kprobes.c:2392)
	 __disable_trace_kprobe (kernel/trace/trace_kprobe.c:340)
	 disable_trace_kprobe (kernel/trace/trace_kprobe.c:429)
	 perf_trace_event_unreg.isra.2 (./include/linux/tracepoint.h:93 kernel/trace/trace_event_perf.c:168)
	 perf_kprobe_destroy (kernel/trace/trace_event_perf.c:295)
	 _free_event (kernel/events/core.c:4971)
	 perf_event_release_kernel (kernel/events/core.c:5176)
	 perf_release (kernel/events/core.c:5186)
	 __fput (fs/file_table.c:321)
	 task_work_run (./include/linux/sched.h:2056 (discriminator 1) kernel/task_work.c:179 (discriminator 1))
	 exit_to_user_mode_prepare (./include/linux/resume_user_mode.h:49 kernel/entry/common.c:169 kernel/entry/common.c:201)
	 syscall_exit_to_user_mode (./arch/x86/include/asm/jump_label.h:55 ./arch/x86/include/asm/nospec-branch.h:384 ./arch/x86/include/asm/entry-common.h:94 kernel/entry/common.c:133 kernel/entry/common.c:296)
	 do_syscall_64 (arch/x86/entry/common.c:87)
	 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
	RIP: 0033:0x7fe7ff210654
	Code: 15 79 89 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb be 0f 1f 00 8b 05 9a cd 20 00 48 63 ff 85 c0 75 11 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 3a f3 c3 48 83 ec 18 48 89 7c 24 08 e8 34 fc
	RSP: 002b:00007ffdbd1d3538 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
	RAX: 0000000000000000 RBX: 0000000000000008 RCX: 00007fe7ff210654
	RDX: 0000000000000000 RSI: 0000000000002401 RDI: 0000000000000008
	RBP: 0000000000000000 R08: 94ae31d6fda838a4 R09: 00007fe8001c9d30
	R10: 00007ffdbd1d34b0 R11: 0000000000000246 R12: 00007ffdbd1d3600
	R13: 0000000000000000 R14: fffffffffffffffc R15: 00007ffdbd1d3560
	</TASK>

	Link: https://lkml.kernel.org/r/[email protected] Fixes: 69d54b916d83 ("kprobes: makes kprobes/enabled works correctly for optimized kprobes.") Signed-off-by: Kuniyuki Iwashima <[email protected]> Reported-by: Ayushman Dutta <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Anil S Keshavamurthy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Wang Nan <[email protected]> Cc: Kuniyuki Iwashima <[email protected]> Cc: Kuniyuki Iwashima <[email protected]> Cc: Ayushman Dutta <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
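The guard this commit describes can be illustrated with a user-space toy model. All names here (`toy_kprobe`, `toy_all_disarmed`, `toy_disarm()`, and the two `toy_disable_*()` variants) are hypothetical and only mimic the states named in the message; the real kernel logic is more involved. The error value -2 mirrors the "(error -2)" seen in the warning above.

```c
#include <stdbool.h>

/* Hypothetical model of one probe: "disabled" and "armed" are separate. */
struct toy_kprobe {
	bool disabled;
	bool armed;
};

/* Models the global flag toggled through kprobes/enabled. */
static bool toy_all_disarmed;

/* Disarming something that is not armed fails, as in the warning. */
static int toy_disarm(struct toy_kprobe *p)
{
	if (!p->armed)
		return -2;
	p->armed = false;
	return 0;
}

/* Old assumption: "not globally disarmed" implies "this probe is armed". */
static int toy_disable_buggy(struct toy_kprobe *p)
{
	int ret = 0;

	if (!toy_all_disarmed)
		ret = toy_disarm(p);
	p->disabled = true;
	return ret;
}

/* Guarded variant: skip disarming a probe that was already disabled. */
static int toy_disable_fixed(struct toy_kprobe *p)
{
	int ret = 0;

	if (!toy_all_disarmed && !p->disabled)
		ret = toy_disarm(p);
	p->disabled = true;
	return ret;
}
```

In the state reached after step 3 of the reproducer (globally enabled, probe disabled and never armed), the unguarded variant returns the error while the guarded one returns 0.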
2022-08-20	mm/shmem: shmem_replace_page() remember NR_SHMEM	Hugh Dickins	1	-0/+2
	Elsewhere, NR_SHMEM is updated at the same time as shmem NR_FILE_PAGES; but shmem_replace_page() was forgetting to do that - so NR_SHMEM stats could grow too big or too small, in those unusual cases when it's used. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Cc: "Darrick J. Wong" <[email protected]> Cc: Radoslaw Burny <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/shmem: tmpfs fallocate use file_modified()	Hugh Dickins	1	-1/+2
	5.18 fixed the btrfs and ext4 fallocates to use file_modified(), as xfs was already doing, to drop privileges: and fstests generic/{683,684,688} expect this. There's no need to argue over keep-size allocation (which could just update ctime): fix shmem_fallocate() to behave the same way. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hugh Dickins <[email protected]> Acked-by: Christian Brauner (Microsoft) <[email protected]> Cc: "Darrick J. Wong" <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Radoslaw Burny <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/shmem: fix chattr fsflags support in tmpfs	Hugh Dickins	2	-32/+35
	ext[234] have always allowed unimplemented chattr flags to be set, but other filesystems have tended to be stricter. Follow the stricter approach for tmpfs: I don't want to have to explain why csu attributes don't actually work, and we won't need to update the chattr(1) manpage; and it's never wrong to start off strict, relaxing later if persuaded. Allow only a (append only) i (immutable) A (no atime) and d (no dump). Although lsattr showed 'A' inherited, the NOATIME behavior was not being inherited: because nothing sync'ed FS_NOATIME_FL to S_NOATIME. Add shmem_set_inode_flags() to sync the flags, using inode_set_flags() to avoid that instant of lost immutability during fileattr_set(). But that change switched generic/079 from passing to failing: because FS_IMMUTABLE_FL and FS_APPEND_FL had been unconventionally included in the INHERITED fsflags: remove them and generic/079 is back to passing. Link: https://lkml.kernel.org/r/[email protected] Fixes: e408e695f5f1 ("mm/shmem: support FS_IOC_[SG]ETFLAGS in tmpfs") Signed-off-by: Hugh Dickins <[email protected]> Cc: "Theodore Ts'o" <[email protected]> Cc: Radoslaw Burny <[email protected]> Cc: "Darrick J. Wong" <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/hugetlb: support write-faults in shared mappings	David Hildenbrand	1	-7/+19
	If we ever get a write-fault on a write-protected page in a shared mapping, we'd be in trouble (again). Instead, we can simply map the page writable.

	And in fact, there is even a way right now to trigger that code via uffd-wp ever since we started to support it for shmem in 5.19:

	--------------------------------------------------------------------------
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <errno.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>
	#include <sys/ioctl.h>
	#include <linux/userfaultfd.h>

	#define HUGETLB_SIZE (2 * 1024 * 1024u)

	static char *map;
	int uffd;

	static int temp_setup_uffd(void)
	{
		struct uffdio_api uffdio_api;
		struct uffdio_register uffdio_register;
		struct uffdio_writeprotect uffd_writeprotect;
		struct uffdio_range uffd_range;

		uffd = syscall(__NR_userfaultfd,
			       O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
		if (uffd < 0) {
			fprintf(stderr, "syscall() failed: %d\n", errno);
			return -errno;
		}

		uffdio_api.api = UFFD_API;
		uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP;
		if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) {
			fprintf(stderr, "UFFDIO_API failed: %d\n", errno);
			return -errno;
		}

		if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) {
			fprintf(stderr, "UFFD_FEATURE_WRITEPROTECT missing\n");
			return -ENOSYS;
		}

		/* Register UFFD-WP */
		uffdio_register.range.start = (unsigned long) map;
		uffdio_register.range.len = HUGETLB_SIZE;
		uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;
		if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) < 0) {
			fprintf(stderr, "UFFDIO_REGISTER failed: %d\n", errno);
			return -errno;
		}

		/* Writeprotect a single page. */
		uffd_writeprotect.range.start = (unsigned long) map;
		uffd_writeprotect.range.len = HUGETLB_SIZE;
		uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP;
		if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) {
			fprintf(stderr, "UFFDIO_WRITEPROTECT failed: %d\n", errno);
			return -errno;
		}

		/* Unregister UFFD-WP without prior writeunprotection. */
		uffd_range.start = (unsigned long) map;
		uffd_range.len = HUGETLB_SIZE;
		if (ioctl(uffd, UFFDIO_UNREGISTER, &uffd_range)) {
			fprintf(stderr, "UFFDIO_UNREGISTER failed: %d\n", errno);
			return -errno;
		}

		return 0;
	}

	int main(int argc, char **argv)
	{
		int fd;

		fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
		if (!fd) {
			fprintf(stderr, "open() failed\n");
			return -errno;
		}
		if (ftruncate(fd, HUGETLB_SIZE)) {
			fprintf(stderr, "ftruncate() failed\n");
			return -errno;
		}

		map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
		if (map == MAP_FAILED) {
			fprintf(stderr, "mmap() failed\n");
			return -errno;
		}

		*map = 0;

		if (temp_setup_uffd())
			return 1;

		*map = 0;

		return 0;
	}
	--------------------------------------------------------------------------

	Above test fails with SIGBUS when there is only a single free hugetlb page.

	  # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
	  # ./test
	  Bus error (core dumped)

	And worse, with sufficient free hugetlb pages it will map an anonymous page into a shared mapping, for example, messing up accounting during unmap and breaking MAP_SHARED semantics:

	  # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
	  # ./test
	  # cat /proc/meminfo | grep HugePages_
	  HugePages_Total: 2
	  HugePages_Free: 1
	  HugePages_Rsvd: 18446744073709551615
	  HugePages_Surp: 0

	Reason is that uffd-wp doesn't clear the uffd-wp PTE bit when unregistering and consequently keeps the PTE writeprotected. Reason for this is to avoid the additional overhead when unregistering. Note that this is the case also for !hugetlb and that we will end up with writable PTEs that still have the uffd-wp PTE bit set once we return from hugetlb_wp(). I'm not touching the uffd-wp PTE bit for now, because it seems to be a generic thing -- wp_page_reuse() also doesn't clear it.

	VM_MAYSHARE handling in hugetlb_fault() for FAULT_FLAG_WRITE indicates that MAP_SHARED handling was at least envisioned, but could never have worked as expected.

	While at it, make sure that we never end up in hugetlb_wp() on write faults without VM_WRITE, because we don't support maybe_mkwrite() semantics as commonly used in the !hugetlb case -- for example, in wp_page_reuse().

	Note that there is no need to do any kind of reservation in hugetlb_fault() in this case ... because we already have a hugetlb page mapped R/O that we will simply map writable and we are not dealing with COW/unsharing.

	Link: https://lkml.kernel.org/r/[email protected] Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs") Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jamie Liu <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Muchun Song <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Pavel Emelyanov <[email protected]> Cc: Peter Feiner <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> [5.19] Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/hugetlb: fix hugetlb not supporting softdirty tracking	David Hildenbrand	1	-2/+5
	Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2.

	I observed that hugetlb does not support/expect write-faults in shared mappings that would have to map the R/O-mapped page writable -- and I found two cases where we could currently get such faults and would erroneously map an anon page into a shared mapping.

	Reproducers part of the patches.

	I propose to backport both fixes to stable trees. The first fix needs a small adjustment.

	This patch (of 2):

	Staring at hugetlb_wp(), one might wonder where all the logic for shared mappings is when stumbling over a write-protected page in a shared mapping. In fact, there is none, and so far we thought we could get away with that because e.g., mprotect() should always do the right thing and map all pages directly writable.

	Looks like we were wrong:

	--------------------------------------------------------------------------
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <errno.h>
	#include <sys/mman.h>

	#define HUGETLB_SIZE (2 * 1024 * 1024u)

	static void clear_softdirty(void)
	{
		int fd = open("/proc/self/clear_refs", O_WRONLY);
		const char *ctrl = "4";
		int ret;

		if (fd < 0) {
			fprintf(stderr, "open(clear_refs) failed\n");
			exit(1);
		}
		ret = write(fd, ctrl, strlen(ctrl));
		if (ret != strlen(ctrl)) {
			fprintf(stderr, "write(clear_refs) failed\n");
			exit(1);
		}
		close(fd);
	}

	int main(int argc, char **argv)
	{
		char *map;
		int fd;

		fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT);
		if (!fd) {
			fprintf(stderr, "open() failed\n");
			return -errno;
		}
		if (ftruncate(fd, HUGETLB_SIZE)) {
			fprintf(stderr, "ftruncate() failed\n");
			return -errno;
		}

		map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
		if (map == MAP_FAILED) {
			fprintf(stderr, "mmap() failed\n");
			return -errno;
		}

		*map = 0;

		if (mprotect(map, HUGETLB_SIZE, PROT_READ)) {
			fprintf(stderr, "mmprotect() failed\n");
			return -errno;
		}

		clear_softdirty();

		if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) {
			fprintf(stderr, "mmprotect() failed\n");
			return -errno;
		}

		*map = 0;

		return 0;
	}
	--------------------------------------------------------------------------

	Above test fails with SIGBUS when there is only a single free hugetlb page.

	  # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
	  # ./test
	  Bus error (core dumped)

	And worse, with sufficient free hugetlb pages it will map an anonymous page into a shared mapping, for example, messing up accounting during unmap and breaking MAP_SHARED semantics:

	  # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
	  # ./test
	  # cat /proc/meminfo | grep HugePages_
	  HugePages_Total: 2
	  HugePages_Free: 1
	  HugePages_Rsvd: 18446744073709551615
	  HugePages_Surp: 0

	Reason in this particular case is that vma_wants_writenotify() will return "true", removing VM_SHARED in vma_set_page_prot() to map pages write-protected. Let's teach vma_wants_writenotify() that hugetlb does not support softdirty tracking.

	Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared") Signed-off-by: David Hildenbrand <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Peter Feiner <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Pavel Emelyanov <[email protected]> Cc: Jamie Liu <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Muchun Song <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> [3.18+] Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/uffd: reset write protection when unregister with wp-mode	Peter Xu	3	-11/+24
	The motivation of this patch comes from a recent report and patchfix from David Hildenbrand on hugetlb shared handling of wr-protected page [1]. With the reproducer provided in commit message of [1], one can leverage the uffd-wp lazy-reset of ptes to trigger a hugetlb issue which can affect not only the attacker process, but also the whole system. The lazy-reset mechanism of uffd-wp was used to make unregister faster, meanwhile it has an assumption that any leftover pgtable entries should only affect the process on its own, so not only the user should be aware of anything it does, but also it should not affect outside of the process. But it seems that this is not true, and it can also be utilized to make some exploit easier. So far there's no clue showing that the lazy-reset is important to any userfaultfd users because normally the unregister will only happen once for a specific range of memory of the lifecycle of the process. Considering all above, what this patch proposes is to do explicit pte resets when unregister an uffd region with wr-protect mode enabled. It should be the same as calling ioctl(UFFDIO_WRITEPROTECT, wp=false) right before ioctl(UFFDIO_UNREGISTER) for the user. So potentially it'll make the unregister slower. From that pov it's a very slight abi change, but hopefully nothing should break with this change either. Regarding to the change itself - core of uffd write [un]protect operation is moved into a separate function (uffd_wp_range()) and it is reused in the unregister code path. Note that the new function will not check for anything, e.g. ranges or memory types, because they should have been checked during the previous UFFDIO_REGISTER or it should have failed already. It also doesn't check mmap_changing because we're with mmap write lock held anyway. I added a Fixes upon introducing of uffd-wp shmem+hugetlbfs because that's the only issue reported so far and that's the commit David's reproducer will start working (v5.19+). 
But the whole idea actually applies to not only file memories but also anonymous. It's just that we don't need to fix anonymous prior to v5.19- because there's no known way to exploit. IOW, this patch can also fix the issue reported in [1] as the patch 2 does. [1] https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: b1f9e876862d ("mm/uffd: enable write protection for shmem & hugetlbfs") Signed-off-by: Peter Xu <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/smaps: don't access young/dirty bit if pte unpresent	Peter Xu	1	-3/+4
	These bits should only be valid when the ptes are present. Introduce two booleans for them and set them to false when !pte_present(), for both pte and pmd accounting. The bug was found during code reading and no real-world issue has been reported, but logically such an error can cause incorrect readings for either smaps or smaps_rollup output on quite a few fields. For example, it could cause over-estimates of values like Shared_Dirty, Private_Dirty, Referenced. Or it could also cause under-estimates of values like LazyFree, Shared_Clean, Private_Clean. Link: https://lkml.kernel.org/r/[email protected] Fixes: b1d4d9e0cbd0 ("proc/smaps: carefully handle migration entries") Fixes: c94b6923fa0a ("/proc/PID/smaps: Add PMD migration entry parsing") Signed-off-by: Peter Xu <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Huang Ying <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
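The idea in this message, reading the young and dirty bits only for present entries, can be sketched with a toy accounting function. The types `toy_pte` and `toy_stats` and the function `toy_account()` are hypothetical stand-ins for illustration and are not the kernel's smaps code.

```c
#include <stdbool.h>

/* Hypothetical page-table entry: bits are meaningful only if present. */
struct toy_pte {
	bool present;
	bool young;   /* may hold garbage when !present */
	bool dirty;   /* may hold garbage when !present */
};

/* Hypothetical subset of the smaps counters. */
struct toy_stats {
	unsigned long referenced;
	unsigned long dirty;
	unsigned long clean;
};

static void toy_account(struct toy_stats *s, const struct toy_pte *pte,
			unsigned long size)
{
	/* Default to false; trust the bits only for present entries. */
	bool young = false, dirty = false;

	if (pte->present) {
		young = pte->young;
		dirty = pte->dirty;
	}

	if (young)
		s->referenced += size;
	if (dirty)
		s->dirty += size;
	else
		s->clean += size;
}
```

A non-present entry whose stale bits happen to be set is then counted as clean and unreferenced, instead of inflating the dirty and referenced totals.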
2022-08-20	mm: add DEVICE_ZONE to FOR_ALL_ZONES	Hao Lee	2	-5/+19
	FOR_ALL_ZONES should be consistent with enum zone_type. Otherwise, __count_zid_vm_events has the potential to add the count to the wrong item when zid is ZONE_DEVICE. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hao Lee <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	kernel/sys_ni: add compat entry for fadvise64_64	Randy Dunlap	1	-0/+1
	When CONFIG_ADVISE_SYSCALLS is not set/enabled and CONFIG_COMPAT is set/enabled, the riscv compat_syscall_table references 'compat_sys_fadvise64_64', which is not defined: riscv64-linux-ld: arch/riscv/kernel/compat_syscall_table.o:(.rodata+0x6f8): undefined reference to `compat_sys_fadvise64_64' Add 'fadvise64_64' to kernel/sys_ni.c as a conditional COMPAT function so that when CONFIG_ADVISE_SYSCALLS is not set, there is a fallback function available. Link: https://lkml.kernel.org/r/[email protected] Fixes: d3ac21cacc24 ("mm: Support compiling out madvise and fadvise") Signed-off-by: Randy Dunlap <[email protected]> Suggested-by: Arnd Bergmann <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW	David Hildenbrand	3	-44/+89
	Ever since the Dirty COW (CVE-2016-5195) security issue happened, we know that FOLL_FORCE can be possibly dangerous, especially if there are races that can be exploited by user space. Right now, it would be sufficient to have some code that sets a PTE of a R/O-mapped shared page dirty, in order for it to erroneously become writable by FOLL_FORCE. The implications of setting a write-protected PTE dirty might not be immediately obvious to everyone. And in fact ever since commit 9ae0f87d009c ("mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte"), we can use UFFDIO_CONTINUE to map a shmem page R/O while marking the pte dirty. This can be used by unprivileged user space to modify tmpfs/shmem file content even if the user does not have write permissions to the file, and to bypass memfd write sealing -- Dirty COW restricted to tmpfs/shmem (CVE-2022-2590). To fix such security issues for good, the insight is that we really only need that fancy retry logic (FOLL_COW) for COW mappings that are not writable (!VM_WRITE). And in a COW mapping, we really only broke COW if we have an exclusive anonymous page mapped. If we have something else mapped, or the mapped anonymous page might be shared (!PageAnonExclusive), we have to trigger a write fault to break COW. If we don't find an exclusive anonymous page when we retry, we have to trigger COW breaking once again because something intervened. Let's move away from this mandatory-retry + dirty handling and rely on our PageAnonExclusive() flag for making a similar decision, to use the same COW logic as in other kernel parts here as well. In case we stumble over a PTE in a COW mapping that does not map an exclusive anonymous page, COW was not properly broken and we have to trigger a fake write-fault to break COW. 
Just like we do in can_change_pte_writable() added via commit 64fe24a3e05e ("mm/mprotect: try avoiding write faults for exclusive anonymous pages when changing protection") and commit 76aefad628aa ("mm/mprotect: fix soft-dirty check in can_change_pte_writable()"), take care of softdirty and uffd-wp manually. For example, a write() via /proc/self/mem to a uffd-wp-protected range has to fail instead of silently granting write access and bypassing the userspace fault handler. Note that FOLL_FORCE is not only used for debug access, but also triggered by applications without debug intentions, for example, when pinning pages via RDMA. This fixes CVE-2022-2590. Note that only x86_64 and aarch64 are affected, because only those support CONFIG_HAVE_ARCH_USERFAULTFD_MINOR. Fortunately, FOLL_COW is no longer required to handle FOLL_FORCE. So let's just get rid of it. Thanks to Nadav Amit for pointing out that the pte_dirty() check in FOLL_FORCE code is problematic and might be exploitable. Note 1: We don't check for the PTE being dirty because it doesn't matter for making a "was COWed" decision anymore, and whoever modifies the page has to set the page dirty either way. Note 2: Kernels before extended uffd-wp support and before PageAnonExclusive (< 5.19) can simply revert the problematic commit instead and be safe regarding UFFDIO_CONTINUE. A backport to v5.19 requires minor adjustments due to lack of vma_soft_dirty_enabled(). 
Link: https://lkml.kernel.org/r/[email protected] Fixes: 9ae0f87d009c ("mm/shmem: unconditionally set pte dirty in mfill_atomic_install_pte") Signed-off-by: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Peter Xu <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: David Laight <[email protected]> Cc: <[email protected]> [5.16] Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	Revert "zram: remove double compression logic"	Jiri Slaby	2	-10/+33
	This reverts commit e7be8d1dd983156b ("zram: remove double compression logic") as it causes zram failures. It does not revert cleanly, PTR_ERR handling was introduced in the meantime. This is handled by appropriate IS_ERR. When under memory pressure, zs_malloc() can fail. Before the above commit, the allocation was retried with direct reclaim enabled (GFP_NOIO). After the commit, it is not -- only __GFP_KSWAPD_RECLAIM is tried. So when the failure occurs under memory pressure, the overlaying filesystem such as ext2 (mounted by ext4 module in this case) can emit failures, making the (file)system unusable: EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to inode 16386 starting block 159744) Buffer I/O error on device zram0, logical block 159744 With direct reclaim, memory is really reclaimed and allocation succeeds, eventually. In the worst case, the oom killer is invoked, which is proper outcome if user sets up zram too large (in comparison to available RAM). This very diff doesn't apply to 5.19 (stable) cleanly (see PTR_ERR note above). Use revert of e7be8d1dd983 directly. Link: https://bugzilla.suse.com/show_bug.cgi?id=1202203 Link: https://lkml.kernel.org/r/[email protected] Fixes: e7be8d1dd983 ("zram: remove double compression logic") Signed-off-by: Jiri Slaby <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Alexey Romanov <[email protected]> Cc: Dmitry Rokosov <[email protected]> Cc: Lukas Czerner <[email protected]> Cc: <[email protected]> [5.19] Signed-off-by: Andrew Morton <[email protected]>
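The allocation behavior restored by this revert can be sketched as a two-step attempt: a cheap try that may not reclaim directly, then a retry that may. This is a toy model only. `toy_alloc_fn`, `toy_alloc_with_retry()`, and `toy_alloc_under_pressure()` are hypothetical names, and the boolean parameter merely stands in for the GFP flag difference described in the message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocator signature: the flag models "direct reclaim allowed". */
typedef void *(*toy_alloc_fn)(size_t size, bool may_direct_reclaim);

/* Fast attempt first; fall back to the slower, reclaim-capable attempt. */
static void *toy_alloc_with_retry(toy_alloc_fn alloc, size_t size)
{
	void *p = alloc(size, false);

	if (!p)
		p = alloc(size, true);
	return p;
}

/* Demo allocator that models memory pressure: only reclaim helps. */
static void *toy_alloc_under_pressure(size_t size, bool may_direct_reclaim)
{
	static char backing[256];

	if (!may_direct_reclaim || size > sizeof(backing))
		return NULL;
	return backing;
}
```

Without the second attempt, the caller would see a failure under pressure, which corresponds to the I/O errors quoted in the message; with it, the allocation succeeds eventually.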
2022-08-20	get_maintainer: add Alan to .get_maintainer.ignore	Dan Carpenter	1	-0/+2
	Alan asked to be added to the .get_maintainer.ignore list. Link: https://lkml.kernel.org/r/YvN30KhO9aD5Sza9@kili Signed-off-by: Dan Carpenter <[email protected]> Cc: Alan Cox <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-20	Merge tag 'kbuild-fixes-v6.0' of ↵	Linus Torvalds	5	-9/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix module versioning broken on some architectures - Make dummy-tools enable CONFIG_PPC_LONG_DOUBLE_128 - Remove -Wformat-zero-length, which has no warning instance - Fix the order between drivers and libs in modules.order - Fix false-positive warnings in clang-analyzer * tag 'kbuild-fixes-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: scripts/clang-tools: Remove DeprecatedOrUnsafeBufferHandling check kbuild: fix the modules order between drivers and libs scripts/Makefile.extrawarn: Do not disable clang's -Wformat-zero-length kbuild: dummy-tools: pretend we understand __LONG_DOUBLE_128__ modpost: fix module versioning when a symbol lacks valid CRC
2022-08-20	Merge tag 'perf-tools-fixes-for-v6.0-2022-08-19' of ↵	Linus Torvalds	27	-235/+976
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix alignment for cpu map masks in event encoding. - Support reading PERF_FORMAT_LOST, perf tool counterpart for a feature that was added in this merge window. - Sync perf tools copies of kernel headers: socket, msr-index, fscrypt, cpufeatures, i915_drm, kvm, vhost, perf_event. * tag 'perf-tools-fixes-for-v6.0-2022-08-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Support reading PERF_FORMAT_LOST libperf: Add a test case for read formats libperf: Handle read format in perf_evsel__read() tools headers UAPI: Sync linux/perf_event.h with the kernel sources tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources tools headers UAPI: Sync KVM's vmx.h header with the kernel sources tools include UAPI: Sync linux/vhost.h with the kernel sources tools headers kvm s390: Sync headers with the kernel sources tools headers UAPI: Sync linux/kvm.h with the kernel sources tools headers UAPI: Sync drm/i915_drm.h with the kernel sources tools headers cpufeatures: Sync with the kernel sources tools headers UAPI: Sync linux/fscrypt.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources perf beauty: Update copy of linux/socket.h with the kernel sources perf cpumap: Fix alignment for masks in event encoding perf cpumap: Compute mask size in constant time perf cpumap: Synthetic events and const/static perf cpumap: Const map for max()
2022-08-20	Merge tag 's390-6.0-1' of ↵	Linus Torvalds	4	-2/+9
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Fix a KVM crash on z12 and older machines caused by a wrong assumption that Query AP Configuration Information is always available. - Lower severity of excessive Hypervisor filesystem error messages when booting under KVM. * tag 's390-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ap: fix crash on older machines based on QCI info missing s390/hypfs: avoid error message under KVM
2022-08-20	Merge tag 'powerpc-6.0-3' of ↵	Linus Torvalds	3	-10/+44
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix atomic sleep warnings at boot due to get_phb_number() taking a mutex with a spinlock held on some machines. - Add missing PMU selftests to .gitignores. Thanks to Guenter Roeck and Russell Currey. * tag 'powerpc-6.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Add missing PMU selftests to .gitignores powerpc/pci: Fix get_phb_number() locking
2022-08-20	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma	Linus Torvalds	7	-40/+42
	Pull rdma fixes from Jason Gunthorpe: "A few minor fixes: - Fix buffer management in SRP to correct a regression with the login authentication feature from v5.17 - Don't iterate over non-present ports in mlx5 - Fix an error introduced by the fortify work in cxgb4 - Two bug fixes for the recently merged ERDMA driver - Unbreak RDMA dmabuf support, a regression from v5.19" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA: Handle the return code from dma_resv_wait_timeout() properly RDMA/erdma: Correct the max_qp and max_cq capacities of the device RDMA/erdma: Using the key in FMR WR instead of MR structure RDMA/cxgb4: fix accept failure due to increased cpl_t5_pass_accept_rpl size RDMA/mlx5: Use the proper number of ports IB/iser: Fix login with authentication
2022-08-21	scripts/clang-tools: Remove DeprecatedOrUnsafeBufferHandling check	Guru Das Srinagesh	1	-0/+1
	This `clang-analyzer` check flags the use of memset(), suggesting a more secure version of the API, such as memset_s(), which does not exist in the kernel: warning: Call to function 'memset' is insecure as it does not provide security checks introduced in the C11 standard. Replace with analogous functions that support length arguments or provides boundary checks such as 'memset_s' in case of C11 [clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling] Signed-off-by: Guru Das Srinagesh <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2022-08-21	kbuild: fix the modules order between drivers and libs	Masahiro Yamada	1	-4/+2
	Commit b2c885549122 ("kbuild: update modules.order only when contained modules are updated") accidentally changed the modules order. Prior to that commit, the modules order was determined based on vmlinux-dirs, which lists core-y/m, drivers-y/m, libs-y/m, in this order. Now, subdir-modorder lists them in a different order: core-y/m, libs-y/m, drivers-y/m. Presumably, there was no practical issue because the modules in drivers and libs are orthogonal, but there is no reason to have this distortion. Get back to the original order. Fixes: b2c885549122 ("kbuild: update modules.order only when contained modules are updated") Signed-off-by: Masahiro Yamada <[email protected]>
2022-08-21	scripts/Makefile.extrawarn: Do not disable clang's -Wformat-zero-length	Nathan Chancellor	1	-1/+0
	There are no instances of this warning in the tree across several different architectures and configurations. This was added by commit 26ea6bb1fef0 ("kbuild, LLVMLinux: Supress warnings unless W=1-3") back in 2014, where it might have been necessary, but there are no instances of it now so stop disabling it to increase warning coverage for clang. Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2022-08-21	kbuild: dummy-tools: pretend we understand __LONG_DOUBLE_128__	Jiri Slaby	1	-1/+1
	There is a test in powerpc's Kconfig which checks __LONG_DOUBLE_128__ and sets CONFIG_PPC_LONG_DOUBLE_128 if it is understood by the compiler. We currently don't handle it, so this results in PPC_LONG_DOUBLE_128 not being in super-config generated by dummy-tools. So take this into account in the gcc script and preprocess __LONG_DOUBLE_128__ as "1". Signed-off-by: Jiri Slaby <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2022-08-21	modpost: fix module versioning when a symbol lacks valid CRC	Masahiro Yamada	1	-3/+1
	Since commit 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS"), module versioning is broken on some architectures. Loading a module fails with "disagrees about version of symbol module_layout". On such architectures (e.g. ARCH=sparc build with sparc64_defconfig), modpost shows a warning, like follows: WARNING: modpost: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned. Is "_mcount" prototyped in <asm/asm-prototypes.h>? Previously, it was a harmless warning (CRC check was just skipped), but now wrong CRCs are used for comparison because invalid CRCs are just skipped. $ sparc64-linux-gnu-nm -n vmlinux [snip] 0000000000c2cea0 r __ksymtab__kstrtol 0000000000c2ceb8 r __ksymtab__kstrtoul 0000000000c2ced0 r __ksymtab__local_bh_enable 0000000000c2cee8 r __ksymtab__mcount 0000000000c2cf00 r __ksymtab__printk 0000000000c2cf18 r __ksymtab__raw_read_lock 0000000000c2cf30 r __ksymtab__raw_read_lock_bh [snip] 0000000000c53b34 D __crc__kstrtol 0000000000c53b38 D __crc__kstrtoul 0000000000c53b3c D __crc__local_bh_enable 0000000000c53b40 D __crc__printk 0000000000c53b44 D __crc__raw_read_lock 0000000000c53b48 D __crc__raw_read_lock_bh Please notice __crc__mcount is missing here. When the module subsystem looks up a CRC that comes after, it results in reading out a wrong address. For example, when __crc__printk is needed, the module subsystem reads 0xc53b44 instead of 0xc53b40. All CRC entries must be output for correct index accessing. Invalid CRCs will be unused, but are needed to keep the one-to-one mapping between __ksymtab_* and __crc_*. The best is to fix all modpost warnings, but several warnings are still remaining on less popular architectures. Fixes: 7b4537199a4a ("kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS") Reported-by: matoro <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Tested-by: matoro <[email protected]>
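The index mismatch described above can be reproduced with a small model. This is a sketch under stated assumptions: the symbol names and addresses are taken from the nm excerpt in the message, while `crc_layout`, `lookup_by_index`, the 4-byte entry size and the base address of the excerpt are illustrative choices, not modpost or module-loader code.

```python
CRC_SIZE = 4     # assumed size of one CRC entry, matching the 4-byte address steps in the excerpt
BASE = 0xC53B34  # address of the first CRC entry in the excerpt

# Symbols in __ksymtab order, as listed in the excerpt.
SYMBOLS = ["_kstrtol", "_kstrtoul", "_local_bh_enable", "_mcount",
           "_printk", "_raw_read_lock", "_raw_read_lock_bh"]


def crc_layout(symbols, invalid, skip_invalid):
    """Return {symbol: address} for the emitted CRC entries."""
    layout = {}
    addr = BASE
    for sym in symbols:
        if skip_invalid and sym in invalid:
            continue  # broken behaviour: no entry is emitted for this symbol
        layout[sym] = addr
        addr += CRC_SIZE
    return layout


def lookup_by_index(symbols, sym):
    """Address read by a consumer that assumes a one-to-one mapping with __ksymtab."""
    return BASE + symbols.index(sym) * CRC_SIZE


broken = crc_layout(SYMBOLS, invalid={"_mcount"}, skip_invalid=True)
fixed = crc_layout(SYMBOLS, invalid={"_mcount"}, skip_invalid=False)
print(hex(broken["_printk"]), hex(lookup_by_index(SYMBOLS, "_printk")))
```

With the `_mcount` entry skipped, the CRC for `_printk` sits at 0xc53b40 while the index-based lookup reads 0xc53b44. Emitting a placeholder entry for every symbol keeps both views aligned.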
2022-08-20	Merge tag 'block-6.0-2022-08-19' of git://git.kernel.dk/linux-block	Linus Torvalds	3	-28/+28
	Pull block fixes from Jens Axboe: "A few fixes that should go into this release: - Small series of patches for ublk (ZiyangZhang) - Remove dead function (Yu) - Fix for running a block queue in case of resource starvation (Yufen)" * tag 'block-6.0-2022-08-19' of git://git.kernel.dk/linux-block: blk-mq: run queue no matter whether the request is the last request blk-mq: remove unused function blk_mq_queue_stopped() ublk_drv: do not add a re-issued request aborted previously to ioucmd's task_work ublk_drv: update comment for __ublk_fail_req() ublk_drv: check ubq_daemon_is_dying() in __ublk_rq_task_work() ublk_drv: update iod->addr for UBLK_IO_NEED_GET_DATA
2022-08-20	Merge tag 'io_uring-6.0-2022-08-19' of git://git.kernel.dk/linux-block	Linus Torvalds	2	-12/+12
	Pull io_uring fixes from Jens Axboe: "A few fixes for regressions in this cycle: - Two instances of using the wrong "has async data" helper (Pavel) - Fixup zero-copy address import (Pavel) - Bump zero-copy notification slot limit (Pavel)" * tag 'io_uring-6.0-2022-08-19' of git://git.kernel.dk/linux-block: io_uring/net: use right helpers for async_data io_uring/notif: raise limit on notification slots io_uring/net: improve zc addr import error handling io_uring/net: use right helpers for async recycle
2022-08-20	Merge tag 'ata-6.0-rc2' of ↵	Linus Torvalds	2	-1/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ATA fixes from Damien Le Moal: - Add a missing command name definition for ata_get_cmd_name(), from me. - A fix to address a performance regression due to the default max_sectors queue limit for ATA devices connected to AHCI adapters being too small, from John. * tag 'ata-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: libata: Set __ATA_BASE_SHT max_sectors ata: libata-eh: Add missing command name
2022-08-20	Merge tag 'mmc-v6.0-rc1' of ↵	Linus Torvalds	4	-6/+26
	git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - meson-gx: Fix error handling in ->probe() - mtk-sd: Fix a command problem when using cqe off/disable - pxamci: Fix error handling in ->probe() - sdhci-of-dwcmshc: Fix broken support for the BlueField-3 variant * tag 'mmc-v6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-of-dwcmshc: Re-enable support for the BlueField-3 SoC mmc: meson-gx: Fix an error handling path in meson_mmc_probe() mmc: mtk-sd: Clear interrupts when cqe off/disable mmc: pxamci: Fix another error handling path in pxamci_probe() mmc: pxamci: Fix an error handling path in pxamci_probe()
2022-08-21	ata: libata: Set __ATA_BASE_SHT max_sectors	John Garry	1	-1/+2
	Commit 0568e6122574 ("ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors") inadvertently capped the max_sectors value for some SATA disks to a value which is lower than we would want. For a device which supports LBA48, we would previously have request queue max_sectors_kb and max_hw_sectors_kb values of 1280 and 32767 respectively. For AHCI controllers, the value chosen for shost max sectors comes from the minimum of the SCSI host default max sectors in SCSI_DEFAULT_MAX_SECTORS (1024) and the shost DMA device mapping limit. This means that we would now set the max_sectors_kb and max_hw_sectors_kb values for a disk which supports LBA48 at 512, ignoring DMA mapping limit. As reported by Oliver at [0], this caused a performance regression. Fix by picking a large enough max sectors value for ATA host controllers such that we don't needlessly reduce max_sectors_kb for LBA48 disks. [0] https://lore.kernel.org/linux-ide/YvsGbidf3na5FpGb@xsang-OptiPlex-9020/T/#m22d9fc5ad15af66066dd9fecf3d50f1b1ef11da3 Fixes: 0568e6122574 ("ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors") Reported-by: Oliver Sang <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Damien Le Moal <[email protected]>
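The numbers in the message above follow from simple arithmetic on 512-byte sectors. The sketch below is illustrative only: the function names are hypothetical, the device limit of 65535 sectors is an assumption inferred from the 32767 KB figure, and the DMA limit is an arbitrary large value chosen so that it does not constrain the result.

```python
SECTOR_BYTES = 512
SCSI_DEFAULT_MAX_SECTORS = 1024   # value quoted in the commit message
ASSUMED_LBA48_MAX_SECTORS = 65535  # assumption: device limit implied by 32767 KB
ASSUMED_DMA_LIMIT = 1 << 20        # assumption: DMA mapping limit large enough not to matter


def sectors_to_kb(sectors):
    """Convert a sector count to KiB, rounding down as the sysfs values do."""
    return sectors * SECTOR_BYTES // 1024


def shost_max_sectors(host_template_limit, dma_limit):
    """Host limit: minimum of the host template value and the DMA mapping limit."""
    return min(host_template_limit, dma_limit)


def device_max_hw_kb(device_limit, host_limit):
    """Device limit after capping it to the host limit, in KiB."""
    return sectors_to_kb(min(device_limit, host_limit))


# Regression: the host template falls back to the SCSI default.
capped = device_max_hw_kb(
    ASSUMED_LBA48_MAX_SECTORS,
    shost_max_sectors(SCSI_DEFAULT_MAX_SECTORS, ASSUMED_DMA_LIMIT))

# Fix: the host template advertises a value large enough for LBA48 devices.
restored = device_max_hw_kb(
    ASSUMED_LBA48_MAX_SECTORS,
    shost_max_sectors(ASSUMED_LBA48_MAX_SECTORS, ASSUMED_DMA_LIMIT))

print(capped, restored)
```

Under these assumptions the capped case yields 512 KB and the restored case 32767 KB, matching the values quoted in the message.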
2022-08-19	SUNRPC: RPC level errors should set task->tk_rpc_status	Trond Myklebust	1	-1/+1
	Fix up a case in call_encode() where we're failing to set task->tk_rpc_status when an RPC level error occurred. Fixes: 9c5948c24869 ("SUNRPC: task should be exit if encode return EKEYEXPIRED more times") Signed-off-by: Trond Myklebust <[email protected]>
2022-08-19	NFSv4.2 fix problems with __nfs42_ssc_open	Olga Kornievskaia	1	-0/+6
	A destination server while doing a COPY shouldn't accept using the passed in filehandle if it's not a regular filehandle. If alloc_file_pseudo() has failed, we need to decrement a reference on the newly created inode, otherwise it leaks. Reported-by: Al Viro <[email protected]> Fixes: ec4b092508982 ("NFS: inter ssc open") Signed-off-by: Olga Kornievskaia <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2022-08-19	NFS: unlink/rmdir shouldn't call d_delete() twice on ENOENT	NeilBrown	1	-1/+2
	nfs_unlink() calls d_delete() twice if it receives ENOENT from the server - once in nfs_dentry_handle_enoent() from nfs_safe_remove and once in nfs_dentry_remove_handle_error(). nfs_rmdir() also calls it twice - the nfs_dentry_handle_enoent() call is direct and inside a region locked with ->rmdir_sem. It is safe to call d_delete() twice if the refcount > 1 as the dentry is simply unhashed. If the refcount is 1, the first call sets d_inode to NULL and the second call crashes. This patch guards the d_delete() call from nfs_dentry_handle_enoent() leaving the one under ->rmdir_sem in case that is important. In mainline it would be safe to remove the d_delete() call. However in older kernels to which this might be backported, that would change the behaviour of nfs_unlink(). nfs_unlink() used to unhash the dentry which resulted in nfs_dentry_handle_enoent() not calling d_delete(). So in older kernels we need the d_delete() in nfs_dentry_remove_handle_error() when called from nfs_unlink() but not when called from nfs_rmdir(). To make the code work correctly for old and new kernels, and from both nfs_unlink() and nfs_rmdir(), we protect the d_delete() call with simple_positive(). This ensures it is never called in a circumstance where it could crash. Fixes: 3c59366c207e ("NFS: don't unhash dentry during unlink/rename") Fixes: 9019fb391de0 ("NFS: Label the dentry with a verifier in nfs_rmdir() and nfs_unlink()") Signed-off-by: NeilBrown <[email protected]> Tested-by: Olga Kornievskaia <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
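The guard can be illustrated with a toy model of the dentry state. This is a simplified sketch, not the VFS implementation: the `Dentry` class is hypothetical, and the crash on a second `d_delete()` call is modelled as a Python exception.

```python
class Dentry:
    """Toy dentry (hypothetical): tracks only refcount, inode and hash state."""

    def __init__(self, refcount):
        self.refcount = refcount
        self.inode = object()  # a positive dentry has an inode
        self.hashed = True


def d_delete(dentry):
    """Model of the behaviour described in the commit message."""
    if dentry.inode is None:
        # Stands in for the crash on the second call with refcount == 1.
        raise RuntimeError("d_delete() on a dentry without an inode")
    if dentry.refcount == 1:
        dentry.inode = None    # the dentry becomes negative
    else:
        dentry.hashed = False  # the dentry is simply unhashed


def simple_positive(dentry):
    """True only for a hashed dentry that still has an inode."""
    return dentry.inode is not None and dentry.hashed


def handle_enoent(dentry):
    """Guarded variant: skip d_delete() when it was already done."""
    if simple_positive(dentry):
        d_delete(dentry)
```

Calling `handle_enoent()` twice on the same dentry is harmless in this model for any refcount, whereas two bare `d_delete()` calls on a dentry with refcount 1 raise.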
2022-08-19	selftests/vm: fix inability to build any vm tests	Axel Rasmussen	1	-0/+1
	When we stopped using KSFT_KHDR_INSTALL, a side effect is we also changed the value of `top_srcdir`. This can be seen by looking at the code removed by commit 49de12ba06ef ("selftests: drop KSFT_KHDR_INSTALL make target"). (Note though that this commit didn't break this, technically the one before it did since that's the one that stopped KSFT_KHDR_INSTALL from being used, even though the code was still there.) Previously lib.mk reconfigured `top_srcdir` when KSFT_KHDR_INSTALL was being used. Now, that's no longer the case. As a result, the path to gup_test.h in vm/Makefile was wrong, and since it's a dependency of all of the vm binaries none of them could be built. Instead, we'd get an "error" like: make[1]: *** No rule to make target '/[...]/tools/testing/selftests/vm/compaction_test', needed by 'all'. Stop. So, modify lib.mk so it once again sets top_srcdir to the root of the kernel tree. Fixes: f2745dc0ba3d ("selftests: stop using KSFT_KHDR_INSTALL") Signed-off-by: Axel Rasmussen <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2022-08-19	Revert "net: macsec: update SCI upon MAC address change."	Sabrina Dubroca	1	-6/+5
	This reverts commit 6fc498bc82929ee23aa2f35a828c6178dfd3f823. Commit 6fc498bc8292 states: SCI should be updated, because it contains MAC in its first 6 octets. That's not entirely correct. The SCI can be based on the MAC address, but doesn't have to be. We can also use any 64-bit number as the SCI. When the SCI is based on the MAC address, it uses a 16-bit "port number" provided by userspace, which commit 6fc498bc8292 overwrites with 1. In addition, changing the SCI after macsec has been set up can just confuse the receiver. If we configure the RXSC on the peer based on the original SCI, we should keep the same SCI on TX. When the macsec device is being managed by a userspace key negotiation daemon such as wpa_supplicant, commit 6fc498bc8292 would also overwrite the SCI defined by userspace. Fixes: 6fc498bc8292 ("net: macsec: update SCI upon MAC address change.") Signed-off-by: Sabrina Dubroca <[email protected]> Link: https://lore.kernel.org/r/9b1a9d28327e7eb54550a92eebda45d25e54dd0d.1660667033.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <[email protected]>
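The relationship between MAC address, port number and SCI can be sketched as follows. `make_sci` here is an illustrative Python helper, not the kernel function; the big-endian encoding of the port and the example addresses are assumptions made for the sketch.

```python
def make_sci(mac: bytes, port: int) -> bytes:
    """Build a 64-bit SCI from a 6-octet MAC and a 16-bit port number.

    Assumption for this sketch: the port is appended in network byte order.
    """
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 octets")
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    return mac + port.to_bytes(2, "big")


# Userspace configured port 5 on the original MAC (example values).
old_mac = bytes.fromhex("020000000001")
configured = make_sci(old_mac, 5)

# What the reverted commit effectively did on a MAC change:
# rebuild the SCI from the new MAC and force the port to 1.
new_mac = bytes.fromhex("020000000002")
rewritten = make_sci(new_mac, 1)

print(configured.hex(), rewritten.hex())
```

A peer whose RXSC was configured with the first value would no longer match the second, and the userspace-chosen port is lost. An SCI that was never derived from the MAC address would be overwritten in the same way.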
2022-08-19	net: dpaa: Fix <1G ethernet on LS1046ARDB	Sean Anderson	1	-1/+5
	As discussed in commit 73a21fa817f0 ("dpaa_eth: support all modes with rate adapting PHYs"), we must add a workaround for Aquantia phys with in-tree support in order to keep 1G support working. Update this workaround for the AQR113C phy found on revision C LS1046ARDB boards. Fixes: 12cf1b89a668 ("net: phy: Add support for AQR113C EPHY") Signed-off-by: Sean Anderson <[email protected]> Acked-by: Camelia Groza <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-08-19	Merge tag 'execve-v6.0-rc2' of ↵	Linus Torvalds	1	-7/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve fix from Kees Cook: - Replace remaining kmap() uses with kmap_local_page() (Fabio M. De Francesco) * tag 'execve-v6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: exec: Replace kmap{,_atomic}() with kmap_local_page()
2022-08-19	Merge tag 'hardening-v6.0-rc2' of ↵	Linus Torvalds	2	-5/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Also undef LATENT_ENTROPY_PLUGIN for per-file disabling (Andrew Donnellan) - Return EFAULT on copy_from_user() failures in LoadPin (Kees Cook) * tag 'hardening-v6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: Undefine LATENT_ENTROPY_PLUGIN when plugin disabled for a file LoadPin: Return EFAULT on copy_from_user() failures
2022-08-19	Merge tag 'riscv-for-linus-6.0-rc2' of ↵	Linus Torvalds	2	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix to make the ISA extension static keys writable after init. This manifests at least as a crash when loading modules (including KVM). - A fixup for a build warning related to a poorly formed comment in our perf driver. * tag 'riscv-for-linus-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: perf: riscv legacy: fix kerneldoc comment warning riscv: Ensure isa-ext static keys are writable
2022-08-19	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	25	-156/+157
	Pull kvm fixes from Paolo Bonzini: "ARM: - Fix unexpected sign extension of KVM_ARM_DEVICE_ID_MASK - Tidy-up handling of AArch32 on asymmetric systems x86: - Fix 'missing ENDBR' BUG for fastop functions Generic: - Some cleanup and static analyzer patches - More fixes to KVM_CREATE_VM unwind paths" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Drop unnecessary initialization of "ops" in kvm_ioctl_create_device() KVM: Drop unnecessary initialization of "npages" in hva_to_pfn_slow() x86/kvm: Fix "missing ENDBR" BUG for fastop functions x86/kvm: Simplify FOP_SETCC() x86/ibt, objtool: Add IBT_NOSEAL() KVM: Rename mmu_notifier_* to mmu_invalidate_* KVM: Rename KVM_PRIVATE_MEM_SLOTS to KVM_INTERNAL_MEM_SLOTS KVM: MIPS: remove unnecessary definition of KVM_PRIVATE_MEM_SLOTS KVM: Move coalesced MMIO initialization (back) into kvm_create_vm() KVM: Unconditionally get a ref to /dev/kvm module when creating a VM KVM: Properly unwind VM creation if creating debugfs fails KVM: arm64: Reject 32bit user PSTATE on asymmetric systems KVM: arm64: Treat PMCR_EL1.LC as RES1 on asymmetric systems KVM: arm64: Fix compile error due to sign extension
2022-08-19	Merge tag 'for-6.0-rc1-tag' of ↵	Linus Torvalds	12	-101/+176
	git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A few short fixes and a lockdep warning fix (needs moving some code): - tree-log replay fixes: - fix error handling when looking up extent refs - fix warning when setting inode number of links - relocation fixes: - reset block group read-only status when relocation fails - unset control structure if transaction fails when starting to process a block group - add lockdep annotations to fix a warning during relocation where blocks temporarily belong to another tree and can lead to reversed dependencies - tree-checker verifies that extent items don't overlap" * tag 'for-6.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: tree-checker: check for overlapping extent items btrfs: fix warning during log replay when bumping inode link count btrfs: fix lost error handling when looking up extended ref on log replay btrfs: fix lockdep splat with reloc root extent buffers btrfs: move lockdep class helpers to locking.c btrfs: unset reloc control if transaction commit fails in prepare_to_relocate() btrfs: reset RO counter on block group if we fail to relocate
2022-08-19	Merge tag '5.20-rc2-ksmbd-smb3-server-fixes' of git://git.samba.org/ksmbd	Linus Torvalds	5	-21/+39
	Pull ksmbd server fixes from Steve French: - important sparse file fix - allocation size fix - fix incorrect rc on bad share - share config fix * tag '5.20-rc2-ksmbd-smb3-server-fixes' of git://git.samba.org/ksmbd: ksmbd: don't remove dos attribute xattr on O_TRUNC open ksmbd: remove unnecessary generic_fillattr in smb2_open ksmbd: request update to stale share config ksmbd: return STATUS_BAD_NETWORK_NAME error status if share is not configured
2022-08-19	perf tools: Support reading PERF_FORMAT_LOST	Namhyung Kim	6	-42/+108
	Recent kernels added a lost count that can be read from either read(2) or ring buffer data with PERF_SAMPLE_READ. As it is variable-length data, we need to access it according to the format info. But for perf tools use cases, PERF_FORMAT_ID is always set. So we only need to check the PERF_FORMAT_LOST bit to determine the data format. Add sample_read_value_size() and next_sample_read_value() helpers to make it a bit easier to access. Use them in all places where it reads the struct sample_read_value. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
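The variable-length layout can be sketched in a few lines. This is not the perf tools code: `read_value_size` and `iter_read_values` are hypothetical Python stand-ins for the helpers named above, the buffer is assumed to be little-endian, and PERF_FORMAT_ID is assumed to be set, as the message says it always is for perf tools.

```python
import struct

# Bit values from the perf_event UAPI header.
PERF_FORMAT_ID = 1 << 2
PERF_FORMAT_LOST = 1 << 4


def read_value_size(read_format):
    """Bytes per entry: value and id, plus lost when PERF_FORMAT_LOST is set."""
    fields = 2  # value, id (PERF_FORMAT_ID assumed to be set)
    if read_format & PERF_FORMAT_LOST:
        fields += 1
    return 8 * fields


def iter_read_values(buf, nr, read_format):
    """Yield (value, id, lost) for each of the nr entries packed in buf."""
    size = read_value_size(read_format)
    layout = "<" + "Q" * (size // 8)  # assumption: little-endian u64 fields
    for i in range(nr):
        fields = struct.unpack_from(layout, buf, i * size)
        lost = fields[2] if read_format & PERF_FORMAT_LOST else None
        yield fields[0], fields[1], lost


with_lost = struct.pack("<6Q", 10, 1, 0, 20, 2, 3)
for entry in iter_read_values(with_lost, 2, PERF_FORMAT_ID | PERF_FORMAT_LOST):
    print(entry)
```

Stepping through the buffer by the computed entry size, rather than by a fixed struct size, is what lets one loop handle both layouts.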
2022-08-19	libperf: Add a test case for read formats	Namhyung Kim	1	-0/+161
	It checks various combinations of the read format settings and verifies that the values are returned in the proper positions. The test uses task-clock software events to guarantee that the event is always active and that it sets the enabled/running time. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-19	libperf: Handle read format in perf_evsel__read()	Namhyung Kim	3	-3/+83
	The perf_counts_values should be increased to read the new lost data. Also adjust values after read according to the read format. This supports PERF_FORMAT_GROUP which has a different data format but it's only available for leader events. Currently it doesn't have an API to read sibling (member) events in the group. But users may read the sibling event directly. Also reading from mmap would be disabled when the read format has ID or LOST bit as it's not exposed via mmap. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-19	tools headers UAPI: Sync linux/perf_event.h with the kernel sources	Namhyung Kim	1	-1/+4
	To pick the trivial change in: 119a784c81270eb8 ("perf/core: Add a new read format to get a number of lost samples") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-19	tools headers UAPI: Sync x86's asm/kvm.h with the kernel sources	Arnaldo Carvalho de Melo	1	-2/+8
	To pick the changes in: 43bb9e000ea4c621 ("KVM: x86: Tweak name of MONITOR/MWAIT #UD quirk to make it #UD specific") 94dfc73e7cf4a31d ("treewide: uapi: Replace zero-length arrays with flexible-array members") bfbcc81bb82cbbad ("KVM: x86: Add a quirk for KVM's "MONITOR/MWAIT are NOPs!" behavior") b172862241b48499 ("KVM: x86: PIT: Preserve state of speaker port data bit") ed2351174e38ad4f ("KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault") That just rebuilds kvm-stat.c on x86, no change in functionality. This silences this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Chenyi Qiang <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Paul Durrant <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-19	tools headers UAPI: Sync KVM's vmx.h header with the kernel sources	Arnaldo Carvalho de Melo	1	-1/+3
	To pick the changes in: 2f4073e08f4cc5a4 ("KVM: VMX: Enable Notify VM exit") That makes 'perf kvm-stat' aware of this new NOTIFY exit reason, thus addressing the following perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/vmx.h' differs from latest version at 'arch/x86/include/uapi/asm/vmx.h' diff -u tools/arch/x86/include/uapi/asm/vmx.h arch/x86/include/uapi/asm/vmx.h Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Tao Xu <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>