aboutsummaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2020-12-16Merge branch 'parisc-5.11-1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc updates from Helge Deller: "A change to increase the default maximum stack size on parisc to 100MB and the ability to further increase the stack hard limit size at runtime with ulimit for newly started processes. The other patches fix compile warnings, utilize the Kbuild logic and cleanups the parisc arch code" * 'parisc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: pci-dma: fix warning unused-function parisc/uapi: Use Kbuild logic to provide <asm/types.h> parisc: Make user stack size configurable parisc: Use _TIF_USER_WORK_MASK in entry.S parisc: Drop loops_per_jiffy from per_cpu struct
2020-12-16Merge tag 'seccomp-v5.11-rc1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "The major change here is finally gaining seccomp constant-action bitmaps, which internally reduces the seccomp overhead for many real-world syscall filters to O(1), as discussed at Plumbers this year. - Improve seccomp performance via constant-action bitmaps (YiFei Zhu & Kees Cook) - Fix bogus __user annotations (Jann Horn) - Add missed CONFIG for improved selftest coverage (Mickaël Salaün)" * tag 'seccomp-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Update kernel config seccomp: Remove bogus __user annotations seccomp/cache: Report cache data through /proc/pid/seccomp_cache xtensa: Enable seccomp architecture tracking sh: Enable seccomp architecture tracking s390: Enable seccomp architecture tracking riscv: Enable seccomp architecture tracking powerpc: Enable seccomp architecture tracking parisc: Enable seccomp architecture tracking csky: Enable seccomp architecture tracking arm: Enable seccomp architecture tracking arm64: Enable seccomp architecture tracking selftests/seccomp: Compare bitmap vs filter overhead x86: Enable seccomp architecture tracking seccomp/cache: Add "emulator" to check if filter is constant allow seccomp/cache: Lookup syscall allowlist bitmap for fast path
2020-12-16Merge tag 'pstore-v5.11-rc1' of ↵Linus Torvalds6-76/+24
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore updates from Kees Cook: - Clean up unused but exposed API (Christoph Hellwig) - Provide KCONFIG for default size of kmsg buffer (Vasile-Laurentiu Stanimir) * tag 'pstore-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore: Move kmsg_bytes default into Kconfig pstore/blk: remove {un,}register_pstore_blk pstore/blk: update the command line example pstore/zone: cap the maximum device size
2020-12-16fs/lockd: convert comma to semicolonZheng Yongjun1-1/+1
Replace a comma between expression statements by a semicolon. Signed-off-by: Zheng Yongjun <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-12-16NFSv4.2: fix error return on memory allocation failureColin Ian King1-0/+1
Currently when an alloc_page fails the error return is not set in variable err and a garbage initialized value is returned. Fix this by setting err to -ENOMEM before taking the error return path. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: a1f26739ccdc ("NFSv4.2: improve page handling for GETXATTR") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-12-16writeback: don't warn on an unregistered BDI in __mark_inode_dirtyChristoph Hellwig1-4/+0
BDIs get unregistered during device removal, and this WARN can be trivially triggered by hot-removing a NVMe device while running fsx It is otherwise harmless as we still hold a BDI reference, and the writeback has been shut down already. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2020-12-16Merge tag 'linux-kselftest-kunit-5.11-rc1' of ↵Linus Torvalds1-156/+164
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kunit updates from Shuah Khan: - documentation update and fix to kunit_tool to parse diagnostic messages correctly from David Gow - Support for Parameterized Testing and fs/ext4 test updates to use KUnit parameterized testing feature from Arpitha Raghunandan - Helper to derive file names depending on --build_dir argument from Andy Shevchenko * tag 'linux-kselftest-kunit-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: fs: ext4: Modify inode-test.c to use KUnit parameterized testing feature kunit: Support for Parameterized Testing kunit: kunit_tool: Correctly parse diagnostic messages Documentation: kunit: provide guidance for testing many inputs kunit: Introduce get_file_path() helper
2020-12-16cifs: correct four aliased mount parms to allow use of previous namesSteve French1-5/+10
The updates to the new mount API created aliases for some mount parms e.g. esize, idsfromsid, modefromsid, signloosely as "min_enc_offload", "setuidfromacl", "modesid", "ignore_signature" but did not add back in the original name expected by test cases and current users. It also had incorrect names for a few less used mount parms. Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2020-12-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds9-34/+71
Merge yet more updates from Andrew Morton: - lots of little subsystems - a few post-linux-next MM material. Most of the rest awaits more merging of other trees. Subsystems affected by this series: alpha, procfs, misc, core-kernel, bitmap, lib, lz4, checkpatch, nilfs, kdump, rapidio, gcov, bfs, relay, resource, ubsan, reboot, fault-injection, lzo, apparmor, and mm (swap, memory-hotplug, pagemap, cleanups, and gup). * emailed patches from Andrew Morton <[email protected]>: (86 commits) mm: fix some spelling mistakes in comments mm: simplify follow_pte{,pmd} mm: unexport follow_pte_pmd apparmor: remove duplicate macro list_entry_is_head() lib/lzo/lzo1x_compress.c: make lzogeneric1x_1_compress() static fault-injection: handle EI_ETYPE_TRUE reboot: hide from sysfs not applicable settings reboot: allow to override reboot type if quirks are found reboot: remove cf9_safe from allowed types and rename cf9_force reboot: allow to specify reboot mode via sysfs reboot: refactor and comment the cpu selection code lib/ubsan.c: mark type_check_kinds with static keyword kcov: don't instrument with UBSAN ubsan: expand tests and reporting ubsan: remove UBSAN_MISC in favor of individual options ubsan: enable for all*config builds ubsan: disable UBSAN_TRAP for all*config ubsan: disable object-size sanitizer under GCC ubsan: move cc-option tests into Kconfig ubsan: remove redundant -Wno-maybe-uninitialized ...
2020-12-15mm: simplify follow_pte{,pmd}Christoph Hellwig1-5/+4
Merge __follow_pte_pmd, follow_pte_pmd and follow_pte into a single follow_pte function and just pass two additional NULL arguments for the two previous follow_pte callers. [[email protected]: merge fix for "s390/pci: remove races against pte updates"] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Nick Desaulniers <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15bfs: don't use WARNING: string when it's just info.Randy Dunlap1-1/+1
Make the printk() [bfs "printf" macro] seem less severe by changing "WARNING:" to "NOTE:". <asm-generic/bug.h> warns us about using WARNING or BUG in a format string other than in WARN() or BUG() family macros. bfs/inode.c is doing just that in a normal printk() call, so change the "WARNING" string to be "NOTE". Link: https://lkml.kernel.org/r/[email protected] Reported-by: [email protected] Signed-off-by: Randy Dunlap <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Al Viro <[email protected]> Cc: "Tigran A. Aivazian" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15fs/nilfs2: remove some unused macros to tame gccAlex Shi1-5/+0
There some macros are unused and cause gcc warning. Remove them. fs/nilfs2/segment.c:137:0: warning: macro "nilfs_cnt32_gt" is not used [-Wunused-macros] fs/nilfs2/segment.c:144:0: warning: macro "nilfs_cnt32_le" is not used [-Wunused-macros] fs/nilfs2/segment.c:143:0: warning: macro "nilfs_cnt32_lt" is not used [-Wunused-macros] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryusuke Konishi <[email protected]> Signed-off-by: Alex Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15kernel.h: split out mathematical helpersAndy Shevchenko1-0/+5
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out mathematical helpers. At the same time convert users in header and lib folder to use new header. Though for time being include new header back to kernel.h to avoid twisted indirected includes for existing users. [[email protected]: fix powerpc build] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Trond Myklebust <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15fs/proc: make pde_get() return nothingHui Su1-2/+1
We don't need pde_get()'s return value, so make pde_get() return nothing Link: https://lkml.kernel.org/r/20201211061944.GA2387571@rlk Signed-off-by: Hui Su <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Eric W. Biederman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15proc: fix lookup in /proc/net subdirectories after setns(2)Alexey Dobriyan3-18/+29
Commit 1fde6f21d90f ("proc: fix /proc/net/* after setns(2)") only forced revalidation of regular files under /proc/net/ However, /proc/net/ is unusual in the sense of /proc/net/foo handlers take netns pointer from parent directory which is old netns. Steps to reproduce: (void)open("/proc/net/sctp/snmp", O_RDONLY); unshare(CLONE_NEWNET); int fd = open("/proc/net/sctp/snmp", O_RDONLY); read(fd, &c, 1); Read will read wrong data from original netns. Patch forces lookup on every directory under /proc/net . Link: https://lkml.kernel.org/r/[email protected] Fixes: 1da4d377f943 ("proc: revalidate misc dentries") Signed-off-by: Alexey Dobriyan <[email protected]> Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)" <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15proc: provide details on indirect branch speculationAnand K Mistry1-0/+28
Similar to speculation store bypass, show information about the indirect branch speculation mode of a task in /proc/$pid/status. For testing/benchmarking, I needed to see whether IB (Indirect Branch) speculation (see Spectre-v2) is enabled on a task, to see whether an IBPB instruction should be executed on an address space switch. Unfortunately, this information isn't available anywhere else and currently the only way to get it is to hack the kernel to expose it (like this change). It also helped expose a bug with conditional IB speculation on certain CPUs. Another place this could be useful is to audit the system when using sanboxing. With this change, I can confirm that seccomp-enabled process have IB speculation force disabled as expected when the kernel command line parameter `spectre_v2_user=seccomp`. Since there's already a 'Speculation_Store_Bypass' field, I used that as precedent for adding this one. [[email protected]: remove underscores from field name to workaround documentation issue] Link: https://lkml.kernel.org/r/20201106131015.v2.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid Link: https://lkml.kernel.org/r/20201030172731.1.I7782b0cedb705384a634cfd8898eb7523562da99@changeid Signed-off-by: Anand K Mistry <[email protected]> Cc: Anthony Steinhauser <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Anand K Mistry <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Alexey Gladkov <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kees Cook <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: NeilBrown <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15procfs: delete duplicated words + other fixesRandy Dunlap2-3/+3
Delete repeated words in fs/proc/. {the, which} where "which which" was changed to "with which". Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Randy Dunlap <[email protected]> Cc: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15Merge branch 'exec-update-lock-for-v5.11' of ↵Linus Torvalds2-11/+11
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull exec-update-lock update from Eric Biederman: "The key point of this is to transform exec_update_mutex into a rw_semaphore so readers can be separated from writers. This makes it easier to understand what the holders of the lock are doing, and makes it harder to contend or deadlock on the lock. The real deadlock fix wound up in perf_event_open" * 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: exec: Transform exec_update_mutex into a rw_semaphore
2020-12-15Merge branch 'exec-for-v5.11' of ↵Linus Torvalds10-135/+109
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull execve updates from Eric Biederman: "This set of changes ultimately fixes the interaction of posix file lock and exec. Fundamentally most of the change is just moving where unshare_files is called during exec, and tweaking the users of files_struct so that the count of files_struct is not unnecessarily played with. Along the way fcheck and related helpers were renamed to more accurately reflect what they do. There were also many other small changes that fell out, as this is the first time in a long time much of this code has been touched. Benchmarks haven't turned up any practical issues but Al Viro has observed a possibility for a lot of pounding on task_lock. So I have some changes in progress to convert put_files_struct to always rcu free files_struct. That wasn't ready for the merge window so that will have to wait until next time" * 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits) exec: Move io_uring_task_cancel after the point of no return coredump: Document coredump code exclusively used by cell spufs file: Remove get_files_struct file: Rename __close_fd_get_file close_fd_get_file file: Replace ksys_close with close_fd file: Rename __close_fd to close_fd and remove the files parameter file: Merge __alloc_fd into alloc_fd file: In f_dupfd read RLIMIT_NOFILE once. file: Merge __fd_install into fd_install proc/fd: In fdinfo seq_show don't use get_files_struct bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu file: Implement task_lookup_next_fd_rcu kcmp: In get_file_raw_ptr use task_lookup_fd_rcu proc/fd: In tid_fd_mode use task_lookup_fd_rcu file: Implement task_lookup_fd_rcu file: Rename fcheck lookup_fd_rcu file: Replace fcheck_files with files_lookup_fd_rcu file: Factor files_lookup_fd_locked out of fcheck_files file: Rename __fcheck_files to files_lookup_fd_raw ...
2020-12-15Merge tag 'close-range-openat2-v5.11' of ↵Linus Torvalds2-10/+38
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range/openat2 updates from Christian Brauner: "This contains a fix for openat2() to make RESOLVE_BENEATH and RESOLVE_IN_ROOT mutually exclusive. It doesn't make sense to specify both at the same time. The openat2() selftests have been extended to verify that these two flags can't be specified together. This also adds the CLOSE_RANGE_CLOEXEC flag to close_range() which allows to mark a range of file descriptors as close-on-exec without actually closing them. This is useful in general but the use-case that triggered the patch is installing a seccomp profile in the calling task before exec. If the seccomp profile wants to block the close_range() syscall it obviously can't use it to close all fds before exec. If it calls close_range() before installing the seccomp profile it needs to take care not to close fds that it will still need before the exec meaning it would have to call close_range() multiple times on different ranges and then still fall back to closing fds one by one right before the exec. CLOSE_RANGE_CLOEXEC allows to solve this problem relying on the exec codepath to get rid of the unwanted fds. The close_range() tests have been expanded to verify that CLOSE_RANGE_CLOEXEC works" * tag 'close-range-openat2-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: core: add tests for CLOSE_RANGE_CLOEXEC fs, close_range: add flag CLOSE_RANGE_CLOEXEC selftests: openat2: add RESOLVE_ conflict test openat2: reject RESOLVE_BENEATH|RESOLVE_IN_ROOT
2020-12-15Merge branch 'work.epoll' of ↵Linus Torvalds2-416/+302
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull epoll updates from Al Viro: "Deal with epoll loop check/removal races sanely (among other things). The solution merged last cycle (pinning a bunch of struct file instances) had been forced by the wrong data structures; untangling that takes a bunch of preparations, but it's worth doing - control flow in there is ridiculously overcomplicated. Memory footprint has also gone down, while we are at it. This is not all I want to do in the area, but since I didn't get around to posting the followups they'll have to wait for the next cycle" * 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (27 commits) epoll: take epitem list out of struct file epoll: massage the check list insertion lift rcu_read_lock() into reverse_path_check() convert ->f_ep_links/->fllink to hlist ep_insert(): move creation of wakeup source past the fl_ep_links insertion fold ep_read_events_proc() into the only caller take the common part of ep_eventpoll_poll() and ep_item_poll() into helper ep_insert(): we only need tep->mtx around the insertion itself ep_insert(): don't open-code ep_remove() on failure exits lift locking/unlocking ep->mtx out of ep_{start,done}_scan() ep_send_events_proc(): fold into the caller lift the calls of ep_send_events_proc() into the callers lift the calls of ep_read_events_proc() into the callers ep_scan_ready_list(): prepare to splitup ep_loop_check_proc(): saner calling conventions get rid of ep_push_nested() ep_loop_check_proc(): lift pushing the cookie into callers clean reverse_path_check_proc() a bit reverse_path_check_proc(): don't bother with cookies reverse_path_check_proc(): sane arguments ...
2020-12-15Merge tag 'erofs-for-5.11-rc1' of ↵Linus Torvalds6-111/+149
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "This cycle we got rid of magical page->mapping type marks for temporary pages which had some concern before, now such usage is replaced with specific page->private. Also switch to inplace I/O instead of allocating extra cached pages to avoid direct reclaim under low memory scenario. There are some bmap bugfix and minor cleanups as well" * tag 'erofs-for-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: avoid using generic_block_bmap erofs: force inplace I/O under low memory scenario erofs: simplify try_to_claim_pcluster() erofs: insert to managed cache after adding to pcl erofs: get rid of magical Z_EROFS_MAPPING_STAGING erofs: remove a void EROFS_VERSION macro set in Makefile
2020-12-15Merge tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6Linus Torvalds33-1430/+1720
Pull nfsd updates from Chuck Lever: "Several substantial changes this time around: - Previously, exporting an NFS mount via NFSD was considered to be an unsupported feature. With v5.11, the community has attempted to make re-exporting a first-class feature of NFSD. This would enable the Linux in-kernel NFS server to be used as an intermediate cache for a remotely-located primary NFS server, for example, even with other NFS server implementations, like a NetApp filer, as the primary. - A short series of patches brings support for multiple RPC/RDMA data chunks per RPC transaction to the Linux NFS server's RPC/RDMA transport implementation. This is a part of the RPC/RDMA spec that the other premiere NFS/RDMA implementation (Solaris) has had for a very long time, and completes the implementation of RPC/RDMA version 1 in the Linux kernel's NFS server. - Long ago, NFSv4 support was introduced to NFSD using a series of C macros that hid dprintk's and goto's. Over time, the kernel's XDR implementation has been greatly improved, but these C macros have remained and become fallow. A series of patches in this pull request completely replaces those macros with the use of current kernel XDR infrastructure. Benefits include: - More robust input sanitization in NFSD's NFSv4 XDR decoders. - Make it easier to use common kernel library functions that use XDR stream APIs (for example, GSS-API). - Align the structure of the source code with the RFCs so it is easier to learn, verify, and maintain our XDR implementation. - Removal of more than a hundred hidden dprintk() call sites. - Removal of some explicit manipulation of pages to help make the eventual transition to xdr->bvec smoother. - On top of several related fixes in 5.10-rc, there are a few more fixes to get the Linux NFSD implementation of NFSv4.2 inter-server copy up to speed. And as usual, there is a pinch of seasoning in the form of a collection of unrelated minor bug fixes and clean-ups. Many thanks to all who contributed this time around!" * tag 'nfsd-5.11' of git://git.linux-nfs.org/projects/cel/cel-2.6: (131 commits) nfsd: Record NFSv4 pre/post-op attributes as non-atomic nfsd: Set PF_LOCAL_THROTTLE on local filesystems only nfsd: Fix up nfsd to ensure that timeout errors don't result in ESTALE exportfs: Add a function to return the raw output from fh_to_dentry() nfsd: close cached files prior to a REMOVE or RENAME that would replace target nfsd: allow filesystems to opt out of subtree checking nfsd: add a new EXPORT_OP_NOWCC flag to struct export_operations Revert "nfsd4: support change_attr_type attribute" nfsd4: don't query change attribute in v2/v3 case nfsd: minor nfsd4_change_attribute cleanup nfsd: simplify nfsd4_change_info nfsd: only call inode_query_iversion in the I_VERSION case nfs_common: need lock during iterate through the list NFSD: Fix 5 seconds delay when doing inter server copy NFSD: Fix sparse warning in nfs4proc.c SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall sunrpc: clean-up cache downcall nfsd: Fix message level for normal termination NFSD: Remove macros that are no longer used NFSD: Replace READ* macros in nfsd4_decode_compound() ...
2020-12-15Merge tag 'jfs-5.11' of git://github.com/kleikamp/linux-shaggyLinus Torvalds7-9/+13
Pull jfs updates from David Kleikamp: "A few jfs fixes" * tag 'jfs-5.11' of git://github.com/kleikamp/linux-shaggy: jfs: Fix array index bounds check in dbAdjTree jfs: Fix memleak in dbAdjCtl jfs: delete duplicated words + other fixes
2020-12-15Merge tag 'dlm-5.11' of ↵Linus Torvalds5-148/+168
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This set includes more low level communication layer cleanups. The main change is the listening socket is no longer handled as a special case of node connection sockets. There is one small fix for checking the number of local connections" * tag 'dlm-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: fs: dlm: check on existing node address fs: dlm: constify addr_compare fs: dlm: fix check for multi-homed hosts fs: dlm: listen socket out of connection hash fs: dlm: refactor sctp sock parameter fs: dlm: move shutdown action to node creation fs: dlm: move connect callback in node creation fs: dlm: add helper for init connection fs: dlm: handle non blocked connect event fs: dlm: flush othercon at close fs: dlm: add get buffer error handling fs: dlm: define max send buffer fs: dlm: fix proper srcu api call
2020-12-15Merge tag 'for-5.11-tag' of ↵Linus Torvalds64-4312/+4488
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "We have a mix of all kinds of changes, feature updates, core stuff, performance improvements and lots of cleanups and preparatory changes. User visible: - export filesystem generation in sysfs - new features for mount option 'rescue': - what's currently supported is exported in sysfs - 'ignorebadroots'/'ibadroots' - continue even if some essential tree roots are not usable (extent, uuid, data reloc, device, csum, free space) - 'ignoredatacsums'/'idatacsums' - skip checksum verification on data - 'all' - now enables 'ignorebadroots' + 'ignoredatacsums' + 'nologreplay' - export read mirror policy settings to sysfs, new policies will be added in the future - remove inode number cache feature (mount -o inode_cache), obsoleted in 5.9 User visible fixes: - async discard scheduling fixes on high loads - update inode byte counter atomically so stat() does not report wrong value in some cases - free space tree fixes: - correctly report status of v2 after remount - clear v1 cache inodes when v2 is newly enabled after remount Core: - switch own tree lock implementation to standard rw semaphore: - one-level lock nesting is not required anymore, the last use of this was in free space that's now loaded asynchronously - own implementation of adaptive spinning before taking mutex has been part of rwsem - performance seems to be better in general, much better (+tens of percents) for some workloads - lockdep does not complain - finish direct IO conversion to iomap infrastructure, remove temporary workaround for DSYNC after iomap API updates - preparatory work to support data and metadata blocks smaller than page: - generalize code that assumes sectorsize == PAGE_SIZE, lots of refactoring - planned namely for 64K pages (eg. arm64, ppc64) - scrub read-only support - preparatory work for zoned allocation mode (SMR/ZBC/ZNS friendly): - disable incompatible features - round-robin superblock write - free space cache (v1) is loaded asynchronously, remove tree path recursion - slightly improved time tacking for transaction kthread wake ups Performance improvements (note that the numbers depend on load type or other features and weren't run on the same machine): - skip unnecessary work: - do not start readahead for csum tree when scrubbing non-data block groups - do not start and wait for delalloc on snapshot roots on transaction commit - fix race when defragmenting leads to unnecessary IO - dbench speedups (+throughput%/-max latency%): - skip unnecessary searches for xattrs when logging an inode (+10.8/-8.2) - stop incrementing log batch when joining log transaction (1-2) - unlock path before checking if extent is shared during nocow writeback (+5.0/-20.5), on fio load +9.7% throughput/-9.8% runtime - several tree log improvements, eg. removing unnecessary operations, fixing races that lead to additional work (+12.7/-8.2) - tree-checker error branches annotated with unlikely() (+3% throughput) Other: - cleanups - lockdep fixes - more btrfs_inode conversions - error variable cleanups" * tag 'for-5.11-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (198 commits) btrfs: scrub: allow scrub to work with subpage sectorsize btrfs: scrub: support subpage data scrub btrfs: scrub: support subpage tree block scrub btrfs: scrub: always allocate one full page for one sector for RAID56 btrfs: scrub: reduce width of extent_len/stripe_len from 64 to 32 bits btrfs: refactor btrfs_lookup_bio_sums to handle out-of-order bvecs btrfs: remove btrfs_find_ordered_sum call from btrfs_lookup_bio_sums btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors btrfs: update num_extent_pages to support subpage sized extent buffer btrfs: don't allow tree block to cross page boundary for subpage support btrfs: calculate inline extent buffer page size based on page size btrfs: factor out btree page submission code to a helper btrfs: make btrfs_verify_data_csum follow sector size btrfs: pass bio_offset to check_data_csum() directly btrfs: rename bio_offset of extent_submit_bio_start_t to dio_file_offset btrfs: fix lockdep warning when creating free space tree btrfs: skip space_cache v1 setup when not using it btrfs: remove free space items when disabling space cache v1 btrfs: warn when remount will not change the free space tree btrfs: use superblock state to print space_cache mount option ...
2020-12-15Merge tag 'locks-v5.11' of ↵Linus Torvalds2-6/+8
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking fixes from Jeff Layton: "A fix for some undefined integer overflow behavior, a typo in a comment header, and a fix for a potential deadlock involving internal senders of SIGIO/SIGURG" * tag 'locks-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fcntl: Fix potential deadlock in send_sig{io, urg}() locks: fix a typo at a kernel-doc markup locks: Fix UBSAN undefined behaviour in flock64_to_posix_lock
2020-12-15cifs: Tracepoints and logs for tracing credit changes.Shyam Prasad N4-4/+64
There is at least one suspected bug in crediting changes in cifs.ko which has come up a few times in the discussions and in a customer case. This change adds tracepoints to the code which modifies the server credit values in any way. The goal is to be able to track the changes to the credit values of the session to be able to catch when there is a crediting bug. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]>
2020-12-15cifs: fix use after free in cifs_smb3_do_mount()Ronnie Sahlberg1-1/+3
Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]>
2020-12-15Merge tag 'driver-core-5.11-rc1' of ↵Linus Torvalds1-3/+2
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big driver core updates for 5.11-rc1 This time there was a lot of different work happening here for some reason: - redo of the fwnode link logic, speeding it up greatly - auxiliary bus added (this was a tag that will be pulled in from other trees/maintainers this merge window as well, as driver subsystems started to rely on it) - platform driver core cleanups on the way to fixing some long-time api updates in future releases - minor fixes and tweaks. All have been in linux-next with no (finally) reported issues. Testing there did helped in shaking issues out a lot :)" * tag 'driver-core-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (39 commits) driver core: platform: don't oops in platform_shutdown() on unbound devices ACPI: Use fwnode_init() to set up fwnode misc: pvpanic: Replace OF headers by mod_devicetable.h misc: pvpanic: Combine ACPI and platform drivers usb: host: sl811: Switch to use platform_get_mem_or_io() vfio: platform: Switch to use platform_get_mem_or_io() driver core: platform: Introduce platform_get_mem_or_io() dyndbg: fix use before null check soc: fix comment for freeing soc_dev_attr driver core: platform: use bus_type functions driver core: platform: change logic implementing platform_driver_probe driver core: platform: reorder functions driver core: make driver_probe_device() static driver core: Fix a couple of typos driver core: Reorder devices on successful probe driver core: Delete pointless parameter in fwnode_operations.add_links driver core: Refactor fw_devlink feature efi: Update implementation of add_links() to create fwnode links of: property: Update implementation of add_links() to create fwnode links driver core: Use device's fwnode to check if it is waiting for suppliers ...
2020-12-15Merge tag 'net-next-5.11' of ↵Linus Torvalds4-13/+13
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - support "prefer busy polling" NAPI operation mode, where we defer softirq for some time expecting applications to periodically busy poll - AF_XDP: improve efficiency by more batching and hindering the adjacency cache prefetcher - af_packet: make packet_fanout.arr size configurable up to 64K - tcp: optimize TCP zero copy receive in presence of partial or unaligned reads making zero copy a performance win for much smaller messages - XDP: add bulk APIs for returning / freeing frames - sched: support fragmenting IP packets as they come out of conntrack - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs BPF: - BPF switch from crude rlimit-based to memcg-based memory accounting - BPF type format information for kernel modules and related tracing enhancements - BPF implement task local storage for BPF LSM - allow the FENTRY/FEXIT/RAW_TP tracing programs to use bpf_sk_storage Protocols: - mptcp: improve multiple xmit streams support, memory accounting and many smaller improvements - TLS: support CHACHA20-POLY1305 cipher - seg6: add support for SRv6 End.DT4/DT6 behavior - sctp: Implement RFC 6951: UDP Encapsulation of SCTP - ppp_generic: add ability to bridge channels directly - bridge: Connectivity Fault Management (CFM) support as is defined in IEEE 802.1Q section 12.14. Drivers: - mlx5: make use of the new auxiliary bus to organize the driver internals - mlx5: more accurate port TX timestamping support - mlxsw: - improve the efficiency of offloaded next hop updates by using the new nexthop object API - support blackhole nexthops - support IEEE 802.1ad (Q-in-Q) bridging - rtw88: major bluetooth co-existance improvements - iwlwifi: support new 6 GHz frequency band - ath11k: Fast Initial Link Setup (FILS) - mt7915: dual band concurrent (DBDC) support - net: ipa: add basic support for IPA v4.5 Refactor: - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej Siewior - phy: add support for shared interrupts; get rid of multiple driver APIs and have the drivers write a full IRQ handler, slight growth of driver code should be compensated by the simpler API which also allows shared IRQs - add common code for handling netdev per-cpu counters - move TX packet re-allocation from Ethernet switch tag drivers to a central place - improve efficiency and rename nla_strlcpy - number of W=1 warning cleanups as we now catch those in a patchwork build bot Old code removal: - wan: delete the DLCI / SDLA drivers - wimax: move to staging - wifi: remove old WDS wifi bridging support" * tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits) net: hns3: fix expression that is currently always true net: fix proc_fs init handling in af_packet and tls nfc: pn533: convert comma to semicolon af_vsock: Assign the vsock transport considering the vsock address flags af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path vsock_addr: Check for supported flag values vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag vm_sockets: Add flags field in the vsock address data structure net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context nfc: s3fwrn5: Release the nfc firmware net: vxget: clean up sparse warnings mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3 mlxsw: spectrum_router_xm: Introduce basic XM cache flushing mlxsw: reg: Add Router LPM Cache Enable Register mlxsw: reg: Add Router LPM Cache ML Delete Register mlxsw: spectrum_router_xm: Implement L-value tracking for M-index mlxsw: reg: Add XM Router M Table Register ...
2020-12-15cifs: fix rsize/wsize to be negotiated valuesSteve French2-8/+9
Also make sure these are displayed in /proc/mounts Signed-off-by: Steve French <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]>
2020-12-15cifs: Fix some error pointers handling detected by static checkerSamuel Cabrero1-10/+11
* extract_hostname() and extract_sharename() never return NULL, so use IS_ERR() instead of IS_ERR_OR_NULL() in cifs_find_swn_reg(). If any of these functions return an error, then return an error pointer instead of NULL. * Change cifs_find_swn_reg() function to always return a valid pointer or an error pointer, instead of returning NULL if the registration is not found. * Finally update cifs_find_swn_reg() callers to check for -EEXIST instead of NULL. * In cifs_get_swn_reg() the swnreg idr mutex was not unlocked in the error path of cifs_find_swn_reg() call. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Samuel Cabrero <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Signed-off-by: Steve French <[email protected]>
2020-12-15smb3: remind users that witness protocol is experimentalSteve French1-0/+1
warn_once when using the witness protocol that it is experimental Signed-off-by: Steve French <[email protected]> Reviewed-by: Samuel Cabrero <[email protected]> Reviewed-by: Shyam Prasad N <[email protected]>
2020-12-15cifs: update super_operations to show_devnameSteve French1-0/+21
This is needed so that we display the correct //server/share vs \\server\share in /proc/mounts for the device name (in the new mount API). Signed-off-by: Steve French <[email protected]> Reviewed-by: Shyam Prasad N <[email protected]>
2020-12-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds9-18/+26
Merge misc updates from Andrew Morton: - a few random little subsystems - almost all of the MM patches which are staged ahead of linux-next material. I'll trickle to post-linux-next work in as the dependents get merged up. Subsystems affected by this patch series: kthread, kbuild, ide, ntfs, ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache, gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation, kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction, oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc, uaccess, zram, and cleanups). * emailed patches from Andrew Morton <[email protected]>: (200 commits) mm: cleanup kstrto*() usage mm: fix fall-through warnings for Clang mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at mm: shmem: convert shmem_enabled_show to use sysfs_emit_at mm:backing-dev: use sysfs_emit in macro defining functions mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening mm: use sysfs_emit for struct kobject * uses mm: fix kernel-doc markups zram: break the strict dependency from lzo zram: add stat to gather incompressible pages since zram set up zram: support page writeback mm/process_vm_access: remove redundant initialization of iov_r mm/zsmalloc.c: rework the list_add code in insert_zspage() mm/zswap: move to use crypto_acomp API for hardware acceleration mm/zswap: fix passing zero to 'PTR_ERR' warning mm/zswap: make struct kernel_param_ops definitions const userfaultfd/selftests: hint the test runner on required privilege userfaultfd/selftests: fix retval check for userfaultfd_open() userfaultfd/selftests: always dump something in modes userfaultfd: selftests: make __{s,u}64 format specifiers portable ...
2020-12-15userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knobLokesh Gidra1-2/+8
With this change, when the knob is set to 0, it allows unprivileged users to call userfaultfd, like when it is set to 1, but with the restriction that page faults from only user-mode can be handled. In this mode, an unprivileged user (without SYS_CAP_PTRACE capability) must pass UFFD_USER_MODE_ONLY to userfaultd or the API will fail with EPERM. This enables administrators to reduce the likelihood that an attacker with access to userfaultfd can delay faulting kernel code to widen timing windows for other exploits. The default value of this knob is changed to 0. This is required for correct functioning of pipe mutex. However, this will fail postcopy live migration, which will be unnoticeable to the VM guests. To avoid this, set 'vm.userfault = 1' in /sys/sysctl.conf. The main reason this change is desirable as in the short term is that the Android userland will behave as with the sysctl set to zero. So without this commit, any Linux binary using userfaultfd to manage its memory would behave differently if run within the Android userland. For more details, refer to Andrea's reply [1]. [1] https://lore.kernel.org/lkml/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Lokesh Gidra <[email protected]> Reviewed-by: Andrea Arcangeli <[email protected]> Cc: Kees Cook <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Peter Xu <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Daniel Colascione <[email protected]> Cc: "Joel Fernandes (Google)" <[email protected]> Cc: Kalesh Singh <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Jeff Vander Stoep <[email protected]> Cc: <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Daniel Colascione <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15userfaultfd: add UFFD_USER_MODE_ONLYLokesh Gidra1-1/+9
Patch series "Control over userfaultfd kernel-fault handling", v6. This patch series is split from [1]. The other series enables SELinux support for userfaultfd file descriptors so that its creation and movement can be controlled. It has been demonstrated on various occasions that suspending kernel code execution for an arbitrary amount of time at any access to userspace memory (copy_from_user()/copy_to_user()/...) can be exploited to change the intended behavior of the kernel. For instance, handling page faults in kernel-mode using userfaultfd has been exploited in [2, 3]. Likewise, FUSE, which is similar to userfaultfd in this respect, has been exploited in [4, 5] for similar outcome. This small patch series adds a new flag to userfaultfd(2) that allows callers to give up the ability to handle kernel-mode faults with the resulting UFFD file object. It then adds a 'user-mode only' option to the unprivileged_userfaultfd sysctl knob to require unprivileged callers to use this new flag. The purpose of this new interface is to decrease the chance of an unprivileged userfaultfd user taking advantage of userfaultfd to enhance security vulnerabilities by lengthening the race window in kernel code. [1] https://lore.kernel.org/lkml/[email protected]/ [2] https://duasynt.com/blog/linux-kernel-heap-spray [3] https://duasynt.com/blog/cve-2016-6187-heap-off-by-one-exploit [4] https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html [5] https://bugs.chromium.org/p/project-zero/issues/detail?id=808 This patch (of 2): userfaultfd handles page faults from both user and kernel code. Add a new UFFD_USER_MODE_ONLY flag for userfaultfd(2) that makes the resulting userfaultfd object refuse to handle faults from kernel mode, treating these faults as if SIGBUS were always raised, causing the kernel code to fail with EFAULT. A future patch adds a knob allowing administrators to give some processes the ability to create userfaultfd file objects only if they pass UFFD_USER_MODE_ONLY, reducing the likelihood that these processes will exploit userfaultfd's ability to delay kernel page faults to open timing windows for future exploits. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Daniel Colascione <[email protected]> Signed-off-by: Lokesh Gidra <[email protected]> Reviewed-by: Andrea Arcangeli <[email protected]> Cc: Alexander Viro <[email protected]> Cc: <[email protected]> Cc: Daniel Colascione <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Iurii Zaikin <[email protected]> Cc: Jeff Vander Stoep <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: "Joel Fernandes (Google)" <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Kalesh Singh <[email protected]> Cc: Kees Cook <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nitin Gupta <[email protected]> Cc: Peter Xu <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15arm: remove CONFIG_ARCH_HAS_HOLES_MEMORYMODELMike Rapoport1-2/+0
ARM is the only architecture that defines CONFIG_ARCH_HAS_HOLES_MEMORYMODEL which in turn enables memmap_valid_within() function that is intended to verify existence of struct page associated with a pfn when there are holes in the memory map. However, the ARCH_HAS_HOLES_MEMORYMODEL also enables HAVE_ARCH_PFN_VALID and arch-specific pfn_valid() implementation that also deals with the holes in the memory map. The only two users of memmap_valid_within() call this function after a call to pfn_valid() so the memmap_valid_within() check becomes redundant. Remove CONFIG_ARCH_HAS_HOLES_MEMORYMODEL and memmap_valid_within() and rely entirely on ARM's implementation of pfn_valid() that is now enabled unconditionally. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Matt Turner <[email protected]> Cc: Meelis Roos <[email protected]> Cc: Michael Schmitz <[email protected]> Cc: Russell King <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15mremap: don't allow MREMAP_DONTUNMAP on special_mappings and aioDmitry Safonov1-1/+4
As kernel expect to see only one of such mappings, any further operations on the VMA-copy may be unexpected by the kernel. Maybe it's being on the safe side, but there doesn't seem to be any expected use-case for this, so restrict it now. Link: https://lkml.kernel.org/r/[email protected] Fixes: commit e346b3813067 ("mm/mremap: add MREMAP_DONTUNMAP to mremap()") Signed-off-by: Dmitry Safonov <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Brian Geffon <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Ralph Campbell <[email protected]> Cc: Russell King <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15mm: memcontrol: account pagetables per nodeShakeel Butt1-1/+1
For many workloads, pagetable consumption is significant and it makes sense to expose it in the memory.stat for the memory cgroups. However at the moment, the pagetables are accounted per-zone. Converting them to per-node and using the right interface will correctly account for the memory cgroups as well. [[email protected]: export __mod_lruvec_page_state to modules for arch/mips/kvm/] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Shakeel Butt <[email protected]> Acked-by: Johannes Weiner <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15ocfs2: ratelimit the 'max lookup times reached' noticeMauricio Faria de Oliveira1-2/+2
Running stress-ng on ocfs2 completely fills the kernel log with 'max lookup times reached, filesystem may have nested directories.' Let's ratelimit this message as done with others in the code. Test-case: # mkfs.ocfs2 --mount local $DEV # mount $DEV $MNT # cd $MNT # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 # dmesg | grep -c 'max lookup times reached' Before: # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 ... stress-ng: info: [11116] successful run completed in 3.03s # dmesg | grep -c 'max lookup times reached' 967 After: # dmesg -C # stress-ng --dirdeep 1 --dirdeep-ops 1000 ... stress-ng: info: [739] successful run completed in 0.96s # dmesg | grep -c 'max lookup times reached' 10 # dmesg [ 259.086086] ocfs2_check_if_ancestor: 1990 callbacks suppressed [ 259.086092] (stress-ng-dirde,740,1):ocfs2_check_if_ancestor:1091 max lookup times reached, filesystem may have nested directories, src inode: 18007, dest inode: 17940. ... Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mauricio Faria de Oliveira <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15fs/ocfs2/cluster/tcp.c: remove unneeded breakTom Rix1-1/+0
A break is not needed if it is preceded by a goto Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Tom Rix <[email protected]> Acked-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15fs/ntfs: remove unused variable attr_lenAlex Shi1-2/+0
This variable isn't used anymore, remove it to skip W=1 warning: fs/ntfs/inode.c:2350:6: warning: variable `attr_len' set but not used [-Wunused-but-set-variable] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Shi <[email protected]> Acked-by: Anton Altaparmakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15fs/ntfs: remove unused variblesAlex Shi2-6/+2
We actually don't use these varibles, so remove them to avoid gcc warning: fs/ntfs/file.c:326:14: warning: variable `base_ni' set but not used [-Wunused-but-set-variable] fs/ntfs/logfile.c:481:21: warning: variable `log_page_mask' set but not used [-Wunused-but-set-variable] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Shi <[email protected]> Acked-by: Anton Altaparmakov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15Merge tag 'kvmarm-5.11' of ↵Paolo Bonzini12-83/+26
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.11 - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation
2020-12-14Merge tag 'core-mm-2020-12-14' of ↵Linus Torvalds2-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull kmap updates from Thomas Gleixner: "The new preemtible kmap_local() implementation: - Consolidate all kmap_atomic() internals into a generic implementation which builds the base for the kmap_local() API and make the kmap_atomic() interface wrappers which handle the disabling/enabling of preemption and pagefaults. - Switch the storage from per-CPU to per task and provide scheduler support for clearing mapping when scheduling out and restoring them when scheduling back in. - Merge the migrate_disable/enable() code, which is also part of the scheduler pull request. This was required to make the kmap_local() interface available which does not disable preemption when a mapping is established. It has to disable migration instead to guarantee that the virtual address of the mapped slot is the same across preemption. - Provide better debug facilities: guard pages and enforced utilization of the mapping mechanics on 64bit systems when the architecture allows it. - Provide the new kmap_local() API which can now be used to cleanup the kmap_atomic() usage sites all over the place. Most of the usage sites do not require the implicit disabling of preemption and pagefaults so the penalty on 64bit and 32bit non-highmem systems is removed and quite some of the code can be simplified. A wholesale conversion is not possible because some usage depends on the implicit side effects and some need to be cleaned up because they work around these side effects. The migrate disable side effect is only effective on highmem systems and when enforced debugging is enabled. On 64bit and 32bit non-highmem systems the overhead is completely avoided" * tag 'core-mm-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits) ARM: highmem: Fix cache_is_vivt() reference x86/crashdump/32: Simplify copy_oldmem_page() io-mapping: Provide iomap_local variant mm/highmem: Provide kmap_local* sched: highmem: Store local kmaps in task struct x86: Support kmap_local() forced debugging mm/highmem: Provide CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP mm/highmem: Provide and use CONFIG_DEBUG_KMAP_LOCAL microblaze/mm/highmem: Add dropped #ifdef back xtensa/mm/highmem: Make generic kmap_atomic() work correctly mm/highmem: Take kmap_high_get() properly into account highmem: High implementation details and document API Documentation/io-mapping: Remove outdated blurb io-mapping: Cleanup atomic iomap mm/highmem: Remove the old kmap_atomic cruft highmem: Get rid of kmap_types.h xtensa/mm/highmem: Switch to generic kmap atomic sparc/mm/highmem: Switch to generic kmap atomic powerpc/mm/highmem: Switch to generic kmap atomic nds32/mm/highmem: Switch to generic kmap atomic ...
2020-12-14Merge tag 'sched-core-2020-12-14' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Thomas Gleixner: - migrate_disable/enable() support which originates from the RT tree and is now a prerequisite for the new preemptible kmap_local() API which aims to replace kmap_atomic(). - A fair amount of topology and NUMA related improvements - Improvements for the frequency invariant calculations - Enhanced robustness for the global CPU priority tracking and decision making - The usual small fixes and enhancements all over the place * tag 'sched-core-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) sched/fair: Trivial correction of the newidle_balance() comment sched/fair: Clear SMT siblings after determining the core is not idle sched: Fix kernel-doc markup x86: Print ratio freq_max/freq_base used in frequency invariance calculations x86, sched: Use midpoint of max_boost and max_P for frequency invariance on AMD EPYC x86, sched: Calculate frequency invariance for AMD systems irq_work: Optimize irq_work_single() smp: Cleanup smp_call_function*() irq_work: Cleanup sched: Limit the amount of NUMA imbalance that can exist at fork time sched/numa: Allow a floating imbalance between NUMA nodes sched: Avoid unnecessary calculation of load imbalance at clone time sched/numa: Rename nr_running and break out the magic number sched: Make migrate_disable/enable() independent of RT sched/topology: Condition EAS enablement on FIE support arm64: Rebuild sched domains on invariance status changes sched/topology,schedutil: Wrap sched domains rebuild sched/uclamp: Allow to reset a task uclamp constraint value sched/core: Fix typos in comments Documentation: scheduler: fix information on arch SD flags, sched_domain and sched_debug ...
2020-12-14Merge tag 'core-entry-2020-12-14' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core entry/exit updates from Thomas Gleixner: "A set of updates for entry/exit handling: - More generalization of entry/exit functionality - The consolidation work to reclaim TIF flags on x86 and also for non-x86 specific TIF flags which are solely relevant for syscall related work and have been moved into their own storage space. The x86 specific part had to be merged in to avoid a major conflict. - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal delivery mode of task work and results in an impressive performance improvement for io_uring. The non-x86 consolidation of this is going to come seperate via Jens. - The selective syscall redirection facility which provides a clean and efficient way to support the non-Linux syscalls of WINE by catching them at syscall entry and redirecting them to the user space emulation. This can be utilized for other purposes as well and has been designed carefully to avoid overhead for the regular fastpath. This includes the core changes and the x86 support code. - Simplification of the context tracking entry/exit handling for the users of the generic entry code which guarantee the proper ordering and protection. - Preparatory changes to make the generic entry code accomodate S390 specific requirements which are mostly related to their syscall restart mechanism" * tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits) entry: Add syscall_exit_to_user_mode_work() entry: Add exit_to_user_mode() wrapper entry_Add_enter_from_user_mode_wrapper entry: Rename exit_to_user_mode() entry: Rename enter_from_user_mode() docs: Document Syscall User Dispatch selftests: Add benchmark for syscall user dispatch selftests: Add kselftest for syscall user dispatch entry: Support Syscall User Dispatch on common syscall entry kernel: Implement selective syscall userspace redirection signal: Expose SYS_USER_DISPATCH si_code type x86: vdso: Expose sigreturn address on vdso to the kernel MAINTAINERS: Add entry for common entry code entry: Fix boot for !CONFIG_GENERIC_ENTRY x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK context_tracking: Only define schedule_user() on !HAVE_CONTEXT_TRACKING_OFFSTACK archs sched: Detect call to schedule from critical entry code context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK x86: Reclaim unused x86 TI flags ...
2020-12-14Merge tag 'fixes-v5.11' of ↵Linus Torvalds2-4/+3
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull misc fixes from Christian Brauner: "This contains several fixes which felt worth being combined into a single branch: - Use put_nsproxy() instead of open-coding it switch_task_namespaces() - Kirill's work to unify lifecycle management for all namespaces. The lifetime counters are used identically for all namespaces types. Namespaces may of course have additional unrelated counters and these are not altered. This work allows us to unify the type of the counters and reduces maintenance cost by moving the counter in one place and indicating that basic lifetime management is identical for all namespaces. - Peilin's fix adding three byte padding to Dmitry's PTRACE_GET_SYSCALL_INFO uapi struct to prevent an info leak. - Two smal patches to convert from the /* fall through */ comment annotation to the fallthrough keyword annotation which I had taken into my branch and into -next before df561f6688fe ("treewide: Use fallthrough pseudo-keyword") made it upstream which fixed this tree-wide. Since I didn't want to invalidate all testing for other commits I didn't rebase and kept them" * tag 'fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: nsproxy: use put_nsproxy() in switch_task_namespaces() sys: Convert to the new fallthrough notation signal: Convert to the new fallthrough notation time: Use generic ns_common::count cgroup: Use generic ns_common::count mnt: Use generic ns_common::count user: Use generic ns_common::count pid: Use generic ns_common::count ipc: Use generic ns_common::count uts: Use generic ns_common::count net: Use generic ns_common::count ns: Add a common refcount into ns_common ptrace: Prevent kernel-infoleak in ptrace_get_syscall_info()