aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-04-30ucounts: Silence warning in dec_rlimit_ucountsEric W. Biederman1-1/+1
Dan Carpenter <[email protected]> wrote: > > url: https://github.com/0day-ci/linux/commits/legion-kernel-org/Count-rlimits-in-each-user-namespace/20210427-162857 > base: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next > config: arc-randconfig-m031-20210426 (attached as .config) > compiler: arceb-elf-gcc (GCC) 9.3.0 > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot <[email protected]> > Reported-by: Dan Carpenter <[email protected]> > > smatch warnings: > kernel/ucount.c:270 dec_rlimit_ucounts() error: uninitialized symbol 'new'. > > vim +/new +270 kernel/ucount.c > > 176ec2b092cc22 Alexey Gladkov 2021-04-22 260 bool dec_rlimit_ucounts(struct ucounts *ucounts, enum ucount_type type, long v) > 176ec2b092cc22 Alexey Gladkov 2021-04-22 261 { > 176ec2b092cc22 Alexey Gladkov 2021-04-22 262 struct ucounts *iter; > 176ec2b092cc22 Alexey Gladkov 2021-04-22 263 long new; > ^^^^^^^^ > > 176ec2b092cc22 Alexey Gladkov 2021-04-22 264 for (iter = ucounts; iter; iter = iter->ns->ucounts) { > 176ec2b092cc22 Alexey Gladkov 2021-04-22 265 long dec = atomic_long_add_return(-v, &iter->ucount[type]); > 176ec2b092cc22 Alexey Gladkov 2021-04-22 266 WARN_ON_ONCE(dec < 0); > 176ec2b092cc22 Alexey Gladkov 2021-04-22 267 if (iter == ucounts) > 176ec2b092cc22 Alexey Gladkov 2021-04-22 268 new = dec; > 176ec2b092cc22 Alexey Gladkov 2021-04-22 269 } > 176ec2b092cc22 Alexey Gladkov 2021-04-22 @270 return (new == 0); > ^^^^^^^^ > I don't know if this is a bug or not, but I can definitely tell why the > static checker complains about it. > > 176ec2b092cc22 Alexey Gladkov 2021-04-22 271 } In the only two cases that care about the return value of dec_rlimit_ucounts the code first tests to see that ucounts is not NULL. In those cases it is guaranteed at least one iteration of the loop will execute guaranteeing the variable new will be initialized. Initialize new to -1 so that the return value is well defined even when the loop does not execute and the static checker is silenced. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Eric W. Biederman" <[email protected]>
2021-04-30ucounts: Count rlimits in each user namespaceEric W. Biederman29-127/+467
Preface ------- These patches are for binding the rlimit counters to a user in user namespace. This patch set can be applied on top of: git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.12-rc4 Problem ------- The RLIMIT_NPROC, RLIMIT_MEMLOCK, RLIMIT_SIGPENDING, RLIMIT_MSGQUEUE rlimits implementation places the counters in user_struct [1]. These limits are global between processes and persists for the lifetime of the process, even if processes are in different user namespaces. To illustrate the impact of rlimits, let's say there is a program that does not fork. Some service-A wants to run this program as user X in multiple containers. Since the program never fork the service wants to set RLIMIT_NPROC=1. service-A \- program (uid=1000, container1, rlimit_nproc=1) \- program (uid=1000, container2, rlimit_nproc=1) The service-A sets RLIMIT_NPROC=1 and runs the program in container1. When the service-A tries to run a program with RLIMIT_NPROC=1 in container2 it fails since user X already has one running process. The problem is not that the limit from container1 affects container2. The problem is that limit is verified against the global counter that reflects the number of processes in all containers. This problem can be worked around by using different users for each container but in this case we face a different problem of uid mapping when transferring files from one container to another. Eric W. Biederman mentioned this issue [2][3]. Introduced changes ------------------ To address the problem, we bind rlimit counters to user namespace. Each counter reflects the number of processes in a given uid in a given user namespace. The result is a tree of rlimit counters with the biggest value at the root (aka init_user_ns). The limit is considered exceeded if it's exceeded up in the tree. [1]: https://lore.kernel.org/containers/[email protected]/ [2]: https://lists.linuxfoundation.org/pipermail/containers/2020-August/042096.html [3]: https://lists.linuxfoundation.org/pipermail/containers/2020-October/042524.html Changelog --------- v11: * Revert most of changes in signal.c to fix performance issues and remove unnecessary memory allocations. * Fixed issue found by lkp robot (again). v10: * Fixed memory leak in __sigqueue_alloc. * Handled an unlikely situation when all consumers will return ucounts at once. * Addressed other review comments from Eric W. Biederman. v9: * Used a negative value to check that the ucounts->count is close to overflow. * Rebased onto v5.12-rc4. v8: * Used atomic_t for ucounts reference counting. Also added counter overflow check (thanks to Linus Torvalds for the idea). * Fixed other issues found by lkp-tests project in the patch that Reimplements RLIMIT_MEMLOCK on top of ucounts. v7: * Fixed issues found by lkp-tests project in the patch that Reimplements RLIMIT_MEMLOCK on top of ucounts. v6: * Fixed issues found by lkp-tests project. * Rebased onto v5.11. v5: * Split the first commit into two commits: change ucounts.count type to atomic_long_t and add ucounts to cred. These commits were merged by mistake during the rebase. * The __get_ucounts() renamed to alloc_ucounts(). * The cred.ucounts update has been moved from commit_creds() as it did not allow to handle errors. * Added error handling of set_cred_ucounts(). v4: * Reverted the type change of ucounts.count to refcount_t. * Fixed typo in the kernel/cred.c v3: * Added get_ucounts() function to increase the reference count. The existing get_counts() function renamed to __get_ucounts(). * The type of ucounts.count changed from atomic_t to refcount_t. * Dropped 'const' from set_cred_ucounts() arguments. * Fixed a bug with freeing the cred structure after calling cred_alloc_blank(). * Commit messages have been updated. * Added selftest. v2: * RLIMIT_MEMLOCK, RLIMIT_SIGPENDING and RLIMIT_MSGQUEUE are migrated to ucounts. * Added ucounts for pair uid and user namespace into cred. * Added the ability to increase ucount by more than 1. v1: * After discussion with Eric W. Biederman, I increased the size of ucounts to atomic_long_t. * Added ucount_max to avoid the fork bomb. -- Alexey Gladkov (9): Increase size of ucounts to atomic_long_t Add a reference to ucounts for each cred Use atomic_t for ucounts reference counting Reimplement RLIMIT_NPROC on top of ucounts Reimplement RLIMIT_MSGQUEUE on top of ucounts Reimplement RLIMIT_SIGPENDING on top of ucounts Reimplement RLIMIT_MEMLOCK on top of ucounts kselftests: Add test to check for rlimit changes in different user namespaces ucounts: Set ucount_max to the largest positive value the type can hold fs/exec.c | 6 +- fs/hugetlbfs/inode.c | 16 +- fs/proc/array.c | 2 +- include/linux/cred.h | 4 + include/linux/hugetlb.h | 4 +- include/linux/mm.h | 4 +- include/linux/sched/user.h | 7 - include/linux/shmem_fs.h | 2 +- include/linux/signal_types.h | 4 +- include/linux/user_namespace.h | 31 +++- ipc/mqueue.c | 40 ++--- ipc/shm.c | 26 +-- kernel/cred.c | 50 +++++- kernel/exit.c | 2 +- kernel/fork.c | 18 +- kernel/signal.c | 25 +-- kernel/sys.c | 14 +- kernel/ucount.c | 116 ++++++++++--- kernel/user.c | 3 - kernel/user_namespace.c | 9 +- mm/memfd.c | 4 +- mm/mlock.c | 22 ++- mm/mmap.c | 4 +- mm/shmem.c | 10 +- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/rlimits/.gitignore | 2 + tools/testing/selftests/rlimits/Makefile | 6 + tools/testing/selftests/rlimits/config | 1 + .../selftests/rlimits/rlimits-per-userns.c | 161 ++++++++++++++++++ 29 files changed, 467 insertions(+), 127 deletions(-) create mode 100644 tools/testing/selftests/rlimits/.gitignore create mode 100644 tools/testing/selftests/rlimits/Makefile create mode 100644 tools/testing/selftests/rlimits/config create mode 100644 tools/testing/selftests/rlimits/rlimits-per-userns.c Link: https://lkml.kernel.org/r/[email protected]
2021-04-30ucounts: Set ucount_max to the largest positive value the type can holdAlexey Gladkov3-8/+14
The ns->ucount_max[] is signed long which is less than the rlimit size. We have to protect ucount_max[] from overflow and only use the largest value that we can hold. On 32bit using "long" instead of "unsigned long" to hold the counts has the downside that RLIMIT_MSGQUEUE and RLIMIT_MEMLOCK are limited to 2GiB instead of 4GiB. I don't think anyone cares but it should be mentioned in case someone does. The RLIMIT_NPROC and RLIMIT_SIGPENDING used atomic_t so their maximum hasn't changed. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/1825a5dfa18bc5a570e79feb05e2bd07fd57e7e3.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30kselftests: Add test to check for rlimit changes in different user namespacesAlexey Gladkov5-0/+171
The testcase runs few instances of the program with RLIMIT_NPROC=1 from user uid=60000, in different user namespaces. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/28cafdcdd4abd8494b34a27f1970b666b30de8bf.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Reimplement RLIMIT_MEMLOCK on top of ucountsAlexey Gladkov15-45/+53
The rlimit counter is tied to uid in the user_namespace. This allows rlimit values to be specified in userns even if they are already globally exceeded by the user. However, the value of the previous user_namespaces cannot be exceeded. Changelog v11: * Fix issue found by lkp robot. v8: * Fix issues found by lkp-tests project. v7: * Keep only ucounts for RLIMIT_MEMLOCK checks instead of struct cred. v6: * Fix bug in hugetlb_file_setup() detected by trinity. Reported-by: kernel test robot <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/970d50c70c71bfd4496e0e8d2a0a32feebebb350.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Reimplement RLIMIT_SIGPENDING on top of ucountsAlexey Gladkov9-16/+21
The rlimit counter is tied to uid in the user_namespace. This allows rlimit values to be specified in userns even if they are already globally exceeded by the user. However, the value of the previous user_namespaces cannot be exceeded. Changelog v11: * Revert most of changes to fix performance issues. v10: * Fix memory leak on get_ucounts failure. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/df9d7764dddd50f28616b7840de74ec0f81711a8.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Reimplement RLIMIT_MSGQUEUE on top of ucountsAlexey Gladkov6-23/+25
The rlimit counter is tied to uid in the user_namespace. This allows rlimit values to be specified in userns even if they are already globally exceeded by the user. However, the value of the previous user_namespaces cannot be exceeded. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/2531f42f7884bbfee56a978040b3e0d25cdf6cde.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Reimplement RLIMIT_NPROC on top of ucountsAlexey Gladkov11-15/+73
The rlimit counter is tied to uid in the user_namespace. This allows rlimit values to be specified in userns even if they are already globally exceeded by the user. However, the value of the previous user_namespaces cannot be exceeded. To illustrate the impact of rlimits, let's say there is a program that does not fork. Some service-A wants to run this program as user X in multiple containers. Since the program never fork the service wants to set RLIMIT_NPROC=1. service-A \- program (uid=1000, container1, rlimit_nproc=1) \- program (uid=1000, container2, rlimit_nproc=1) The service-A sets RLIMIT_NPROC=1 and runs the program in container1. When the service-A tries to run a program with RLIMIT_NPROC=1 in container2 it fails since user X already has one running process. We cannot use existing inc_ucounts / dec_ucounts because they do not allow us to exceed the maximum for the counter. Some rlimits can be overlimited by root or if the user has the appropriate capability. Changelog v11: * Change inc_rlimit_ucounts() which now returns top value of ucounts. * Drop inc_rlimit_ucounts_and_test() because the return code of inc_rlimit_ucounts() can be checked. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/c5286a8aa16d2d698c222f7532f3d735c82bc6bc.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Use atomic_t for ucounts reference countingAlexey Gladkov2-36/+21
The current implementation of the ucounts reference counter requires the use of spin_lock. We're going to use get_ucounts() in more performance critical areas like a handling of RLIMIT_SIGPENDING. Now we need to use spin_lock only if we want to change the hashtable. v10: * Always try to put ucounts in case we cannot increase ucounts->count. This will allow to cover the case when all consumers will return ucounts at once. v9: * Use a negative value to check that the ucounts->count is close to overflow. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/94d1dbecab060a6b116b0a2d1accd8ca1bbb4f5f.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Add a reference to ucounts for each credAlexey Gladkov8-3/+108
For RLIMIT_NPROC and some other rlimits the user_struct that holds the global limit is kept alive for the lifetime of a process by keeping it in struct cred. Adding a pointer to ucounts in the struct cred will allow to track RLIMIT_NPROC not only for user in the system, but for user in the user_namespace. Updating ucounts may require memory allocation which may fail. So, we cannot change cred.ucounts in the commit_creds() because this function cannot fail and it should always return 0. For this reason, we modify cred.ucounts before calling the commit_creds(). Changelog v6: * Fix null-ptr-deref in is_ucounts_overlimit() detected by trinity. This error was caused by the fact that cred_alloc_blank() left the ucounts pointer empty. Reported-by: kernel test robot <[email protected]> Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/b37aaef28d8b9b0d757e07ba6dd27281bbe39259.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-30Increase size of ucounts to atomic_long_tAlexey Gladkov2-10/+10
RLIMIT_MSGQUEUE and RLIMIT_MEMLOCK use unsigned long to store their counters. As a preparation for moving rlimits based on ucounts, we need to increase the size of the variable to long. Signed-off-by: Alexey Gladkov <[email protected]> Link: https://lkml.kernel.org/r/257aa5fb1a7d81cf0f4c34f39ada2320c4284771.1619094428.git.legion@kernel.org Signed-off-by: Eric W. Biederman <[email protected]>
2021-04-25Linux 5.12Linus Torvalds1-1/+1
2021-04-25Merge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of ↵Linus Torvalds4-6/+10
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix potential NULL pointer dereference in the auxtrace option parser - Fix access to PID in an array when setting a PID filter in 'perf ftrace' - Fix error return code in the 'perf data' tool and in maps__clone(), found using a static analysis tool from Huawei * tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf map: Fix error return code in maps__clone() perf ftrace: Fix access to pid in array when setting a pid filter perf auxtrace: Fix potential NULL pointer dereference perf data: Fix error return code in perf_data__create_dir()
2021-04-25Merge tag 'perf_urgent_for_v5.12' of ↵Linus Torvalds2-36/+27
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Borislav Petkov: - Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs - Fix a panic when initializing perf uncore machinery on Haswell and Broadwell servers * tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[] perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3
2021-04-25Merge tag 'locking_urgent_for_v5.12' of ↵Linus Torvalds1-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Borislav Petkov: "Fix ordering in the queued writer lock's slowpath" * tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
2021-04-25Merge tag 'sched_urgent_for_v5.12' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: "Fix a typo in a macro ifdeffery" * tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: preempt/dynamic: Fix typo in macro conditional statement
2021-04-25Merge tag 'x86_urgent_for_v5.12' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: "Fix an out-of-bounds memory access when setting up a crash kernel with kexec" * tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access
2021-04-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-11/+9
Pull kvm fix from Paolo Bonzini: "Fix SRCU bug introduced in the merge window" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/xen: Take srcu lock when accessing kvm_memslots()
2021-04-24Revert "net/rds: Avoid potential use after free in rds_send_remove_from_sock"Linus Torvalds2-2/+1
This reverts commit 0c85a7e87465f2d4cbc768e245f4f45b2f299b05. The games with 'rm' are on (two separate instances) of a local variable, and make no difference. Quoting Aditya Pakki: "I was the author of the patch and it was the cause of the giant UMN revert. The patch is garbage and I was unaware of the steps involved in retracting it. I *believed* the maintainers would pull it, given it was already under Greg's list. The patch does not introduce any bugs but is pointless and is stupid. I accept my incompetence and for not requesting a revert earlier." Link: https://lwn.net/Articles/854319/ Requested-by: Aditya Pakki <[email protected]> Cc: Santosh Shilimkar <[email protected]> Cc: David S. Miller <[email protected]> Cc: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-23Merge tag 'pinctrl-v5.12-3' of ↵Linus Torvalds2-9/+11
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "Late pin control fixes, would have been in the main pull request normally but hey I got lucky and we got another week to polish up v5.12 so here we go. One driver fix and one making the core debugfs work: - Fix the number of pins in the community of the Intel Lewisburg SoC - Show pin numbers for controllers with base = 0 in the new debugfs feature" * tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: core: Show pin numbers for the controllers with base = 0 pinctrl: lewisburg: Update number of pins in community
2021-04-23Merge branch 'akpm' (patches from Andrew)Linus Torvalds4-29/+27
Merge misc fixes from Andrew Morton: "5 patches. Subsystems affected by this patch series: coda, overlayfs, and mm (pagecache and memcg)" * emailed patches from Andrew Morton <[email protected]>: tools/cgroup/slabinfo.py: updated to work on current kernel mm/filemap: fix mapping_seek_hole_data on THP & 32-bit mm/filemap: fix find_lock_entries hang on 32-bit THP ovl: fix reference counting in ovl_mmap error path coda: fix reference counting in coda_file_mmap error path
2021-04-23Merge tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+2
Pull block fix from Jens Axboe: "A single fix for a behavioral regression in this series, when re-reading the partition table with partitions open" * tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block: block: return -EBUSY when there are open partitions in blkdev_reread_part
2021-04-23tools/cgroup/slabinfo.py: updated to work on current kernelVasily Averin1-4/+4
slabinfo.py script does not work with actual kernel version. First, it was unable to recognise SLUB susbsytem, and when I specified it manually it failed again with AttributeError: 'struct page' has no member 'obj_cgroups' .. and then again with File "tools/cgroup/memcg_slabinfo.py", line 221, in main memcg.kmem_caches.address_of_(), AttributeError: 'struct mem_cgroup' has no member 'kmem_caches' Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Tested-by: Roman Gushchin <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-23mm/filemap: fix mapping_seek_hole_data on THP & 32-bitHugh Dickins1-10/+11
No problem on 64-bit, or without huge pages, but xfstests generic/285 and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on 32-bit architectures, with the new mapping_seek_hole_data(). Several different bugs turned out to need fixing. u64 cast to stop losing bits when converting unsigned long to loff_t (and let's use shifts throughout, rather than mixed with * and /). Use round_up() when advancing pos, to stop assuming that pos was already THP-aligned when advancing it by THP-size. (This use of round_up() assumes that any THP has THP-aligned index: true at present and true going forward, but could be recoded to avoid the assumption.) Use xas_set() when iterating away from a THP, so that xa_index stays in synch with start, instead of drifting away to return bogus offset. Check start against end to avoid wrapping 32-bit xa_index to 0 (and to handle these additional cases, seek_data or not, it's easier to break the loop than goto: so rearrange exit from the function). [[email protected]: remove unneeded u64 casts, per Matthew] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 41139aa4c3a3 ("mm/filemap: add mapping_seek_hole_data") Signed-off-by: Hugh Dickins <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Jan Kara <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: William Kucharski <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-23mm/filemap: fix find_lock_entries hang on 32-bit THPHugh Dickins1-2/+8
No problem on 64-bit, or without huge pages, but xfstests generic/308 hung uninterruptibly on 32-bit huge tmpfs. Since commit 0cc3b0ec23ce ("Clarify (and fix) in 4.13 MAX_LFS_FILESIZE macros"), MAX_LFS_FILESIZE is only a PAGE_SIZE away from wrapping 32-bit xa_index to 0, so the new find_lock_entries() has to be extra careful when handling a THP. Link: https://lkml.kernel.org/r/[email protected] Fixes: 5c211ba29deb ("mm: add and use find_lock_entries") Signed-off-by: Hugh Dickins <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: William Kucharski <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Jan Kara <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-23ovl: fix reference counting in ovl_mmap error pathChristian König1-10/+1
mmap_region() now calls fput() on the vma->vm_file. Fix this by using vma_set_file() so it doesn't need to be handled manually here any more. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1527f926fd04 ("mm: mmap: fix fput in error path v2") Signed-off-by: Christian König <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Cc: Jan Harkes <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: <[email protected]> [5.11+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-23coda: fix reference counting in coda_file_mmap error pathChristian König1-3/+3
mmap_region() now calls fput() on the vma->vm_file. So we need to drop the extra reference on the coda file instead of the host file. Link: https://lkml.kernel.org/r/[email protected] Fixes: 1527f926fd04 ("mm: mmap: fix fput in error path v2") Signed-off-by: Christian König <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Acked-by: Jan Harkes <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: <[email protected]> [5.11+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-04-23KVM: x86/xen: Take srcu lock when accessing kvm_memslots()Wanpeng Li1-11/+9
kvm_memslots() will be called by kvm_write_guest_offset_cached() so we should take the srcu lock. Let's pull the srcu lock operation from kvm_steal_time_set_preempted() again to fix xen part. Fixes: 30b5c851af7 ("KVM: x86/xen: Add support for vCPU runstate information") Signed-off-by: Wanpeng Li <[email protected]> Message-Id: <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-04-23Merge tag 'arm-fixes-5.12-4' of ↵Linus Torvalds8-6/+14
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "These should be the final fixes for v5.12. There is one fix for SD card detection on one Allwinner board, and a few fixes for the Tegra platform that I had already queued up for v5.13 due to a communication problem. This addresses MMC device ordering on multiple machines, audio support on Jetson AGX Xavier and suspend/resume on Jetson TX2" * tag 'arm-fixes-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS arm64: tegra: Move clocks from RT5658 endpoint to device node arm64: tegra: Fix mmc0 alias for Jetson Xavier NX arm64: tegra: Set fw_devlink=on for Jetson TX2 arm64: tegra: Add unit-address for ACONNECT on Tegra186
2021-04-23perf map: Fix error return code in maps__clone()Zhen Lei1-2/+5
Although 'err' has been initialized to -ENOMEM, but it will be reassigned by the "err = unwind__prepare_access(...)" statement in the for loop. So that, the value of 'err' is unknown when map__clone() failed. Fixes: 6c502584438bda63 ("perf unwind: Call unwind__prepare_access for forked thread") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhen Lei <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: zhen lei <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-23perf ftrace: Fix access to pid in array when setting a pid filterThomas Richter1-1/+1
Command 'perf ftrace -v -- ls' fails in s390 (at least 5.12.0rc6). The root cause is a missing pointer dereference which causes an array element address to be used as PID. Fix this by extracting the PID. Output before: # ./perf ftrace -v -- ls function_graph tracer is used write '-263732416' to tracing/set_ftrace_pid failed: Invalid argument failed to set ftrace pid # Output after: ./perf ftrace -v -- ls function_graph tracer is used # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 4) | rcu_read_lock_sched_held() { 4) 0.552 us | rcu_lockdep_current_cpu_online(); 4) 6.124 us | } Reported-by: Alexander Schmidt <[email protected]> Signed-off-by: Thomas Richter <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-23perf auxtrace: Fix potential NULL pointer dereferenceLeo Yan1-1/+1
In the function auxtrace_parse_snapshot_options(), the callback pointer "itr->parse_snapshot_options" can be NULL if it has not been set during the AUX record initialization. This can cause tool crashing if the callback pointer "itr->parse_snapshot_options" is dereferenced without performing NULL check. Add a NULL check for the pointer "itr->parse_snapshot_options" before invoke the callback. Fixes: d20031bb63dd6dde ("perf tools: Add AUX area tracing Snapshot Mode") Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tiezhu Yang <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-23Merge tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drmLinus Torvalds5-19/+30
Pull drm fixes from Dave Airlie: "Just some small i915 and amdgpu fixes this week, should be all until you open the merge window. amdgpu: - Fix gpuvm page table update issue - Modifier fixes - Register fix for dimgrey cavefish i915: - GVT's BDW regression fix for cmd parser - Fix modesetting in case of unexpected AUX timeouts" * tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drm: drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish amd/display: allow non-linear multi-planar formats drm/amd/display: Update modifier list for gfx10_3 drm/amdgpu: reserve fence slot to update page table drm/i915: Fix modesetting in case of unexpected AUX timeouts drm/i915/gvt: Fix BDW command parser regression
2021-04-23Merge tag 'gpio-fixes-for-v5.12' of ↵Linus Torvalds2-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: "Save and restore the sysconfig register in gpio-omap to fix a power-management issue" * tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: omap: Save and restore sysconfig
2021-04-23Merge branch 'tegra/dt64' into arm/fixesArnd Bergmann7-5/+13
arm64: tegra: Device tree fixes for v5.12-rc6 This contains a couple of device tree fixes for the v5.12 release cycle. These are needed for proper audio support on Jetson AGX Xavier, to boot the Jetson Xavier NX from an SD card and to be able to suspend/resume the Jetson TX2. * tegra/dt64: arm64: tegra: Move clocks from RT5658 endpoint to device node arm64: tegra: Fix mmc0 alias for Jetson Xavier NX arm64: tegra: Set fw_devlink=on for Jetson TX2 arm64: tegra: Add unit-address for ACONNECT on Tegra186 Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/ Signed-off-by: Arnd Bergmann <[email protected]>
2021-04-23Merge tag 'drm-intel-fixes-2021-04-22' of ↵Dave Airlie2-7/+15
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - GVT's BDW regression fix for cmd parser (Zhenyu) - Fix modesetting in case of unexpected AUX timeouts (Imre) Signed-off-by: Dave Airlie <[email protected]> From: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-04-23Merge tag 'amd-drm-fixes-5.12-2021-04-21' of ↵Dave Airlie3-12/+15
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-5.12-2021-04-21: amdgpu: - Fix gpuvm page table update issue - Modifier fixes - Register fix for dimgrey cavefish Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-04-22Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2-2/+8
Pull virtio fixes from Michael Tsirkin: "Very late in the cycle but both risky if left unfixed and more or less obvious.." * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails vhost-vdpa: protect concurrent access to vhost device iotlb
2021-04-22vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs failsEli Cohen1-1/+3
Set err = -ENOMEM if dma_map_sg_attrs() fails so the function reutrns error. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Signed-off-by: Eli Cohen <[email protected]> Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]>
2021-04-22vhost-vdpa: protect concurrent access to vhost device iotlbXie Yongji1-1/+5
Protect vhost device iotlb by vhost_dev->mutex. Otherwise, it might cause corruption of the list and interval tree in struct vhost_iotlb if userspace sends the VHOST_IOTLB_MSG_V2 message concurrently. Fixes: 4c8cf318("vhost: introduce vDPA-based backend") Cc: [email protected] Signed-off-by: Xie Yongji <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael S. Tsirkin <[email protected]>
2021-04-22Merge tag 'sunxi-fixes-for-5.12-2' of ↵Arnd Bergmann1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into arm/fixes One fix for the MMC card detect on the Pine H64 board * tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux: arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS Link: https://lore.kernel.org/r/45fc5e4d-ef48-4729-a869-79a8f288bb83.lettre@localhost Signed-off-by: Arnd Bergmann <[email protected]>
2021-04-22Merge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmddLinus Torvalds1-1/+1
Pull tpm fix from James Bottomley: "This is an urgent regression fix for a tpm patch set that went in this merge window. It looks like a rebase before the original pull request lost a tpm_try_get_ops() so we have a lock imbalance in our code which is causing oopses. The original patch was correct on the mailing list. I'm sending this in agreement with Mimi (as joint maintainers of trusted keys) because Jarkko is off communing with the Reindeer or whatever it is Finns do when on holiday" * tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd: KEYS: trusted: Fix TPM reservation for seal/unseal
2021-04-22perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]Jim Mattson1-1/+1
The only stepping of Broadwell Xeon parts is stepping 1. Fix the relevant isolation_ucodes[] entry, which previously enumerated stepping 2. Although the original commit was characterized as an optimization, it is also a workaround for a correctness issue. If a PMI arrives between kvm's call to perf_guest_get_msrs() and the subsequent VM-entry, a stale value for the IA32_PEBS_ENABLE MSR may be restored at the next VM-exit. This is because, unbeknownst to kvm, PMI throttling may clear bits in the IA32_PEBS_ENABLE MSR. CPUs with "PEBS isolation" don't suffer from this issue, because perf_guest_get_msrs() doesn't report the IA32_PEBS_ENABLE value. Fixes: 9b545c04abd4f ("perf/x86/kvm: Avoid unnecessary work in guest filtering") Signed-off-by: Jim Mattson <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Peter Shier <[email protected]> Acked-by: Andi Kleen <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-22arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTSAndre Przywara1-1/+1
Commit 941432d00768 ("arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card") enabled the card detect GPIO for the SOPine module, along the way with the Pine64-LTS, which share the same base .dtsi. This was based on the observation that the Pine64-LTS has as "push-push" SD card socket, and that the schematic mentions the card detect GPIO. After having received two reports about failing SD card access with that patch, some more research and polls on that subject revealed that there are at least two different versions of the Pine64-LTS out there: - On some boards (including mine) the card detect pin is "stuck" at high, regardless of an microSD card being inserted or not. - On other boards the card-detect is working, but is active-high, by virtue of an explicit inverter circuit, as shown in the schematic. To cover all versions of the board out there, and don't take any chances, let's revert the introduction of the active-low CD GPIO, but let's use the broken-cd property for the Pine64-LTS this time. That should avoid regressions and should work for everyone, even allowing SD card changes now. The SOPine card detect has proven to be working, so let's keep that GPIO in place. Fixes: 941432d00768 ("arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card") Reported-by: Michael Weiser <[email protected]> Reported-by: Daniel Kulesz <[email protected]> Suggested-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Andre Przywara <[email protected]> Tested-by: Michael Weiser <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-04-22pinctrl: core: Show pin numbers for the controllers with base = 0Andy Shevchenko1-6/+8
The commit f1b206cf7c57 ("pinctrl: core: print gpio in pins debugfs file") enabled GPIO pin number and label in debugfs for pin controller. However, it limited that feature to the chips where base is positive number. This, in particular, excluded chips where base is 0 for the historical or backward compatibility reasons. Refactor the code to include the latter as well. Fixes: f1b206cf7c57 ("pinctrl: core: print gpio in pins debugfs file") Cc: Drew Fustini <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Tested-by: Drew Fustini <[email protected]> Reviewed-by: Drew Fustini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Walleij <[email protected]>
2021-04-21KEYS: trusted: Fix TPM reservation for seal/unsealJames Bottomley1-1/+1
The original patch 8c657a0590de ("KEYS: trusted: Reserve TPM for seal and unseal operations") was correct on the mailing list: https://lore.kernel.org/linux-integrity/[email protected]/ But somehow got rebased so that the tpm_try_get_ops() in tpm2_seal_trusted() got lost. This causes an imbalanced put of the TPM ops and causes oopses on TIS based hardware. This fix puts back the lost tpm_try_get_ops() Fixes: 8c657a0590de ("KEYS: trusted: Reserve TPM for seal and unseal operations") Reported-by: Mimi Zohar <[email protected]> Acked-by: Mimi Zohar <[email protected]> Signed-off-by: James Bottomley <[email protected]>
2021-04-21Merge tag 'mmc-v5.12-rc5' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "Replace WARN_ONCE with dev_warn_once for non-optimal sg-alignment in the meson-gx host driver" * tag 'mmc-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: meson-gx: replace WARN_ONCE with dev_warn_once about scatterlist size alignment in block mode
2021-04-21block: return -EBUSY when there are open partitions in blkdev_reread_partChristoph Hellwig1-0/+2
The switch to go through blkdev_get_by_dev means we now ignore the return value from bdev_disk_changed in __blkdev_get. Add a manual check to restore the old semantics. Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part") Reported-by: Karel Zak <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-04-21drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefishJiansong Chen1-1/+1
dimgrey_cavefish has similar gc_10_3 ip with sienna_cichlid, so follow its registers offset setting. Signed-off-by: Jiansong Chen <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-04-21amd/display: allow non-linear multi-planar formatsSimon Ser1-7/+4
Accept non-linear buffers which use a multi-planar format, as long as they don't use DCC. Tested on GFX9 with NV12. Signed-off-by: Simon Ser <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Nicholas Kazlauskas <[email protected]> Cc: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]