aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2020-08-05selftests/powerpc: Fix pkey syscall redefinitionsSandipan Das1-6/+6
On distros using older glibc versions, the pkey tests encounter build failures due to redefinition of the pkey syscall numbers. For compatibility, commit 743f3544fffb added a wrapper for the gettid() syscall and included syscall.h if the version of glibc used is older than 2.30. This leads to different definitions of SYS_pkey_* as the ones in the pkey test header set numeric constants where as the ones from syscall.h reuse __NR_pkey_*. The compiler complains about redefinitions since they are different. This replaces SYS_pkey_* definitions with __NR_pkey_* such that the definitions in both syscall.h and pkeys.h are alike. This way, if syscall.h has to be included for compatibility reasons, builds will still succeed. Fixes: 743f3544fffb ("selftests/powerpc: Add wrapper for gettid") Reported-by: Sachin Sant <[email protected]> Suggested-by: David Laight <[email protected]> Suggested-by: Michael Ellerman <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a4956d838bf59b0a71a2553c5ca81131ea8b49b9.1596561758.git.sandipan@linux.ibm.com
2020-08-04Merge tag 'close-range-v5.9' of ↵Linus Torvalds4-0/+236
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull close_range() implementation from Christian Brauner: "This adds the close_range() syscall. It allows to efficiently close a range of file descriptors up to all file descriptors of a calling task. This is coordinated with the FreeBSD folks which have copied our version of this syscall and in the meantime have already merged it in April 2019: https://reviews.freebsd.org/D21627 https://svnweb.freebsd.org/base?view=revision&revision=359836 The syscall originally came up in a discussion around the new mount API and making new file descriptor types cloexec by default. During this discussion, Al suggested the close_range() syscall. First, it helps to close all file descriptors of an exec()ing task. This can be done safely via (quoting Al's example from [1] verbatim): /* that exec is sensitive */ unshare(CLONE_FILES); /* we don't want anything past stderr here */ close_range(3, ~0U); execve(....); The code snippet above is one way of working around the problem that file descriptors are not cloexec by default. This is aggravated by the fact that we can't just switch them over without massively regressing userspace. For a whole class of programs having an in-kernel method of closing all file descriptors is very helpful (e.g. demons, service managers, programming language standard libraries, container managers etc.). Second, it allows userspace to avoid implementing closing all file descriptors by parsing through /proc/<pid>/fd/* and calling close() on each file descriptor and other hacks. From looking at various large(ish) userspace code bases this or similar patterns are very common in service managers, container runtimes, and programming language runtimes/standard libraries such as Python or Rust. In addition, the syscall will also work for tasks that do not have procfs mounted and on kernels that do not have procfs support compiled in. In such situations the only way to make sure that all file descriptors are closed is to call close() on each file descriptor up to UINT_MAX or RLIMIT_NOFILE, OPEN_MAX trickery. Based on Linus' suggestion close_range() also comes with a new flag CLOSE_RANGE_UNSHARE to more elegantly handle file descriptor dropping right before exec. This would usually be expressed in the sequence: unshare(CLONE_FILES); close_range(3, ~0U); as pointed out by Linus it might be desirable to have this be a part of close_range() itself under a new flag CLOSE_RANGE_UNSHARE which gets especially handy when we're closing all file descriptors above a certain threshold. Test-suite as always included" * tag 'close-range-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLOSE_RANGE_UNSHARE tests close_range: add CLOSE_RANGE_UNSHARE tests: add close_range() tests arch: wire-up close_range() open: add close_range()
2020-08-04Merge tag 'cap-checkpoint-restore-v5.9' of ↵Linus Torvalds3-1/+186
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull checkpoint-restore updates from Christian Brauner: "This enables unprivileged checkpoint/restore of processes. Given that this work has been going on for quite some time the first sentence in this summary is hopefully more exciting than the actual final code changes required. Unprivileged checkpoint/restore has seen a frequent increase in interest over the last two years and has thus been one of the main topics for the combined containers & checkpoint/restore microconference since at least 2018 (cf. [1]). Here are just the three most frequent use-cases that were brought forward: - The JVM developers are integrating checkpoint/restore into a Java VM to significantly decrease the startup time. - In high-performance computing environment a resource manager will typically be distributing jobs where users are always running as non-root. Long-running and "large" processes with significant startup times are supposed to be checkpointed and restored with CRIU. - Container migration as a non-root user. In all of these scenarios it is either desirable or required to run without CAP_SYS_ADMIN. The userspace implementation of checkpoint/restore CRIU already has the pull request for supporting unprivileged checkpoint/restore up (cf. [2]). To enable unprivileged checkpoint/restore a new dedicated capability CAP_CHECKPOINT_RESTORE is introduced. This solution has last been discussed in 2019 in a talk by Google at Linux Plumbers (cf. [1] "Update on Task Migration at Google Using CRIU") with Adrian and Nicolas providing the implementation now over the last months. In essence, this allows the CRIU binary to be installed with the CAP_CHECKPOINT_RESTORE vfs capability set thereby enabling unprivileged users to restore processes. To make this possible the following permissions are altered: - Selecting a specific PID via clone3() set_tid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Selecting a specific PID via /proc/sys/kernel/ns_last_pid relaxed from userns CAP_SYS_ADMIN to CAP_CHECKPOINT_RESTORE. - Accessing /proc/pid/map_files relaxed from init userns CAP_SYS_ADMIN to init userns CAP_CHECKPOINT_RESTORE. - Changing /proc/self/exe from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. Of these four changes the /proc/self/exe change deserves a few words because the reasoning behind even restricting /proc/self/exe changes in the first place is just full of historical quirks and tracking this down was a questionable version of fun that I'd like to spare others. In short, it is trivial to change /proc/self/exe as an unprivileged user, i.e. without userns CAP_SYS_ADMIN right now. Either via ptrace() or by simply intercepting the elf loader in userspace during exec. Nicolas was nice enough to even provide a POC for the latter (cf. [3]) to illustrate this fact. The original patchset which introduced PR_SET_MM_MAP had no permissions around changing the exe link. They too argued that it is trivial to spoof the exe link already which is true. The argument brought up against this was that the Tomoyo LSM uses the exe link in tomoyo_manager() to detect whether the calling process is a policy manager. This caused changing the exe links to be guarded by userns CAP_SYS_ADMIN. All in all this rather seems like a "better guard it with something rather than nothing" argument which imho doesn't qualify as a great security policy. Again, because spoofing the exe link is possible for the calling process so even if this were security relevant it was broken back then and would be broken today. So technically, dropping all permissions around changing the exe link would probably be possible and would send a clearer message to any userspace that relies on /proc/self/exe for security reasons that they should stop doing this but for now we're only relaxing the exe link permissions from userns CAP_SYS_ADMIN to userns CAP_CHECKPOINT_RESTORE. There's a final uapi change in here. Changing the exe link used to accidently return EINVAL when the caller lacked the necessary permissions instead of the more correct EPERM. This pr contains a commit fixing this. I assume that userspace won't notice or care and if they do I will revert this commit. But since we are changing the permissions anyway it seems like a good opportunity to try this fix. With these changes merged unprivileged checkpoint/restore will be possible and has already been tested by various users" [1] LPC 2018 1. "Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=12095 2. "Securely Migrating Untrusted Workloads with CRIU" https://www.youtube.com/watch?v=yI_1cuhoDgA&t=14400 LPC 2019 1. "CRIU and the PID dance" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=2m48s 2. "Update on Task Migration at Google Using CRIU" https://www.youtube.com/watch?v=LN2CUgp8deo&list=PLVsQ_xZBEyN30ZA3Pc9MZMFzdjwyz26dO&index=9&t=1h2m8s [2] https://github.com/checkpoint-restore/criu/pull/1155 [3] https://github.com/nviennot/run_as_exe * tag 'cap-checkpoint-restore-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: selftests: add clone3() CAP_CHECKPOINT_RESTORE test prctl: exe link permission error changed from -EINVAL to -EPERM prctl: Allow local CAP_CHECKPOINT_RESTORE to change /proc/self/exe proc: allow access in init userns for map_files with CAP_CHECKPOINT_RESTORE pid_namespace: use checkpoint_restore_ns_capable() for ns_last_pid pid: use checkpoint_restore_ns_capable() for set_tid capabilities: Introduce CAP_CHECKPOINT_RESTORE
2020-08-04Merge tag 'threads-v5.9' of ↵Linus Torvalds2-0/+80
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull thread updates from Christian Brauner: "This contains the changes to add the missing support for attaching to time namespaces via pidfds. Last cycle setns() was changed to support attaching to multiple namespaces atomically. This requires all namespaces to have a point of no return where they can't fail anymore. Specifically, <namespace-type>_install() is allowed to perform permission checks and install the namespace into the new struct nsset that it has been given but it is not allowed to make visible changes to the affected task. Once <namespace-type>_install() returns, anything that the given namespace type additionally requires to be setup needs to ideally be done in a function that can't fail or if it fails the failure must be non-fatal. For time namespaces the relevant functions that fell into this category were timens_set_vvar_page() and vdso_join_timens(). The latter could still fail although it didn't need to. This function is only implemented for vdso_join_timens() in current mainline. As discussed on-list (cf. [1]), in order to make setns() support time namespaces when attaching to multiple namespaces at once properly we changed vdso_join_timens() to always succeed. So vdso_join_timens() replaces the mmap_write_lock_killable() with mmap_read_lock(). Please note that arm is about to grow vdso support for time namespaces (possibly this merge window). We've synced on this change and arm64 also uses mmap_read_lock(), i.e. makes vdso_join_timens() a function that can't fail. Once the changes here and the arm64 changes have landed, vdso_join_timens() should be turned into a void function so it's obvious to callers and implementers on other architectures that the expectation is that it can't fail. We didn't do this right away because it would've introduced unnecessary merge conflicts between the two trees for no major gain. As always, tests included" [1]: https://lore.kernel.org/lkml/20200611110221.pgd3r5qkjrjmfqa2@wittgenstein * tag 'threads-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add CLONE_NEWTIME setns tests nsproxy: support CLONE_NEWTIME with setns() timens: add timens_commit() helper timens: make vdso_join_timens() always succeed
2020-08-04Merge tag 'seccomp-v5.9-rc1' of ↵Linus Torvalds5-232/+573
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: "There are a bunch of clean ups and selftest improvements along with two major updates to the SECCOMP_RET_USER_NOTIF filter return: EPOLLHUP support to more easily detect the death of a monitored process, and being able to inject fds when intercepting syscalls that expect an fd-opening side-effect (needed by both container folks and Chrome). The latter continued the refactoring of __scm_install_fd() started by Christoph, and in the process found and fixed a handful of bugs in various callers. - Improved selftest coverage, timeouts, and reporting - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner) - Refactor __scm_install_fd() into __receive_fd() and fix buggy callers - Introduce 'addfd' command for SECCOMP_RET_USER_NOTIF (Sargun Dhillon)" * tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits) selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD seccomp: Introduce addfd ioctl to seccomp user notifier fs: Expand __receive_fd() to accept existing fd pidfd: Replace open-coded receive_fd() fs: Add receive_fd() wrapper for __receive_fd() fs: Move __scm_install_fd() to __receive_fd() net/scm: Regularize compat handling of scm_detach_fds() pidfd: Add missing sock updates for pidfd_getfd() net/compat: Add missing sock updates for SCM_RIGHTS selftests/seccomp: Check ENOSYS under tracing selftests/seccomp: Refactor to use fixture variants selftests/harness: Clean up kern-doc for fixtures seccomp: Use -1 marker for end of mode 1 syscall list seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall() selftests/seccomp: Make kcmp() less required seccomp: Use pr_fmt selftests/seccomp: Improve calibration loop selftests/seccomp: use 90s as timeout selftests/seccomp: Expand benchmark to per-filter measurements ...
2020-08-04Merge tag 'uninit-macro-v5.9-rc1' of ↵Linus Torvalds2-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull uninitialized_var() macro removal from Kees Cook: "This is long overdue, and has hidden too many bugs over the years. The series has several "by hand" fixes, and then a trivial treewide replacement. - Clean up non-trivial uses of uninitialized_var() - Update documentation and checkpatch for uninitialized_var() removal - Treewide removal of uninitialized_var()" * tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: compiler: Remove uninitialized_var() macro treewide: Remove uninitialized_var() usage checkpatch: Remove awareness of uninitialized_var() macro mm/debug_vm_pgtable: Remove uninitialized_var() usage f2fs: Eliminate usage of uninitialized_var() macro media: sur40: Remove uninitialized_var() usage KVM: PPC: Book3S PR: Remove uninitialized_var() usage clk: spear: Remove uninitialized_var() usage clk: st: Remove uninitialized_var() usage spi: davinci: Remove uninitialized_var() usage ide: Remove uninitialized_var() usage rtlwifi: rtl8192cu: Remove uninitialized_var() usage b43: Remove uninitialized_var() usage drbd: Remove uninitialized_var() usage x86/mm/numa: Remove uninitialized_var() usage docs: deprecated.rst: Add uninitialized_var()
2020-08-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2-1/+125
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Flush the cleanup xtables worker to make sure destructors have completed, from Florian Westphal. 2) iifgroup is matching erroneously, also from Florian. 3) Add selftest for meta interface matching, from Florian Westphal. 4) Move nf_ct_offload_timeout() to header, from Roi Dayan. 5) Call nf_ct_offload_timeout() from flow_offload_add() to make sure garbage collection does not evict offloaded flow, from Roi Dayan. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-08-04selftests: pmtu.sh: Add tests for UDP tunnels handled by Open vSwitchStefano Brivio1-0/+180
The new tests check that IP and IPv6 packets exceeding the local PMTU estimate, forwarded by an Open vSwitch instance from another node, result in the correct route exceptions being created, and that communication with end-to-end fragmentation, over GENEVE and VXLAN Open vSwitch ports, is now possible as a result of PMTU discovery. Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-04selftests: pmtu.sh: Add tests for bridged UDP tunnelsStefano Brivio1-7/+159
The new tests check that IP and IPv6 packets exceeding the local PMTU estimate, both locally generated and forwarded by a bridge from another node, result in the correct route exceptions being created, and that communication with end-to-end fragmentation over VXLAN and GENEVE tunnels is now possible as a result of PMTU discovery. Part of the existing setup functions aren't generic enough to simply add a namespace and a bridge to the existing routing setup. This rework is in progress and we can easily shrink this once more generic topology functions are available. Signed-off-by: Stefano Brivio <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-04perf evsel: Don't set sample_regs_intr/sample_regs_user for dummy eventJin Yao1-2/+4
Since commit 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis"), a dummy event is added to capture mmaps. But if we run perf-record as, # perf record -e cycles:p -IXMM0 -a -- sleep 1 Error: dummy:HG: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat' The issue is, if we enable the extended regs (-IXMM0), but the pmu->capabilities is not set with PERF_PMU_CAP_EXTENDED_REGS, the kernel will return -EOPNOTSUPP error. See following code: /* in kernel/events/core.c */ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event) { .... if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) && has_extended_regs(event)) ret = -EOPNOTSUPP; .... } For software dummy event, the PMU should not be set with PERF_PMU_CAP_EXTENDED_REGS. But unfortunately now, the dummy event has possibility to be set with PERF_REG_EXTENDED_MASK bit. In evsel__config, /* tools/perf/util/evsel.c */ if (opts->sample_intr_regs) { attr->sample_regs_intr = opts->sample_intr_regs; } If we use -IXMM0, the attr>sample_regs_intr will be set with PERF_REG_EXTENDED_MASK bit. It doesn't make sense to set attr->sample_regs_intr for a software dummy event. This patch adds dummy event checking before setting attr->sample_regs_intr and attr->sample_regs_user. After: # ./perf record -e cycles:p -IXMM0 -a -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.413 MB perf.data (45 samples) ] Committer notes: Adrian said this when providing his Acked-by: " This is fine. It will not break PT. no_aux_samples is useful for evsels that have been added by the code rather than requested by the user. For old kernels PT adds sched_switch tracepoint to track context switches (before the current context switch event was added) and having auxiliary sample information unnecessarily uses up space in the perf buffer. " Fixes: 0a892c1c9472 ("perf record: Add dummy event during system wide synthesis") Signed-off-by: Jin Yao <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-04perf record: Introduce --control fd:ctl-fd[,ack-fd] optionsAlexey Budankov3-0/+78
Introduce --control fd:ctl-fd[,ack-fd] options to pass open file descriptors numbers from command line. Extend perf-record.txt file with --control fd:ctl-fd[,ack-fd] options description. Document possible usage model introduced by --control fd:ctl-fd[,ack-fd] options by providing example bash shell script. Signed-off-by: Alexey Budankov <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-04perf record: Implement control commands handlingAlexey Budankov1-0/+16
Implement handling of 'enable' and 'disable' control commands coming from control file descriptor. Signed-off-by: Alexey Budankov <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-04perf record: Extend -D,--delay option with -1 valueAlexey Budankov4-8/+13
Extend -D,--delay option with -1 to start collection with events disabled to be enabled later by 'enable' command provided via control file descriptor. Signed-off-by: Alexey Budankov <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-04perf stat: Introduce --control fd:ctl-fd[,ack-fd] optionsAlexey Budankov3-1/+80
Introduce --control fd:ctl-fd[,ack-fd] options to pass open file descriptors numbers from command line. Extend perf-stat.txt file with --control fd:ctl-fd[,ack-fd] options description. Document possible usage model introduced by --control fd:ctl-fd[,ack-fd] options by providing example bash shell script. Signed-off-by: Alexey Budankov <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-03Merge tag 'acpi-5.9-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These eliminate significant AML processing overhead related to using operation regions in system memory, update the ACPICA code in the kernel to upstream revision 20200717 (including a fix to prevent operation region reference counts from overflowing in some cases), remove the last bits of the (long deprecated) ACPI procfs interface and do some assorted cleanups. Specifics: - Eliminate significant AML processing overhead related to using operation regions in system memory by reworking the management of memory mappings in the ACPI code to defer unmap operations (to do them outside of the ACPICA locks, among other things) and making the memory operation reagion handler avoid releasing memory mappings created by it too early (Rafael Wysocki). - Update the ACPICA code in the kernel to upstream revision 20200717: * Prevent operation region reference counts from overflowing in some cases (Erik Kaneda). * Replace one-element array with flexible-array (Gustavo A. R. Silva). - Fix ACPI PCI hotplug reference counting (Rafael Wysocki). - Drop last bits of the ACPI procfs interface (Thomas Renninger). - Drop some redundant checks from the code parsing ACPI tables related to NUMA (Hanjun Guo). - Avoid redundant object evaluation in the ACPI device properties handling code (Heikki Krogerus). - Avoid unecessary memory overhead related to storing the signatures of the ACPI tables recognized by the kernel (Ard Biesheuvel). - Add missing newline characters when printing module parameter values in some places (Xiongfeng Wang). - Update the link to the ACPI specifications in some places (Tiezhu Yang). - Use the fallthrough pseudo-keyword in the ACPI code (Gustavo A. R. Silva). - Drop redundant variable initialization from the APEI code (Colin Ian King). - Drop uninitialized_var() from the ACPI PAD driver (Jason Yan). - Replace HTTP links with HTTPS ones in the ACPI code (Alexander A. Klimov)" * tag 'acpi-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (22 commits) ACPI: APEI: remove redundant assignment to variable rc ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check ACPI: NUMA: Remove the useless sub table pointer check ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array() ACPICA: Update version to 20200717 ACPICA: Do not increment operation_region reference counts for field units ACPICA: Replace one-element array with flexible-array ACPI: Replace HTTP links with HTTPS ones ACPI: Use valid link to the ACPI specification ACPI: OSL: Clean up the removal of unused memory mappings ACPI: OSL: Use deferred unmapping in acpi_os_unmap_iomem() ACPI: OSL: Use deferred unmapping in acpi_os_unmap_generic_address() ACPICA: Preserve memory opregion mappings ACPI: OSL: Implement deferred unmapping of ACPI memory ACPI: Use fallthrough pseudo-keyword PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context() ACPI: tables: avoid relocations for table signature array ACPI: PAD: Eliminate usage of uninitialized_var() macro ACPI: sysfs: add newlines when printing module parameters ACPI: EC: add newline when printing 'ec_event_clearing' module parameter ...
2020-08-03Merge tag 'pm-5.9-rc1' of ↵Linus Torvalds5-112/+159
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The most significant change here is the extension of the Energy Model to cover non-CPU devices (as well as CPUs) from Lukasz Luba. There is also some new hardware support (Ice Lake server idle states table for intel_idle, Sapphire Rapids and Power Limit 4 support in the RAPL driver), some new functionality in the existing drivers (eg. a new switch to disable/enable CPU energy-efficiency optimizations in intel_pstate, delayed timers in devfreq), some assorted fixes (cpufreq core, intel_pstate, intel_idle) and cleanups (eg. cpuidle-psci, devfreq), including the elimination of W=1 build warnings from cpufreq done by Lee Jones. Specifics: - Make the Energy Model cover non-CPU devices (Lukasz Luba). - Add Ice Lake server idle states table to the intel_idle driver and eliminate a redundant static variable from it (Chen Yu, Rafael Wysocki). - Eliminate all W=1 build warnings from cpufreq (Lee Jones). - Add support for Sapphire Rapids and for Power Limit 4 to the Intel RAPL power capping driver (Sumeet Pawnikar, Zhang Rui). - Fix function name in kerneldoc comments in the idle_inject power capping driver (Yangtao Li). - Fix locking issues with cpufreq governors and drop a redundant "weak" function definition from cpufreq (Viresh Kumar). - Rearrange cpufreq to register non-modular governors at the core_initcall level and allow the default cpufreq governor to be specified in the kernel command line (Quentin Perret). - Extend, fix and clean up the intel_pstate driver (Srinivas Pandruvada, Rafael Wysocki): * Add a new sysfs attribute for disabling/enabling CPU energy-efficiency optimizations in the processor. * Make the driver avoid enabling HWP if EPP is not supported. * Allow the driver to handle numeric EPP values in the sysfs interface and fix the setting of EPP via sysfs in the active mode. * Eliminate a static checker warning and clean up a kerneldoc comment. - Clean up some variable declarations in the powernv cpufreq driver (Wei Yongjun). - Fix up the ->enter_s2idle callback definition to cover the case when it points to the same function as ->idle correctly (Neal Liu). - Rearrange and clean up the PSCI cpuidle driver (Ulf Hansson). - Make the PM core emit "changed" uevent when adding/removing the "wakeup" sysfs attribute of devices (Abhishek Pandit-Subedi). - Add a helper macro for declaring PM callbacks and use it in the MMC jz4740 driver (Paul Cercueil). - Fix white space in some places in the hibernate code and make the system-wide PM code use "const char *" where appropriate (Xiang Chen, Alexey Dobriyan). - Add one more "unsafe" helper macro to the freezer to cover the NFS use case (He Zhe). - Change the language in the generic PM domains framework to use parent/child terminology and clean up a typo and some comment fromatting in that code (Kees Cook, Geert Uytterhoeven). - Update the operating performance points OPP framework (Lukasz Luba, Andrew-sh.Cheng, Valdis Kletnieks): * Refactor dev_pm_opp_of_register_em() and update related drivers. * Add a missing function export. * Allow disabled OPPs in dev_pm_opp_get_freq(). - Update devfreq core and drivers (Chanwoo Choi, Lukasz Luba, Enric Balletbo i Serra, Dmitry Osipenko, Kieran Bingham, Marc Zyngier): * Add support for delayed timers to the devfreq core and make the Samsung exynos5422-dmc driver use it. * Unify sysfs interface to use "df-" as a prefix in instance names consistently. * Fix devfreq_summary debugfs node indentation. * Add the rockchip,pmu phandle to the rk3399_dmc driver DT bindings. * List Dmitry Osipenko as the Tegra devfreq driver maintainer. * Fix typos in the core devfreq code. - Update the pm-graph utility to version 5.7 including a number of fixes related to suspend-to-idle (Todd Brandt). - Fix coccicheck errors and warnings in the cpupower utility (Shuah Khan). - Replace HTTP links with HTTPs ones in multiple places (Alexander A. Klimov)" * tag 'pm-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (71 commits) cpuidle: ACPI: fix 'return' with no value build warning cpufreq: intel_pstate: Fix EPP setting via sysfs in active mode cpufreq: intel_pstate: Rearrange the storing of new EPP values intel_idle: Customize IceLake server support PM / devfreq: Fix the wrong end with semicolon PM / devfreq: Fix indentaion of devfreq_summary debugfs node PM / devfreq: Clean up the devfreq instance name in sysfs attr memory: samsung: exynos5422-dmc: Add module param to control IRQ mode memory: samsung: exynos5422-dmc: Adjust polling interval and uptreshold memory: samsung: exynos5422-dmc: Use delayed timer as default PM / devfreq: Add support delayed timer for polling mode dt-bindings: devfreq: rk3399_dmc: Add rockchip,pmu phandle PM / devfreq: tegra: Add Dmitry as a maintainer PM / devfreq: event: Fix trivial spelling PM / devfreq: rk3399_dmc: Fix kernel oops when rockchip,pmu is absent cpuidle: change enter_s2idle() prototype cpuidle: psci: Prevent domain idlestates until consumers are ready cpuidle: psci: Convert PM domain to platform driver cpuidle: psci: Fix error path via converting to a platform driver cpuidle: psci: Fail cpuidle registration if set OSI mode failed ...
2020-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller68-380/+2352
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-08-04 The following pull-request contains BPF updates for your *net-next* tree. We've added 73 non-merge commits during the last 9 day(s) which contain a total of 135 files changed, 4603 insertions(+), 1013 deletions(-). The main changes are: 1) Implement bpf_link support for XDP. Also add LINK_DETACH operation for the BPF syscall allowing processes with BPF link FD to force-detach, from Andrii Nakryiko. 2) Add BPF iterator for map elements and to iterate all BPF programs for efficient in-kernel inspection, from Yonghong Song and Alexei Starovoitov. 3) Separate bpf_get_{stack,stackid}() helpers for perf events in BPF to avoid unwinder errors, from Song Liu. 4) Allow cgroup local storage map to be shared between programs on the same cgroup. Also extend BPF selftests with coverage, from YiFei Zhu. 5) Add BPF exception tables to ARM64 JIT in order to be able to JIT BPF_PROBE_MEM load instructions, from Jean-Philippe Brucker. 6) Follow-up fixes on BPF socket lookup in combination with reuseport group handling. Also add related BPF selftests, from Jakub Sitnicki. 7) Allow to use socket storage in BPF_PROG_TYPE_CGROUP_SOCK-typed programs for socket create/release as well as bind functions, from Stanislav Fomichev. 8) Fix an info leak in xsk_getsockopt() when retrieving XDP stats via old struct xdp_statistics, from Peilin Ye. 9) Fix PT_REGS_RC{,_CORE}() macros in libbpf for MIPS arch, from Jerry Crunchtime. 10) Extend BPF kernel test infra with skb->family and skb->{local,remote}_ip{4,6} fields and allow user space to specify skb->dev via ifindex, from Dmitry Yakunin. 11) Fix a bpftool segfault due to missing program type name and make it more robust to prevent them in future gaps, from Quentin Monnet. 12) Consolidate cgroup helper functions across selftests and fix a v6 localhost resolver issue, from John Fastabend. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-08-03selftests: mlxsw: RED: Test offload of trapping on RED qeventsPetr Machata2-6/+40
Add a selftest for RED early_drop and mark qevents when a trap action is attached at the associated block. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-08-03Merge tag 'platform-drivers-x86-v5.9-1' of ↵Linus Torvalds1-20/+61
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver updates from Andy Shevchenko: - ASUS WMI driver honors BAT1 name of the battery (quite a few new laptops are using it) - Dell WMI driver supports new key codes and backlight events - ThinkPad ACPI driver now may use standard charge threshold interface, it also has been updated to provide Laptop or Desktop mode to the user - Intel Speed Select Technology gained support on Sapphire Rapids platform - Regular update of Speed Select Technology tools - Mellanox has been updated to support complex attributes - PMC core driver has been fixed to show correct names for LPM0 register - HTTP links were replaced by HTTPS ones where it applies - Miscellaneous fixes and cleanups here and there * tag 'platform-drivers-x86-v5.9-1' of git://git.infradead.org/linux-platform-drivers-x86: (42 commits) platform/x86: asus-nb-wmi: Drop duplicate DMI quirk structures platform/x86: thinkpad_acpi: Make some symbols static platform/x86: thinkpad_acpi: add documentation for battery charge control platform/x86: thinkpad_acpi: use standard charge control attribute names platform/x86: thinkpad_acpi: remove unused defines platform/x86: ISST: drop a duplicated word in isst_if.h tools/power/x86/intel-speed-select: Update version for v5.9 tools/power/x86/intel-speed-select: Add retries for mail box commands tools/power/x86/intel-speed-select: Add option to delay mbox commands tools/power/x86/intel-speed-select: Ignore -o option processing on error tools/power/x86/intel-speed-select: Change path for caching topology info platform/x86: acerhdf: Replace HTTP links with HTTPS ones platform/x86: apple-gmux: Replace HTTP links with HTTPS ones platform/x86: pcengines-apuv2: revert wiring up simswitch GPIO as LED platform/x86: mlx-platform: Extend FAN platform data description platform_data/mlxreg: Add presence register field for FAN devices Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces platform/mellanox: mlxreg-io: Add support for complex attributes platform/x86: mlx-platform: Add more definitions for system attributes platform_data/mlxreg: Add support for complex attributes ...
2020-08-03Merge tag 'x86-fpu-2020-08-03' of ↵Linus Torvalds5-0/+119
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU selftest from Ingo Molnar: "Add the /sys/kernel/debug/selftest_helpers/test_fpu FPU self-test" * tag 'x86-fpu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/fpu: Add an FPU selftest
2020-08-03Merge tag 'objtool-core-2020-08-03' of ↵Linus Torvalds8-244/+375
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Add support for non-rela relocations, in preparation to merge 'recordmcount' functionality into objtool - Fix assumption that broke under --ffunction-sections (LTO) builds - Misc cleanups * tag 'objtool-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Add support for relocations without addends objtool: Rename rela to reloc objtool: Use sh_info to find the base for .rela sections objtool: Do not assume order of parent/child functions
2020-08-03Merge tag 'locking-core-2020-08-03' of ↵Linus Torvalds5-48/+102
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - LKMM updates: mostly documentation changes, but also some new litmus tests for atomic ops. - KCSAN updates: the most important change is that GCC 11 now has all fixes in place to support KCSAN, so GCC support can be enabled again. Also more annotations. - futex updates: minor cleanups and simplifications - seqlock updates: merge preparatory changes/cleanups for the 'associated locks' facilities. - lockdep updates: - simplify IRQ trace event handling - add various new debug checks - simplify header dependencies, split out <linux/lockdep_types.h>, decouple lockdep from other low level headers some more - fix NMI handling - misc cleanups and smaller fixes * tag 'locking-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) kcsan: Improve IRQ state trace reporting lockdep: Refactor IRQ trace events fields into struct seqlock: lockdep assert non-preemptibility on seqcount_t write lockdep: Add preemption enabled/disabled assertion APIs seqlock: Implement raw_seqcount_begin() in terms of raw_read_seqcount() seqlock: Add kernel-doc for seqcount_t and seqlock_t APIs seqlock: Reorder seqcount_t and seqlock_t API definitions seqlock: seqcount_t latch: End read sections with read_seqcount_retry() seqlock: Properly format kernel-doc code samples Documentation: locking: Describe seqlock design and usage locking/qspinlock: Do not include atomic.h from qspinlock_types.h locking/atomic: Move ATOMIC_INIT into linux/types.h lockdep: Move list.h inclusion into lockdep.h locking/lockdep: Fix TRACE_IRQFLAGS vs. NMIs futex: Remove unused or redundant includes futex: Consistently use fshared as boolean futex: Remove needless goto's futex: Remove put_futex_key() rwsem: fix commas in initialisation docs: locking: Replace HTTP links with HTTPS ones ...
2020-08-03bpf: Allow to specify ifindex for skb in bpf_prog_test_run_skbDmitry Yakunin1-0/+5
Now skb->dev is unconditionally set to the loopback device in current net namespace. But if we want to test bpf program which contains code branch based on ifindex condition (eg filters out localhost packets) it is useful to allow specifying of ifindex from userspace. This patch adds such option through ctx_in (__sk_buff) parameter. Signed-off-by: Dmitry Yakunin <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-03Merge tag 'core-rcu-2020-08-03' of ↵Linus Torvalds17-31/+403
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: - kfree_rcu updates - RCU tasks updates - Read-side scalability tests - SRCU updates - Torture-test updates - Documentation updates - Miscellaneous fixes * tag 'core-rcu-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (109 commits) torture: Remove obsolete "cd $KVM" torture: Avoid duplicate specification of qemu command torture: Dump ftrace at shutdown only if requested torture: Add kvm-tranform.sh script for qemu-cmd files torture: Add more tracing crib notes to kvm.sh torture: Improve diagnostic for KCSAN-incapable compilers torture: Correctly summarize build-only runs torture: Pass --kmake-arg to all make invocations rcutorture: Check for unwatched readers torture: Abstract out console-log error detection torture: Add a stop-run capability torture: Create qemu-cmd in --buildonly runs rcu/rcutorture: Replace 0 with false torture: Add --allcpus argument to the kvm.sh script torture: Remove whitespace from identify_qemu_vcpus output rcutorture: NULL rcu_torture_current earlier in cleanup code rcutorture: Handle non-statistic bang-string error messages torture: Set configfile variable to current scenario rcutorture: Add races with task-exit processing locktorture: Use true and false to assign to bool variables ...
2020-08-03Merge tag 'arm64-upstream' of ↵Linus Torvalds4-18/+124
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 and cross-arch updates from Catalin Marinas: "Here's a slightly wider-spread set of updates for 5.9. Going outside the usual arch/arm64/ area is the removal of read_barrier_depends() series from Will and the MSI/IOMMU ID translation series from Lorenzo. The notable arm64 updates include ARMv8.4 TLBI range operations and translation level hint, time namespace support, and perf. Summary: - Removal of the tremendously unpopular read_barrier_depends() barrier, which is a NOP on all architectures apart from Alpha, in favour of allowing architectures to override READ_ONCE() and do whatever dance they need to do to ensure address dependencies provide LOAD -> LOAD/STORE ordering. This work also offers a potential solution if compilers are shown to convert LOAD -> LOAD address dependencies into control dependencies (e.g. under LTO), as weakly ordered architectures will effectively be able to upgrade READ_ONCE() to smp_load_acquire(). The latter case is not used yet, but will be discussed further at LPC. - Make the MSI/IOMMU input/output ID translation PCI agnostic, augment the MSI/IOMMU ACPI/OF ID mapping APIs to accept an input ID bus-specific parameter and apply the resulting changes to the device ID space provided by the Freescale FSL bus. - arm64 support for TLBI range operations and translation table level hints (part of the ARMv8.4 architecture version). - Time namespace support for arm64. - Export the virtual and physical address sizes in vmcoreinfo for makedumpfile and crash utilities. - CPU feature handling cleanups and checks for programmer errors (overlapping bit-fields). - ACPI updates for arm64: disallow AML accesses to EFI code regions and kernel memory. - perf updates for arm64. - Miscellaneous fixes and cleanups, most notably PLT counting optimisation for module loading, recordmcount fix to ignore relocations other than R_AARCH64_CALL26, CMA areas reserved for gigantic pages on 16K and 64K configurations. - Trivial typos, duplicate words" Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (82 commits) arm64: use IRQ_STACK_SIZE instead of THREAD_SIZE for irq stack arm64/mm: save memory access in check_and_switch_context() fast switch path arm64: sigcontext.h: delete duplicated word arm64: ptrace.h: delete duplicated word arm64: pgtable-hwdef.h: delete duplicated words bus: fsl-mc: Add ACPI support for fsl-mc bus/fsl-mc: Refactor the MSI domain creation in the DPRC driver of/irq: Make of_msi_map_rid() PCI bus agnostic of/irq: make of_msi_map_get_device_domain() bus agnostic dt-bindings: arm: fsl: Add msi-map device-tree binding for fsl-mc bus of/device: Add input id to of_dma_configure() of/iommu: Make of_map_rid() PCI agnostic ACPI/IORT: Add an input ID to acpi_dma_configure() ACPI/IORT: Remove useless PCI bus walk ACPI/IORT: Make iort_msi_map_rid() PCI agnostic ACPI/IORT: Make iort_get_device_domain IRQ domain agnostic ACPI/IORT: Make iort_match_node_callback walk the ACPI namespace for NC arm64: enable time namespace support arm64/vdso: Restrict splitting VVAR VMA arm64/vdso: Handle faults on timens page ...
2020-08-03tools/bootconfig: Add testcases for value override operatorMasami Hiramatsu4-0/+25
Add some testcases and examples for value override operator. Link: https://lkml.kernel.org/r/159482883824.126704.2166030493721357163.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-08-03virtio: virtio_has_iommu_quirk -> virtio_has_dma_quirkMichael S. Tsirkin1-2/+2
Now that the corresponding feature bit has been renamed, rename the quirk too - it's about special ways to do DMA, not necessarily about the IOMMU. Signed-off-by: Michael S. Tsirkin <[email protected]>
2020-08-03virtio: VIRTIO_F_IOMMU_PLATFORM -> VIRTIO_F_ACCESS_PLATFORMMichael S. Tsirkin1-1/+1
Rename the bit to match latest virtio spec. Add a compat macro to avoid breaking existing userspace. Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: David Hildenbrand <[email protected]>
2020-08-03Merge tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+5
Pull io_uring updates from Jens Axboe: "Lots of cleanups in here, hardening the code and/or making it easier to read and fixing bugs, but a core feature/change too adding support for real async buffered reads. With the latter in place, we just need buffered write async support and we're done relying on kthreads for the fast path. In detail: - Cleanup how memory accounting is done on ring setup/free (Bijan) - sq array offset calculation fixup (Dmitry) - Consistently handle blocking off O_DIRECT submission path (me) - Support proper async buffered reads, instead of relying on kthread offload for that. This uses the page waitqueue to drive retries from task_work, like we handle poll based retry. (me) - IO completion optimizations (me) - Fix race with accounting and ring fd install (me) - Support EPOLLEXCLUSIVE (Jiufei) - Get rid of the io_kiocb unionizing, made possible by shrinking other bits (Pavel) - Completion side cleanups (Pavel) - Cleanup REQ_F_ flags handling, and kill off many of them (Pavel) - Request environment grabbing cleanups (Pavel) - File and socket read/write cleanups (Pavel) - Improve kiocb_set_rw_flags() (Pavel) - Tons of fixes and cleanups (Pavel) - IORING_SQ_NEED_WAKEUP clear fix (Xiaoguang)" * tag 'for-5.9/io_uring-20200802' of git://git.kernel.dk/linux-block: (127 commits) io_uring: flip if handling after io_setup_async_rw fs: optimise kiocb_set_rw_flags() io_uring: don't touch 'ctx' after installing file descriptor io_uring: get rid of atomic FAA for cq_timeouts io_uring: consolidate *_check_overflow accounting io_uring: fix stalled deferred requests io_uring: fix racy overflow count reporting io_uring: deduplicate __io_complete_rw() io_uring: de-unionise io_kiocb io-wq: update hash bits io_uring: fix missing io_queue_linked_timeout() io_uring: mark ->work uninitialised after cleanup io_uring: deduplicate io_grab_files() calls io_uring: don't do opcode prep twice io_uring: clear IORING_SQ_NEED_WAKEUP after executing task works io_uring: batch put_task_struct() tasks: add put_task_struct_many() io_uring: return locked and pinned page accounting io_uring: don't miscount pinned memory io_uring: don't open-code recv kbuf managment ...
2020-08-03Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+1
Pull core block updates from Jens Axboe: "Good amount of cleanups and tech debt removals in here, and as a result, the diffstat shows a nice net reduction in code. - Softirq completion cleanups (Christoph) - Stop using ->queuedata (Christoph) - Cleanup bd claiming (Christoph) - Use check_events, moving away from the legacy media change (Christoph) - Use inode i_blkbits consistently (Christoph) - Remove old unused writeback congestion bits (Christoph) - Cleanup/unify submission path (Christoph) - Use bio_uninit consistently, instead of bio_disassociate_blkg (Christoph) - sbitmap cleared bits handling (John) - Request merging blktrace event addition (Jan) - sysfs add/remove race fixes (Luis) - blk-mq tag fixes/optimizations (Ming) - Duplicate words in comments (Randy) - Flush deferral cleanup (Yufen) - IO context locking/retry fixes (John) - struct_size() usage (Gustavo) - blk-iocost fixes (Chengming) - blk-cgroup IO stats fixes (Boris) - Various little fixes" * tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits) block: blk-timeout: delete duplicated word block: blk-mq-sched: delete duplicated word block: blk-mq: delete duplicated word block: genhd: delete duplicated words block: elevator: delete duplicated word and fix typos block: bio: delete duplicated words block: bfq-iosched: fix duplicated word iocost_monitor: start from the oldest usage index iocost: Fix check condition of iocg abs_vdebt block: Remove callback typedefs for blk_mq_ops block: Use non _rcu version of list functions for tag_set_list blk-cgroup: show global disk stats in root cgroup io.stat blk-cgroup: make iostat functions visible to stat printing block: improve discard bio alignment in __blkdev_issue_discard() block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers block: defer flush request no matter whether we have elevator block: make blk_timeout_init() static block: remove retry loop in ioc_release_fn() block: remove unnecessary ioc nested locking block: integrate bd_start_claiming into __blkdev_get ...
2020-08-03tools/resolve_btfids: Use libbpf's btf__parse() APIAndrii Nakryiko2-57/+5
Instead of re-implementing generic BTF parsing logic, use libbpf's API. Also add .gitignore for resolve_btfids's build artifacts. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-03tools/bpftool: Use libbpf's btf__parse() API for parsing BTF from fileAndrii Nakryiko1-53/+1
Use generic libbpf API to parse BTF data from file, instead of re-implementing it in bpftool. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-03libbpf: Add btf__parse_raw() and generic btf__parse() APIsAndrii Nakryiko3-38/+83
Add public APIs to parse BTF from raw data file (e.g., /sys/kernel/btf/vmlinux), as well as generic btf__parse(), which will try to determine correct format, currently either raw or ELF. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-03tools, bpftool: Fix wrong return value in do_dump()Tianjia Zhang1-1/+1
In case of btf_id does not exist, a negative error code -ENOENT should be returned. Fixes: c93cc69004df3 ("bpftool: add ability to dump BTF types") Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Tobias Klauser <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-03tools, build: Propagate build failures from tools/build/Makefile.buildAndrii Nakryiko1-1/+2
The '&&' command seems to have a bad effect when $(cmd_$(1)) exits with non-zero effect: the command failure is masked (despite `set -e`) and all but the first command of $(dep-cmd) is executed (successfully, as they are mostly printfs), thus overall returning 0 in the end. This means in practice that despite compilation errors, tools's build Makefile will return success. We see this very reliably with libbpf's Makefile, which doesn't get compilation error propagated properly. This in turns causes issues with selftests build, as well as bpftool and other projects that rely on building libbpf. The fix is simple: don't use &&. Given `set -e`, we don't need to chain commands with &&. The shell will exit on first failure, giving desired behavior and propagating error properly. Fixes: 275e2d95591e ("tools build: Move dependency copy into function") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-03Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo43-147/+437
Minor conflict in tools/perf/arch/arm/util/auxtrace.c as one fix there was cherry-picked for the last perf/urgent pull req to Linus, so was already there. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-08-03selftests/powerpc: Skip vmx/vsx/tar/etc tests on older CPUsMichael Ellerman10-6/+34
Some of our tests use VSX or newer VMX instructions, so need to be skipped on older CPUs to avoid SIGILL'ing. Similarly TAR was added in v2.07, and the PMU event used in the stcx fail test only works on Power8 or later. Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-08-03Merge branches 'acpi-mm', 'acpi-tables', 'acpi-apei' and 'acpi-misc'Rafael J. Wysocki1-1/+1
* acpi-mm: ACPI: OSL: Clean up the removal of unused memory mappings ACPI: OSL: Use deferred unmapping in acpi_os_unmap_iomem() ACPI: OSL: Use deferred unmapping in acpi_os_unmap_generic_address() ACPICA: Preserve memory opregion mappings ACPI: OSL: Implement deferred unmapping of ACPI memory * acpi-tables: ACPI: NUMA: Remove the useless 'node >= MAX_NUMNODES' check ACPI: NUMA: Remove the useless sub table pointer check ACPI: tables: Remove the duplicated checks for acpi_parse_entries_array() ACPI: tables: avoid relocations for table signature array * acpi-apei: ACPI: APEI: remove redundant assignment to variable rc * acpi-misc: ACPI: Replace HTTP links with HTTPS ones ACPI: Use valid link to the ACPI specification ACPI: Use fallthrough pseudo-keyword
2020-08-03selftests: netfilter: add meta iif/oif match testFlorian Westphal2-1/+125
simple test case, but would have caught this: FAIL: iifgroupcount, want "packets 2", got table inet filter { counter iifgroupcount { packets 0 bytes 0 } } Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-08-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-9/+33
Pull KVM fixes from Paolo Bonzini: "Bugfixes and strengthening the validity checks on inputs from new userspace APIs. Now I know why I shouldn't prepare pull requests on the weekend, it's hard to concentrate if your son is shouting about his latest Minecraft builds in your ear. Fortunately all the patches were ready and I just had to check the test results..." * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Fix disable pause loop exit/pause filtering capability on SVM KVM: LAPIC: Prevent setting the tscdeadline timer if the lapic is hw disabled KVM: arm64: Don't inherit exec permission across page-table levels KVM: arm64: Prevent vcpu_has_ptrauth from generating OOL functions KVM: nVMX: check for invalid hdr.vmx.flags KVM: nVMX: check for required but missing VMCS12 in KVM_SET_NESTED_STATE selftests: kvm: do not set guest mode flag
2020-08-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller12-29/+129
Resolved kernel/bpf/btf.c using instructions from merge commit 69138b34a7248d2396ab85c8652e20c0c39beaba Signed-off-by: David S. Miller <[email protected]>
2020-08-01selftests/bpf: Fix spurious test failures in core_retro selftestAndrii Nakryiko2-2/+19
core_retro selftest uses BPF program that's triggered on sys_enter system-wide, but has no protection from some unrelated process doing syscall while selftest is running. This leads to occasional test failures with unexpected PIDs being returned. Fix that by filtering out all processes that are not test_progs process. Fixes: fcda189a5133 ("selftests/bpf: Add test relying only on CO-RE and no recent kernel features") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-01tools/bpftool: Add documentation and bash-completion for `link detach`Andrii Nakryiko2-2/+10
Add info on link detach sub-command to man page. Add detach to bash-completion as well. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: John Fastabend <[email protected]. Link: https://lore.kernel.org/bpf/[email protected]
2020-08-01tools/bpftool: Add `link detach` subcommandAndrii Nakryiko1-1/+36
Add ability to force-detach BPF link. Also add missing error message, if specified link ID is wrong. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-01selftests/bpf: Add link detach tests for cgroup, netns, and xdp bpf_linksAndrii Nakryiko5-29/+73
Add bpf_link__detach() testing to selftests for cgroup, netns, and xdp bpf_links. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-01libbpf: Add bpf_link detach APIsAndrii Nakryiko6-0/+25
Add low-level bpf_link_detach() API. Also add higher-level bpf_link__detach() one. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-08-01bpf, selftests: Use single cgroup helpers for both test_sockmap/progsJohn Fastabend15-132/+43
Nearly every user of cgroup helpers does the same sequence of API calls. So push these into a single helper cgroup_setup_and_join. The cases that do a bit of extra logic are test_progs which currently uses an env variable to decide if it needs to setup the cgroup environment or can use an existingi environment. And then tests that are doing cgroup tests themselves. We skip these cases for now. Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/159623335418.30208.15807461815525100199.stgit@john-XPS-13-9370
2020-08-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds8-23/+121
Pull networking fixes from David Miller: 1) Encap offset calculation is incorrect in esp6, from Sabrina Dubroca. 2) Better parameter validation in pfkey_dump(), from Mark Salyzyn. 3) Fix several clang issues on powerpc in selftests, from Tanner Love. 4) cmsghdr_from_user_compat_to_kern() uses the wrong length, from Al Viro. 5) Out of bounds access in mlx5e driver, from Raed Salem. 6) Fix transfer buffer memleak in lan78xx, from Johan Havold. 7) RCU fixups in rhashtable, from Herbert Xu. 8) Fix ipv6 nexthop refcnt leak, from Xiyu Yang. 9) vxlan FDB dump must be done under RCU, from Ido Schimmel. 10) Fix use after free in mlxsw, from Ido Schimmel. 11) Fix map leak in HASH_OF_MAPS bpf code, from Andrii Nakryiko. 12) Fix bug in mac80211 Tx ack status reporting, from Vasanthakumar Thiagarajan. 13) Fix memory leaks in IPV6_ADDRFORM code, from Cong Wang. 14) Fix bpf program reference count leaks in mlx5 during mlx5e_alloc_rq(), from Xin Xiong. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits) vxlan: fix memleak of fdb rds: Prevent kernel-infoleak in rds_notify_queue_get() net/sched: The error lable position is corrected in ct_init_module net/mlx5e: fix bpf_prog reference count leaks in mlx5e_alloc_rq net/mlx5e: E-Switch, Specify flow_source for rule with no in_port net/mlx5e: E-Switch, Add misc bit when misc fields changed for mirroring net/mlx5e: CT: Support restore ipv6 tunnel net: gemini: Fix missing clk_disable_unprepare() in error path of gemini_ethernet_port_probe() ionic: unlock queue mutex in error path atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent net: ethernet: mtk_eth_soc: fix MTU warnings net: nixge: fix potential memory leak in nixge_probe() devlink: ignore -EOPNOTSUPP errors on dumpit rxrpc: Fix race between recvmsg and sendmsg on immediate call failure MAINTAINERS: Replace Thor Thayer as Altera Triple Speed Ethernet maintainer selftests/bpf: fix netdevsim trap_flow_action_cookie read ipv6: fix memory leaks on IPV6_ADDRFORM path net/bpfilter: Initialize pos in __bpfilter_process_sockopt igb: reinit_locked() should be called with rtnl_lock e1000e: continue to init PHY even when failed to disable ULP ...
2020-08-01Merge branch 'lkmm' of ↵Ingo Molnar4-46/+100
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core Pull v5.9 LKMM changes from Paul E. McKenney. Mostly documentation changes, but also some new litmus tests for atomic ops. Signed-off-by: Ingo Molnar <[email protected]>
2020-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-14/+111
Daniel Borkmann says: ==================== pull-request: bpf 2020-07-31 The following pull-request contains BPF updates for your *net* tree. We've added 5 non-merge commits during the last 21 day(s) which contain a total of 5 files changed, 126 insertions(+), 18 deletions(-). The main changes are: 1) Fix a map element leak in HASH_OF_MAPS map type, from Andrii Nakryiko. 2) Fix a NULL pointer dereference in __btf_resolve_helper_id() when no btf_vmlinux is available, from Peilin Ye. 3) Init pos variable in __bpfilter_process_sockopt(), from Christoph Hellwig. 4) Fix a cgroup sockopt verifier test by specifying expected attach type, from Jean-Philippe Brucker. Note that when net gets merged into net-next later on, there is a small merge conflict in kernel/bpf/btf.c between commit 5b801dfb7feb ("bpf: Fix NULL pointer dereference in __btf_resolve_helper_id()") from the bpf tree and commit 138b9a0511c7 ("bpf: Remove btf_id helpers resolving") from the net-next tree. Resolve as follows: remove the old hunk with the __btf_resolve_helper_id() function. Change the btf_resolve_helper_id() so it actually tests for a NULL btf_vmlinux and bails out: int btf_resolve_helper_id(struct bpf_verifier_log *log, const struct bpf_func_proto *fn, int arg) { int id; if (fn->arg_type[arg] != ARG_PTR_TO_BTF_ID || !btf_vmlinux) return -EINVAL; id = fn->btf_id[arg]; if (!id || id > btf_vmlinux->nr_types) return -EINVAL; return id; } Let me know if you run into any others issues (CC'ing Jiri Olsa so he's in the loop with regards to merge conflict resolution). ==================== Signed-off-by: David S. Miller <[email protected]>