aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2022-02-14swiotlb: fix info leak with DMA_FROM_DEVICEHalil Pasic1-1/+2
The problem I'm addressing was discovered by the LTP test covering cve-2018-1000204. A short description of what happens follows: 1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV and a corresponding dxferp. The peculiar thing about this is that TUR is not reading from the device. 2) In sg_start_req() the invocation of blk_rq_map_user() effectively bounces the user-space buffer. As if the device was to transfer into it. Since commit a45b599ad808 ("scsi: sg: allocate with __GFP_ZERO in sg_build_indirect()") we make sure this first bounce buffer is allocated with GFP_ZERO. 3) For the rest of the story we keep ignoring that we have a TUR, so the device won't touch the buffer we prepare as if the we had a DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device and the buffer allocated by SG is mapped by the function virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here scatter-gather and not scsi generics). This mapping involves bouncing via the swiotlb (we need swiotlb to do virtio in protected guest like s390 Secure Execution, or AMD SEV). 4) When the SCSI TUR is done, we first copy back the content of the second (that is swiotlb) bounce buffer (which most likely contains some previous IO data), to the first bounce buffer, which contains all zeros. Then we copy back the content of the first bounce buffer to the user-space buffer. 5) The test case detects that the buffer, which it zero-initialized, ain't all zeros and fails. One can argue that this is an swiotlb problem, because without swiotlb we leak all zeros, and the swiotlb should be transparent in a sense that it does not affect the outcome (if all other participants are well behaved). Copying the content of the original buffer into the swiotlb buffer is the only way I can think of to make swiotlb transparent in such scenarios. So let's do just that if in doubt, but allow the driver to tell us that the whole mapped buffer is going to be overwritten, in which case we can preserve the old behavior and avoid the performance impact of the extra bounce. Signed-off-by: Halil Pasic <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-02-13Merge tag 'sched_urgent_for_v5.17_rc4' of ↵Linus Torvalds1-5/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: "Fix a NULL-ptr dereference when recalculating a sched entity's weight" * tag 'sched_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix fault in reweight_entity
2022-02-13Merge tag 'perf_urgent_for_v5.17_rc4' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Borislav Petkov: "Prevent cgroup event list corruption when switching events" * tag 'perf_urgent_for_v5.17_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix list corruption in perf_cgroup_switch()
2022-02-12Merge tag 'seccomp-v5.17-rc4' of ↵Linus Torvalds2-2/+13
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fixes from Kees Cook: "This fixes a corner case of fatal SIGSYS being ignored since v5.15. Along with the signal fix is a change to seccomp so that seeing another syscall after a fatal filter result will cause seccomp to kill the process harder. Summary: - Force HANDLER_EXIT even for SIGNAL_UNKILLABLE - Make seccomp self-destruct after fatal filter results - Update seccomp samples for easier behavioral demonstration" * tag 'seccomp-v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: samples/seccomp: Adjust sample to also provide kill option seccomp: Invalidate seccomp mode to catch death failures signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLE
2022-02-11lockdep: Correct lock_classes index mappingCheng Jui Wang1-2/+2
A kernel exception was hit when trying to dump /proc/lockdep_chains after lockdep report "BUG: MAX_LOCKDEP_CHAIN_HLOCKS too low!": Unable to handle kernel paging request at virtual address 00054005450e05c3 ... 00054005450e05c3] address between user and kernel address ranges ... pc : [0xffffffece769b3a8] string+0x50/0x10c lr : [0xffffffece769ac88] vsnprintf+0x468/0x69c ... Call trace: string+0x50/0x10c vsnprintf+0x468/0x69c seq_printf+0x8c/0xd8 print_name+0x64/0xf4 lc_show+0xb8/0x128 seq_read_iter+0x3cc/0x5fc proc_reg_read_iter+0xdc/0x1d4 The cause of the problem is the function lock_chain_get_class() will shift lock_classes index by 1, but the index don't need to be shifted anymore since commit 01bb6f0af992 ("locking/lockdep: Change the range of class_idx in held_lock struct") already change the index to start from 0. The lock_classes[-1] located at chain_hlocks array. When printing lock_classes[-1] after the chain_hlocks entries are modified, the exception happened. The output of lockdep_chains are incorrect due to this problem too. Fixes: f611e8cf98ec ("lockdep: Take read/write status in consideration when generate chainkey") Signed-off-by: Cheng Jui Wang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-02-11bpf: Emit bpf_timer in vmlinux BTFYonghong Song1-0/+2
Currently the following code in check_and_init_map_value() *(struct bpf_timer *)(dst + map->timer_off) = (struct bpf_timer){}; can help generate bpf_timer definition in vmlinuxBTF. But the code above may not zero the whole structure due to anonymour members and that code will be replaced by memset in the subsequent patch and bpf_timer definition will disappear from vmlinuxBTF. Let us emit the type explicitly so bpf program can continue to use it from vmlinux.h. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-11Merge tag 'acpi-5.17-rc4' of ↵Linus Torvalds3-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These revert two commits that turned out to be problematic and fix two issues related to wakeup from suspend-to-idle on x86. Specifics: - Revert a recent change that attempted to avoid issues with conflicting address ranges during PCI initialization, because it turned out to introduce a regression (Hans de Goede). - Revert a change that limited EC GPE wakeups from suspend-to-idle to systems based on Intel hardware, because it turned out that systems based on hardware from other vendors depended on that functionality too (Mario Limonciello). - Fix two issues related to the handling of wakeup interrupts and wakeup events signaled through the EC GPE during suspend-to-idle on x86 (Rafael Wysocki)" * tag 'acpi-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/PCI: revert "Ignore E820 reservations for bridge windows on newer systems" PM: s2idle: ACPI: Fix wakeup interrupts handling ACPI: PM: s2idle: Cancel wakeup before dispatching EC GPE ACPI: PM: Revert "Only mark EC GPE for wakeup on Intel systems"
2022-02-11Merge tag 'trace-v5.17-rc2' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fixes to the RTLA tooling - A fix to a tp_printk overriding tp_printk_stop_on_boot on the command line * tag 'trace-v5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix tp_printk option related with tp_printk_stop_on_boot MAINTAINERS: Add RTLA entry rtla: Fix segmentation fault when failing to enable -t rtla/trace: Error message fixup rtla/utils: Fix session duration parsing rtla: Follow kernel version
2022-02-11copy_process(): Move fd_install() out of sighand->siglock critical sectionWaiman Long1-4/+3
I was made aware of the following lockdep splat: [ 2516.308763] ===================================================== [ 2516.309085] WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected [ 2516.309433] 5.14.0-51.el9.aarch64+debug #1 Not tainted [ 2516.309703] ----------------------------------------------------- [ 2516.310149] stress-ng/153663 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 2516.310512] ffff0000e422b198 (&newf->file_lock){+.+.}-{2:2}, at: fd_install+0x368/0x4f0 [ 2516.310944] and this task is already holding: [ 2516.311248] ffff0000c08140d8 (&sighand->siglock){-.-.}-{2:2}, at: copy_process+0x1e2c/0x3e80 [ 2516.311804] which would create a new lock dependency: [ 2516.312066] (&sighand->siglock){-.-.}-{2:2} -> (&newf->file_lock){+.+.}-{2:2} [ 2516.312446] but this new dependency connects a HARDIRQ-irq-safe lock: [ 2516.312983] (&sighand->siglock){-.-.}-{2:2} : [ 2516.330700] Possible interrupt unsafe locking scenario: [ 2516.331075] CPU0 CPU1 [ 2516.331328] ---- ---- [ 2516.331580] lock(&newf->file_lock); [ 2516.331790] local_irq_disable(); [ 2516.332231] lock(&sighand->siglock); [ 2516.332579] lock(&newf->file_lock); [ 2516.332922] <Interrupt> [ 2516.333069] lock(&sighand->siglock); [ 2516.333291] *** DEADLOCK *** [ 2516.389845] stack backtrace: [ 2516.390101] CPU: 3 PID: 153663 Comm: stress-ng Kdump: loaded Not tainted 5.14.0-51.el9.aarch64+debug #1 [ 2516.390756] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 [ 2516.391155] Call trace: [ 2516.391302] dump_backtrace+0x0/0x3e0 [ 2516.391518] show_stack+0x24/0x30 [ 2516.391717] dump_stack_lvl+0x9c/0xd8 [ 2516.391938] dump_stack+0x1c/0x38 [ 2516.392247] print_bad_irq_dependency+0x620/0x710 [ 2516.392525] check_irq_usage+0x4fc/0x86c [ 2516.392756] check_prev_add+0x180/0x1d90 [ 2516.392988] validate_chain+0x8e0/0xee0 [ 2516.393215] __lock_acquire+0x97c/0x1e40 [ 2516.393449] lock_acquire.part.0+0x240/0x570 [ 2516.393814] lock_acquire+0x90/0xb4 [ 2516.394021] _raw_spin_lock+0xe8/0x154 [ 2516.394244] fd_install+0x368/0x4f0 [ 2516.394451] copy_process+0x1f5c/0x3e80 [ 2516.394678] kernel_clone+0x134/0x660 [ 2516.394895] __do_sys_clone3+0x130/0x1f4 [ 2516.395128] __arm64_sys_clone3+0x5c/0x7c [ 2516.395478] invoke_syscall.constprop.0+0x78/0x1f0 [ 2516.395762] el0_svc_common.constprop.0+0x22c/0x2c4 [ 2516.396050] do_el0_svc+0xb0/0x10c [ 2516.396252] el0_svc+0x24/0x34 [ 2516.396436] el0t_64_sync_handler+0xa4/0x12c [ 2516.396688] el0t_64_sync+0x198/0x19c [ 2517.491197] NET: Registered PF_ATMPVC protocol family [ 2517.491524] NET: Registered PF_ATMSVC protocol family [ 2591.991877] sched: RT throttling activated One way to solve this problem is to move the fd_install() call out of the sighand->siglock critical section. Before commit 6fd2fe494b17 ("copy_process(): don't use ksys_close() on cleanups"), the pidfd installation was done without holding both the task_list lock and the sighand->siglock. Obviously, holding these two locks are not really needed to protect the fd_install() call. So move the fd_install() call down to after the releases of both locks. Link: https://lore.kernel.org/r/[email protected] Fixes: 6fd2fe494b17 ("copy_process(): don't use ksys_close() on cleanups") Reviewed-by: "Eric W. Biederman" <[email protected]> Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
2022-02-10seccomp: Invalidate seccomp mode to catch death failuresKees Cook1-0/+10
If seccomp tries to kill a process, it should never see that process again. To enforce this proactively, switch the mode to something impossible. If encountered: WARN, reject all syscalls, and attempt to kill the process again even harder. Cc: Andy Lutomirski <[email protected]> Cc: Will Drewry <[email protected]> Fixes: 8112c4f140fa ("seccomp: remove 2-phase API") Cc: [email protected] Signed-off-by: Kees Cook <[email protected]>
2022-02-10signal: HANDLER_EXIT should clear SIGNAL_UNKILLABLEKees Cook1-2/+3
Fatal SIGSYS signals (i.e. seccomp RET_KILL_* syscall filter actions) were not being delivered to ptraced pid namespace init processes. Make sure the SIGNAL_UNKILLABLE doesn't get set for these cases. Reported-by: Robert Święcki <[email protected]> Suggested-by: "Eric W. Biederman" <[email protected]> Fixes: 00b06da29cf9 ("signal: Add SA_IMMUTABLE to ensure forced siganls do not get changed") Cc: [email protected] Signed-off-by: Kees Cook <[email protected]> Reviewed-by: "Eric W. Biederman" <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]
2022-02-10bpf: Fix bpf_prog_pack build for ppc64_defconfigSong Liu1-2/+2
bpf_prog_pack causes build error with powerpc ppc64_defconfig: kernel/bpf/core.c:830:23: error: variably modified 'bitmap' at file scope 830 | unsigned long bitmap[BITS_TO_LONGS(BPF_PROG_CHUNK_COUNT)]; | ^~~~~~ This is because the marco expands as: unsigned long bitmap[((((((1UL) << (16 + __pte_index_size)) / (1 << 6))) \ + ((sizeof(long) * 8)) - 1) / ((sizeof(long) * 8)))]; where __pte_index_size is a global variable. Fix it by turning bitmap into a 0-length array. Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator") Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-1/+17
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-10bpf: Convert bpf_preload.ko to use light skeleton.Alexei Starovoitov9-247/+70
The main change is a move of the single line #include "iterators.lskel.h" from iterators/iterators.c to bpf_preload_kern.c. Which means that generated light skeleton can be used from user space or user mode driver like iterators.c or from the kernel module or the kernel itself. The direct use of light skeleton from the kernel module simplifies the code, since UMD is no longer necessary. The libbpf.a required user space and UMD. The CO-RE in the kernel and generated "loader bpf program" used by the light skeleton are capable to perform complex loading operations traditionally provided by libbpf. In addition UMD approach was launching UMD process every time bpffs has to be mounted. With light skeleton in the kernel the bpf_preload kernel module loads bpf iterators once and pins them multiple times into different bpffs mounts. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-10bpf: Update iterators.lskel.h.Alexei Starovoitov1-72/+69
Light skeleton and skel_internal.h have changed. Update iterators.lskel.h. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-10bpf: Extend sys_bpf commands for bpf_syscall programs.Alexei Starovoitov1-4/+34
bpf_sycall programs can be used directly by the kernel modules to load programs and create maps via kernel skeleton. . Export bpf_sys_bpf syscall wrapper to be used in kernel skeleton. . Export bpf_map_get to be used in kernel skeleton. . Allow prog_run cmd for bpf_syscall programs with recursion check. . Enable link_create and raw_tp_open cmds. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Yonghong Song <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-10Merge tag 'audit-pr-20220209' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "Another audit fix, this time a single rather small but important fix for an oops/page-fault caused by improperly accessing userspace memory" * tag 'audit-pr-20220209' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: don't deref the syscall args when checking the openat2 open_how::flags
2022-02-09Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski13-544/+958
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-02-09 We've added 126 non-merge commits during the last 16 day(s) which contain a total of 201 files changed, 4049 insertions(+), 2215 deletions(-). The main changes are: 1) Add custom BPF allocator for JITs that pack multiple programs into a huge page to reduce iTLB pressure, from Song Liu. 2) Add __user tagging support in vmlinux BTF and utilize it from BPF verifier when generating loads, from Yonghong Song. 3) Add per-socket fast path check guarding from cgroup/BPF overhead when used by only some sockets, from Pavel Begunkov. 4) Continued libbpf deprecation work of APIs/features and removal of their usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko and various others. 5) Improve BPF instruction set documentation by adding byte swap instructions and cleaning up load/store section, from Christoph Hellwig. 6) Switch BPF preload infra to light skeleton and remove libbpf dependency from it, from Alexei Starovoitov. 7) Fix architecture-agnostic macros in libbpf for accessing syscall arguments from BPF progs for non-x86 architectures, from Ilya Leoshkevich. 8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be of 16-bit field with anonymous zero padding, from Jakub Sitnicki. 9) Add new bpf_copy_from_user_task() helper to read memory from a different task than current. Add ability to create sleepable BPF iterator progs, from Kenny Yu. 10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski. 11) Generate temporary netns names for BPF selftests to avoid naming collisions, from Hangbin Liu. 12) Implement bpf_core_types_are_compat() with limited recursion for in-kernel usage, from Matteo Croce. 13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5 to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor. 14) Misc minor fixes to libbpf and selftests from various folks. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits) selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide libbpf: Fix compilation warning due to mismatched printf format selftests/bpf: Test BPF_KPROBE_SYSCALL macro libbpf: Add BPF_KPROBE_SYSCALL macro libbpf: Fix accessing the first syscall argument on s390 libbpf: Fix accessing the first syscall argument on arm64 libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390 libbpf: Fix accessing syscall arguments on riscv libbpf: Fix riscv register names libbpf: Fix accessing syscall arguments on powerpc selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro libbpf: Add PT_REGS_SYSCALL_REGS macro selftests/bpf: Fix an endianness issue in bpf_syscall_macro test bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE bpf: Fix leftover header->pages in sparc and powerpc code. libbpf: Fix signedness bug in btf_dump_array_data() selftests/bpf: Do not export subtest as standalone test bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-09audit: don't deref the syscall args when checking the openat2 open_how::flagsPaul Moore1-1/+1
As reported by Jeff, dereferencing the openat2 syscall argument in audit_match_perm() to obtain the open_how::flags can result in an oops/page-fault. This patch fixes this by using the open_how struct that we store in the audit_context with audit_openat2_how(). Independent of this patch, Richard Guy Briggs posted a similar patch to the audit mailing list roughly 40 minutes after this patch was posted. Cc: [email protected] Fixes: 1c30e3af8a79 ("audit: add support for the openat2 syscall") Reported-by: Jeff Mahoney <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2022-02-08bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZESong Liu1-1/+5
Fix build with CONFIG_TRANSPARENT_HUGEPAGE=n with BPF_PROG_PACK_SIZE as PAGE_SIZE. Fixes: 57631054fae6 ("bpf: Introduce bpf_prog_pack allocator") Reported-by: kernel test robot <[email protected]> Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-08tracing: Fix tp_printk option related with tp_printk_stop_on_bootJaeSang Yoo1-0/+4
The kernel parameter "tp_printk_stop_on_boot" starts with "tp_printk" which is the same as another kernel parameter "tp_printk". If "tp_printk" setup is called before the "tp_printk_stop_on_boot", it will override the latter and keep it from being set. This is similar to other kernel parameter issues, such as: Commit 745a600cf1a6 ("um: console: Ignore console= option") or init/do_mounts.c:45 (setup function of "ro" kernel param) Fix it by checking for a "_" right after the "tp_printk" and if that exists do not process the parameter. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: JaeSang Yoo <[email protected]> [ Fixed up change log and added space after if condition ] Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-02-07bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]Song Liu1-1/+107
This is the jit binary allocator built on top of bpf_prog_pack. bpf_prog_pack allocates RO memory, which cannot be used directly by the JIT engine. Therefore, a temporary rw buffer is allocated for the JIT engine. Once JIT is done, bpf_jit_binary_pack_finalize is used to copy the program to the RO memory. bpf_jit_binary_pack_alloc reserves 16 bytes of extra space for illegal instructions, which is small than the 128 bytes space reserved by bpf_jit_binary_alloc. This change is necessary for bpf_jit_binary_hdr to find the correct header. Also, flag use_bpf_prog_pack is added to differentiate a program allocated by bpf_jit_binary_pack_alloc. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-07bpf: Introduce bpf_prog_pack allocatorSong Liu1-0/+127
Most BPF programs are small, but they consume a page each. For systems with busy traffic and many BPF programs, this could add significant pressure to instruction TLB. High iTLB pressure usually causes slow down for the whole system, which includes visible performance degradation for production workloads. Introduce bpf_prog_pack allocator to pack multiple BPF programs in a huge page. The memory is then allocated in 64 byte chunks. Memory allocated by bpf_prog_pack allocator is RO protected after initial allocation. To write to it, the user (jit engine) need to use text poke API. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-07bpf: Introduce bpf_arch_text_copySong Liu1-0/+5
This will be used to copy JITed text to RO protected module memory. On x86, bpf_arch_text_copy is implemented with text_poke_copy. bpf_arch_text_copy returns pointer to dst on success, and ERR_PTR(errno) on errors. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-07bpf: Use prog->jited_len in bpf_prog_ksym_set_addr()Song Liu2-4/+2
Using prog->jited_len is simpler and more accurate than current estimation (header + header->size). Also, fix missing prog->jited_len with multi function program. This hasn't been a real issue before this. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-07bpf: Use size instead of pages in bpf_binary_headerSong Liu1-6/+5
This is necessary to charge sub page memory for the BPF program. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-07bpf: Use bytes instead of pages for bpf_jit_[charge|uncharge]_modmemSong Liu2-12/+11
This enables sub-page memory charge and allocation. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-07PM: s2idle: ACPI: Fix wakeup interrupts handlingRafael J. Wysocki3-4/+5
After commit e3728b50cd9b ("ACPI: PM: s2idle: Avoid possible race related to the EC GPE") wakeup interrupts occurring immediately after the one discarded by acpi_s2idle_wake() may be missed. Moreover, if the SCI triggers again immediately after the rearming in acpi_s2idle_wake(), that wakeup may be missed too. The problem is that pm_system_irq_wakeup() only calls pm_system_wakeup() when pm_wakeup_irq is 0, but that's not the case any more after the interrupt causing acpi_s2idle_wake() to run until pm_wakeup_irq is cleared by the pm_wakeup_clear() call in s2idle_loop(). However, there may be wakeup interrupts occurring in that time frame and if that happens, they will be missed. To address that issue first move the clearing of pm_wakeup_irq to the point at which it is known that the interrupt causing acpi_s2idle_wake() to tun will be discarded, before rearming the SCI for wakeup. Moreover, because that only reduces the size of the time window in which the issue may manifest itself, allow pm_system_irq_wakeup() to register two second wakeup interrupts in a row and, when discarding the first one, replace it with the second one. [Of course, this assumes that only one wakeup interrupt can be discarded in one go, but currently that is the case and I am not aware of any plans to change that.] Fixes: e3728b50cd9b ("ACPI: PM: s2idle: Avoid possible race related to the EC GPE") Cc: 5.4+ <[email protected]> # 5.4+ Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-02-06perf: Fix list corruption in perf_cgroup_switch()Song Liu1-2/+2
There's list corruption on cgrp_cpuctx_list. This happens on the following path: perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list) cpu_ctx_sched_in ctx_sched_in ctx_pinned_sched_in merge_sched_in perf_cgroup_event_disable: remove the event from the list Use list_for_each_entry_safe() to allow removing an entry during iteration. Fixes: 058fe1c0440e ("perf/core: Make cgroup switch visit only cpuctxs with cgroup events") Signed-off-by: Song Liu <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-02-06sched/fair: Fix fault in reweight_entityTadeusz Struk1-5/+6
Syzbot found a GPF in reweight_entity. This has been bisected to commit 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group") There is a race between sched_post_fork() and setpriority(PRIO_PGRP) within a thread group that causes a null-ptr-deref in reweight_entity() in CFS. The scenario is that the main process spawns number of new threads, which then call setpriority(PRIO_PGRP, 0, -20), wait, and exit. For each of the new threads the copy_process() gets invoked, which adds the new task_struct and calls sched_post_fork() for it. In the above scenario there is a possibility that setpriority(PRIO_PGRP) and set_one_prio() will be called for a thread in the group that is just being created by copy_process(), and for which the sched_post_fork() has not been executed yet. This will trigger a null pointer dereference in reweight_entity(), as it will try to access the run queue pointer, which hasn't been set. Before the mentioned change the cfs_rq pointer for the task has been set in sched_fork(), which is called much earlier in copy_process(), before the new task is added to the thread_group. Now it is done in the sched_post_fork(), which is called after that. To fix the issue the remove the update_load param from the update_load param() function and call reweight_task() only if the task flag doesn't have the TASK_NEW flag set. Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group") Reported-by: [email protected] Signed-off-by: Tadeusz Struk <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dietmar Eggemann <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2022-02-06Merge tag 'perf_urgent_for_v5.17_rc3' of ↵Linus Torvalds1-0/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Borislav Petkov: - Intel/PT: filters could crash the kernel - Intel: default disable the PMU for SMM, some new-ish EFI firmware has started using CPL3 and the PMU CPL filters don't discriminate against SMM, meaning that CPL3 (userspace only) events now also count EFI/SMM cycles. - Fixup for perf_event_attr::sig_data * tag 'perf_urgent_for_v5.17_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/pt: Fix crash with stop filters in single-range mode perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures selftests/perf_events: Test modification of perf_event_attr::sig_data perf: Copy perf_event_attr::sig_data on modification x86/perf: Default set FREEZE_ON_SMI for all
2022-02-04bpf: Implement bpf_core_types_are_compat().Matteo Croce1-1/+104
Adopt libbpf's bpf_core_types_are_compat() for kernel duty by adding explicit recursion limit of 2 which is enough to handle 2 levels of function prototypes. Signed-off-by: Matteo Croce <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski20-139/+224
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-03gcc-plugins/stackleak: Use noinstr in favor of notraceKees Cook1-3/+2
While the stackleak plugin was already using notrace, objtool is now a bit more picky. Update the notrace uses to noinstr. Silences the following objtool warnings when building with: CONFIG_DEBUG_ENTRY=y CONFIG_STACK_VALIDATION=y CONFIG_VMLINUX_VALIDATION=y CONFIG_GCC_PLUGIN_STACKLEAK=y vmlinux.o: warning: objtool: do_syscall_64()+0x9: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: do_int80_syscall_32()+0x9: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: exc_general_protection()+0x22: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: fixup_bad_iret()+0x20: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: do_machine_check()+0x27: call to stackleak_track_stack() leaves .noinstr.text section vmlinux.o: warning: objtool: .text+0x5346e: call to stackleak_erase() leaves .noinstr.text section vmlinux.o: warning: objtool: .entry.text+0x143: call to stackleak_erase() leaves .noinstr.text section vmlinux.o: warning: objtool: .entry.text+0x10eb: call to stackleak_erase() leaves .noinstr.text section vmlinux.o: warning: objtool: .entry.text+0x17f9: call to stackleak_erase() leaves .noinstr.text section Note that the plugin's addition of calls to stackleak_track_stack() from noinstr functions is expected to be safe, as it isn't runtime instrumentation and is self-contained. Cc: Alexander Popov <[email protected]> Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-02-03Merge tag 'net-5.17-rc3' of ↵Linus Torvalds3-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, netfilter, and ieee802154. Current release - regressions: - Partially revert "net/smc: Add netlink net namespace support", fix uABI breakage - netfilter: - nft_ct: fix use after free when attaching zone template - nft_byteorder: track register operations Previous releases - regressions: - ipheth: fix EOVERFLOW in ipheth_rcvbulk_callback - phy: qca8081: fix speeds lower than 2.5Gb/s - sched: fix use-after-free in tc_new_tfilter() Previous releases - always broken: - tcp: fix mem under-charging with zerocopy sendmsg() - tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() - neigh: do not trigger immediate probes on NUD_FAILED from neigh_managed_work, avoid a deadlock - bpf: use VM_MAP instead of VM_ALLOC for ringbuf, avoid KASAN false-positives - netfilter: nft_reject_bridge: fix for missing reply from prerouting - smc: forward wakeup to smc socket waitqueue after fallback - ieee802154: - return meaningful error codes from the netlink helpers - mcr20a: fix lifs/sifs periods - at86rf230, ca8210: stop leaking skbs on error paths - macsec: add missing un-offload call for NETDEV_UNREGISTER of parent - ax25: add refcount in ax25_dev to avoid UAF bugs - eth: mlx5e: - fix SFP module EEPROM query - fix broken SKB allocation in HW-GRO - IPsec offload: fix tunnel mode crypto for non-TCP/UDP flows - eth: amd-xgbe: - fix skb data length underflow - ensure reset of the tx_timer_active flag, avoid Tx timeouts - eth: stmmac: fix runtime pm use in stmmac_dvr_remove() - eth: e1000e: handshake with CSME starts from Alder Lake platforms" * tag 'net-5.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits) ax25: fix reference count leaks of ax25_dev net: stmmac: ensure PTP time register reads are consistent net: ipa: request IPA register values be retained dt-bindings: net: qcom,ipa: add optional qcom,qmp property tools/resolve_btfids: Do not print any commands when building silently bpf: Use VM_MAP instead of VM_ALLOC for ringbuf net, neigh: Do not trigger immediate probes on NUD_FAILED from neigh_managed_work tcp: add missing tcp_skb_can_collapse() test in tcp_shift_skb_data() net: sparx5: do not refer to skb after passing it on Partially revert "net/smc: Add netlink net namespace support" net/mlx5e: Avoid field-overflowing memcpy() net/mlx5e: Use struct_group() for memcpy() region net/mlx5e: Avoid implicit modify hdr for decap drop rule net/mlx5e: IPsec: Fix tunnel mode crypto offload for non TCP/UDP traffic net/mlx5e: IPsec: Fix crypto offload for non TCP/UDP encapsulated traffic net/mlx5e: Don't treat small ceil values as unlimited in HTB offload net/mlx5: E-Switch, Fix uninitialized variable modact net/mlx5e: Fix handling of wrong devices during bond netevent net/mlx5e: Fix broken SKB allocation in HW-GRO net/mlx5e: Fix wrong calculation of header index in HW_GRO ...
2022-02-03Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski3-4/+5
Daniel Borkmann says: ==================== pull-request: bpf 2022-02-03 We've added 6 non-merge commits during the last 10 day(s) which contain a total of 7 files changed, 11 insertions(+), 236 deletions(-). The main changes are: 1) Fix BPF ringbuf to allocate its area with VM_MAP instead of VM_ALLOC flag which otherwise trips over KASAN, from Hou Tao. 2) Fix unresolved symbol warning in resolve_btfids due to LSM callback rename, from Alexei Starovoitov. 3) Fix a possible race in inc_misses_counter() when IRQ would trigger during counter update, from He Fengqing. 4) Fix tooling infra for cross-building with clang upon probing whether gcc provides the standard libraries, from Jean-Philippe Brucker. 5) Fix silent mode build for resolve_btfids, from Nathan Chancellor. 6) Drop unneeded and outdated lirc.h header copy from tooling infra as BPF does not require it anymore, from Sean Young. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: tools/resolve_btfids: Do not print any commands when building silently bpf: Use VM_MAP instead of VM_ALLOC for ringbuf tools: Ignore errors from `which' when searching a GCC toolchain tools headers UAPI: remove stale lirc.h bpf: Fix possible race in inc_misses_counter bpf: Fix renaming task_getsecid_subj->current_getsecid_subj. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-03bpf: Fix a btf decl_tag bug when tagging a functionYonghong Song1-8/+21
syzbot reported a btf decl_tag bug with stack trace below: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 0 PID: 3592 Comm: syz-executor914 Not tainted 5.16.0-syzkaller-11424-gb7892f7d5cb2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:btf_type_vlen include/linux/btf.h:231 [inline] RIP: 0010:btf_decl_tag_resolve+0x83e/0xaa0 kernel/bpf/btf.c:3910 ... Call Trace: <TASK> btf_resolve+0x251/0x1020 kernel/bpf/btf.c:4198 btf_check_all_types kernel/bpf/btf.c:4239 [inline] btf_parse_type_sec kernel/bpf/btf.c:4280 [inline] btf_parse kernel/bpf/btf.c:4513 [inline] btf_new_fd+0x19fe/0x2370 kernel/bpf/btf.c:6047 bpf_btf_load kernel/bpf/syscall.c:4039 [inline] __sys_bpf+0x1cbb/0x5970 kernel/bpf/syscall.c:4679 __do_sys_bpf kernel/bpf/syscall.c:4738 [inline] __se_sys_bpf kernel/bpf/syscall.c:4736 [inline] __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:4736 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae The kasan error is triggered with an illegal BTF like below: type 0: void type 1: int type 2: decl_tag to func type 3 type 3: func to func_proto type 8 The total number of types is 4 and the type 3 is illegal since its func_proto type is out of range. Currently, the target type of decl_tag can be struct/union, var or func. Both struct/union and var implemented their own 'resolve' callback functions and hence handled properly in kernel. But func type doesn't have 'resolve' callback function. When btf_decl_tag_resolve() tries to check func type, it tries to get vlen of its func_proto type, which triggered the above kasan error. To fix the issue, btf_decl_tag_resolve() needs to do btf_func_check() before trying to accessing func_proto type. In the current implementation, func type is checked with btf_func_check() in the main checking function btf_check_all_types(). To fix the above kasan issue, let us implement 'resolve' callback func type properly. The 'resolve' callback will be also called in btf_check_all_types() for func types. Fixes: b5ea834dde6b ("bpf: Support for new btf kind BTF_KIND_TAG") Reported-by: [email protected] Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-03printk: Fix incorrect __user type in proc_dointvec_minmax_sysadmin()Mickaël Salaün1-1/+1
The move of proc_dointvec_minmax_sysadmin() from kernel/sysctl.c to kernel/printk/sysctl.c introduced an incorrect __user attribute to the buffer argument. I spotted this change in [1] as well as the kernel test robot. Revert this change to please sparse: kernel/printk/sysctl.c:20:51: warning: incorrect type in argument 3 (different address spaces) kernel/printk/sysctl.c:20:51: expected void * kernel/printk/sysctl.c:20:51: got void [noderef] __user *buffer Fixes: faaa357a55e0 ("printk: move printk sysctl to printk/sysctl.c") Link: https://lore.kernel.org/r/[email protected] [1] Reported-by: kernel test robot <[email protected]> Cc: Andrew Morton <[email protected]> Cc: John Ogness <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Xiaoming Ni <[email protected]> Signed-off-by: Mickaël Salaün <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2022-02-03Revert "module, async: async_synchronize_full() on module init iff async is ↵Igor Pylypiv2-23/+5
used" This reverts commit 774a1221e862b343388347bac9b318767336b20b. We need to finish all async code before the module init sequence is done. In the reverted commit the PF_USED_ASYNC flag was added to mark a thread that called async_schedule(). Then the PF_USED_ASYNC flag was used to determine whether or not async_synchronize_full() needs to be invoked. This works when modprobe thread is calling async_schedule(), but it does not work if module dispatches init code to a worker thread which then calls async_schedule(). For example, PCI driver probing is invoked from a worker thread based on a node where device is attached: if (cpu < nr_cpu_ids) error = work_on_cpu(cpu, local_pci_probe, &ddi); else error = local_pci_probe(&ddi); We end up in a situation where a worker thread gets the PF_USED_ASYNC flag set instead of the modprobe thread. As a result, async_synchronize_full() is not invoked and modprobe completes without waiting for the async code to finish. The issue was discovered while loading the pm80xx driver: (scsi_mod.scan=async) modprobe pm80xx worker ... do_init_module() ... pci_call_probe() work_on_cpu(local_pci_probe) local_pci_probe() pm8001_pci_probe() scsi_scan_host() async_schedule() worker->flags |= PF_USED_ASYNC; ... < return from worker > ... if (current->flags & PF_USED_ASYNC) <--- false async_synchronize_full(); Commit 21c3c5d28007 ("block: don't request module during elevator init") fixed the deadlock issue which the reverted commit 774a1221e862 ("module, async: async_synchronize_full() on module init iff async is used") tried to fix. Since commit 0fdff3ec6d87 ("async, kmod: warn on synchronous request_module() from async workers") synchronous module loading from async is not allowed. Given that the original deadlock issue is fixed and it is no longer allowed to call synchronous request_module() from async we can remove PF_USED_ASYNC flag to make module init consistently invoke async_synchronize_full() unless async module probe is requested. Signed-off-by: Igor Pylypiv <[email protected]> Reviewed-by: Changyuan Lyu <[email protected]> Reviewed-by: Luis Chamberlain <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-02-03Merge branch 'for-5.17-fixes' of ↵Linus Torvalds2-14/+65
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Eric's fix for a long standing cgroup1 permission issue where it only checks for uid 0 instead of CAP which inadvertently allows unprivileged userns roots to modify release_agent userhelper - Fixes for the fallout from Waiman's recent cpuset work * 'for-5.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: Fix "suspicious RCU usage" lockdep warning cgroup-v1: Require capabilities to set release_agent cpuset: Fix the bug that subpart_cpus updated wrongly in update_cpumask() cgroup/cpuset: Make child cpusets restrict parents on v1 hierarchy
2022-02-03cgroup/cpuset: Fix "suspicious RCU usage" lockdep warningWaiman Long1-0/+10
It was found that a "suspicious RCU usage" lockdep warning was issued with the rcu_read_lock() call in update_sibling_cpumasks(). It is because the update_cpumasks_hier() function may sleep. So we have to release the RCU lock, call update_cpumasks_hier() and reacquire it afterward. Also add a percpu_rwsem_assert_held() in update_sibling_cpumasks() instead of stating that in the comment. Fixes: 4716909cc5c5 ("cpuset: Track cpusets that use parent's effective_cpus") Signed-off-by: Waiman Long <[email protected]> Tested-by: Phil Auld <[email protected]> Reviewed-by: Phil Auld <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2022-02-02bpf: Use VM_MAP instead of VM_ALLOC for ringbufHou Tao1-1/+1
After commit 2fd3fb0be1d1 ("kasan, vmalloc: unpoison VM_ALLOC pages after mapping"), non-VM_ALLOC mappings will be marked as accessible in __get_vm_area_node() when KASAN is enabled. But now the flag for ringbuf area is VM_ALLOC, so KASAN will complain out-of-bound access after vmap() returns. Because the ringbuf area is created by mapping allocated pages, so use VM_MAP instead. After the change, info in /proc/vmallocinfo also changes from [start]-[end] 24576 ringbuf_map_alloc+0x171/0x290 vmalloc user to [start]-[end] 24576 ringbuf_map_alloc+0x171/0x290 vmap user Fixes: 457f44363a88 ("bpf: Implement BPF ring buffer and verifier support for it") Reported-by: [email protected] Signed-off-by: Hou Tao <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-02perf: Copy perf_event_attr::sig_data on modificationMarco Elver1-0/+16
The intent has always been that perf_event_attr::sig_data should also be modifiable along with PERF_EVENT_IOC_MODIFY_ATTRIBUTES, because it is observable by user space if SIGTRAP on events is requested. Currently only PERF_TYPE_BREAKPOINT is modifiable, and explicitly copies relevant breakpoint-related attributes in hw_breakpoint_copy_attr(). This misses copying perf_event_attr::sig_data. Since sig_data is not specific to PERF_TYPE_BREAKPOINT, introduce a helper to copy generic event-type-independent attributes on modification. Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events") Reported-by: Dmitry Vyukov <[email protected]> Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-02-01bpf: Drop libbpf, libelf, libz dependency from bpf preload.Alexei Starovoitov1-26/+2
Drop libbpf, libelf, libz dependency from bpf preload. This reduces bpf_preload_umd binary size from 1.7M to 30k unstripped with debug info and from 300k to 19k stripped. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-01bpf: Open code obj_get_info_by_fd in bpf preload.Alexei Starovoitov1-1/+17
Open code obj_get_info_by_fd in bpf preload. It's the last part of libbpf that preload/iterators were using. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-01bpf: Convert bpf preload to light skeleton.Alexei Starovoitov4-420/+436
Convert bpffs preload iterators to light skeleton. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-01bpf: Remove unnecessary setrlimit from bpf preload.Alexei Starovoitov1-2/+0
BPF programs and maps are memcg accounted. setrlimit is obsolete. Remove its use from bpf preload. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-02-01Merge tag 'audit-pr-20220131' of ↵Linus Torvalds1-19/+43
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "A single audit patch to fix problems relating to audit queuing and system responsiveness when "audit=1" is specified on the kernel command line and the audit daemon is SIGSTOP'd for an extended period of time" * tag 'audit-pr-20220131' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: improve audit queue handling when "audit=1" on cmdline
2022-02-01cgroup-v1: Require capabilities to set release_agentEric W. Biederman1-0/+14
The cgroup release_agent is called with call_usermodehelper. The function call_usermodehelper starts the release_agent with a full set fo capabilities. Therefore require capabilities when setting the release_agaent. Reported-by: Tabitha Sable <[email protected]> Tested-by: Tabitha Sable <[email protected]> Fixes: 81a6a5cdd2c5 ("Task Control Groups: automatic userspace notification of idle cgroups") Cc: [email protected] # v2.6.24+ Signed-off-by: "Eric W. Biederman" <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2022-01-31bpf: make bpf_copy_from_user_task() gpl onlyKenta Tada1-1/+1
access_process_vm() is exported by EXPORT_SYMBOL_GPL(). Signed-off-by: Kenta Tada <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>