|
bpf prog accesses stack using BPF_FP as the base address and a negative
immediate number as offset. But arm64 ldr/str instructions only support
non-negative immediate number as offset. To simplify the jited result,
commit 5b3d19b9bd40 ("bpf, arm64: Adjust the offset of str/ldr(immediate)
to positive number") introduced FPB to represent the lowest stack address
that the bpf prog being jited may access, and with this address as the
baseline, it converts BPF_FP plus negative immediate offset number to FPB
plus non-negative immediate offset.
For a given bpf prog, however, the jited stack space is fixed, with
A64_SP as the lowest address and BPF_FP as the highest address.
Thus we can get rid of FPB and convert BPF_FP plus negative immediate
offset to A64_SP plus non-negative immediate offset.
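The offset conversion described above can be sketched in plain C. This is a minimal illustration, assuming the JIT knows the program's fixed stack size and that BPF_FP equals A64_SP plus that size; the function name `fp_off_to_sp_off` and its parameters are hypothetical and are not taken from the arm64 JIT.

```c
#include <stdint.h>

/*
 * Hypothetical helper: translate a BPF_FP-relative offset (negative)
 * into an SP-relative offset (non-negative).
 *
 * Assumes SP points at the lowest address of the BPF stack and
 * BPF_FP == SP + stack_size.
 * Returns -1 if the access would fall outside the stack.
 */
int32_t fp_off_to_sp_off(int32_t stack_size, int32_t fp_off)
{
	if (fp_off >= 0 || fp_off < -stack_size)
		return -1;
	return stack_size + fp_off;
}
```

With a 64-byte stack, an access at BPF_FP - 8 becomes an access at SP + 56, which fits the non-negative immediate form without a separate base register.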
Signed-off-by: Xu Kuohai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Commit 7bd230a26648 ("mm/slab: enable slab allocation tagging for kmalloc
and friends") [1] replaced kmem_cache_alloc_node() with
kmem_cache_alloc_node_noprof(), so the tracex4 sample fails to attach:
linux/samples/bpf$ sudo ./tracex4
libbpf: prog 'bpf_prog2': failed to create kretprobe
'kmem_cache_alloc_node+0x0' perf event: No such file or directory
ERROR: bpf_program__attach failed
Signed-off-by: Rong Tao <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://github.com/torvalds/linux/commit/7bd230a26648ac68ab3731ebbc449090f0ac6a37
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This adds tests for both the happy path and
the error path.
Signed-off-by: Jordan Rome <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This adds a kfunc wrapper around strncpy_from_user,
which can be called from sleepable BPF programs.
This matches the non-sleepable 'bpf_probe_read_user_str'
helper except it includes an additional 'flags'
param, which allows consumers to clear the entire
destination buffer on success or failure.
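The copy-and-clear behaviour described above can be modeled in user space. This is only a sketch of the semantics, not the kfunc itself: the name `copy_str_model`, the flag `MODEL_F_PAD_ZEROS`, and the return convention (bytes written including the NUL) are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag name for this sketch; not taken from the commit message. */
#define MODEL_F_PAD_ZEROS 1u

/*
 * User-space model of a "copy string with optional zero padding" helper.
 * Copies at most dst_sz - 1 bytes of src, always NUL-terminates, and
 * returns the number of bytes written including the NUL.
 * With MODEL_F_PAD_ZEROS set, the unused tail of dst is cleared on
 * success, and the whole buffer is cleared when src is NULL (failure).
 */
long copy_str_model(char *dst, size_t dst_sz, const char *src, unsigned flags)
{
	size_t len;

	if (!dst || dst_sz == 0)
		return -1;
	if (!src) {
		if (flags & MODEL_F_PAD_ZEROS)
			memset(dst, 0, dst_sz);
		return -1;
	}
	for (len = 0; len < dst_sz - 1 && src[len]; len++)
		dst[len] = src[len];
	dst[len] = '\0';
	if (flags & MODEL_F_PAD_ZEROS)
		memset(dst + len + 1, 0, dst_sz - len - 1);
	return (long)(len + 1);
}
```

Clearing the tail matters when the buffer is later used as a map key or compared bytewise, since stale bytes after the NUL would otherwise make equal strings look different.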
Signed-off-by: Jordan Rome <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Save pkg-config output for libpcap as simply-expanded variables.
For an obscure reason, a 'shell' call in recursively expanded
LDLIBS/CFLAGS variables makes compilation of *.test.o files non-parallel
when make is executed with the -j option.
While at it, reuse the 'pkg-config --cflags' call to define the
-DTRAFFIC_MONITOR=1 option; its exit status is the same as for
'pkg-config --exists'.
Fixes: f52403b6bfea ("selftests/bpf: Add traffic monitor functions.")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin KaFai Lau <[email protected]>
|
|
Amery Hung says:
====================
Support bpf_kptr_xchg into local kptr
This revision adds substantial changes to patch 2 to support structures
with kptr as the only special btf type. The test is split into
local_kptr_stash and task_kfunc_success to remove dependencies on
bpf_testmod that would break veristat results.
This series allows stashing kptr into local kptr. Currently, kptrs are
only allowed to be stashed into map value with bpf_kptr_xchg(). A
motivating use case of this series is to enable adding referenced kptr to
bpf_rbtree or bpf_list by using allocated object as graph node and the
storage of referenced kptr. For example, a bpf qdisc [0] enqueuing a
referenced kptr to a struct sk_buff* to a bpf_list serving as a fifo:
    struct skb_node {
            struct sk_buff __kptr *skb;
            struct bpf_list_node node;
    };
    private(A) struct bpf_spin_lock fifo_lock;
    private(A) struct bpf_list_head fifo __contains(skb_node, node);
    /* In Qdisc_ops.enqueue */
    struct skb_node *skbn;
    skbn = bpf_obj_new(typeof(*skbn));
    if (!skbn)
            goto drop;
    /* skb is a referenced kptr to struct sk_buff acquired earlier
     * but not shown in this code snippet.
     */
    skb = bpf_kptr_xchg(&skbn->skb, skb);
    if (skb)
            /* should not happen; do something below releasing skb to
             * satisfy the verifier */
            ...
    bpf_spin_lock(&fifo_lock);
    bpf_list_push_back(&fifo, &skbn->node);
    bpf_spin_unlock(&fifo_lock);
The implementation first searches for BPF_KPTR when generating program
BTF. Then, we teach the verifier that the destination argument of
bpf_kptr_xchg() can be local kptr, and use the btf_record in program BTF
to check against the source argument.
This series is mostly developed by Dave, who kindly helped and sent me
the patchset. The selftests in bpf qdisc (WIP) relies on this series to
work.
[0] https://lore.kernel.org/netdev/[email protected]/
---
v3 -> v4
- Allow struct in prog btf w/ kptr as the only special field type
- Split tests of stashing referenced kptr and local kptr
- v3: https://lore.kernel.org/bpf/[email protected]/
v2 -> v3
- Fix prog btf memory leak
- Test stashing kptr in prog btf
- Test unstashing kptrs after stashing into local kptrs
- v2: https://lore.kernel.org/bpf/[email protected]/
v1 -> v2
- Fix the document for bpf_kptr_xchg()
- Add a comment explaining changes in the verifier
- v1: https://lore.kernel.org/bpf/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Test stashing both referenced kptr and local kptr into local kptrs. Then,
test unstashing them.
Acked-by: Martin KaFai Lau <[email protected]>
Acked-by: Hou Tao <[email protected]>
Signed-off-by: Dave Marchevsky <[email protected]>
Signed-off-by: Amery Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Currently, users can only stash kptr into map values with bpf_kptr_xchg().
This patch further supports stashing kptr into local kptr by adding local
kptr as a valid destination type.
When stashing into local kptr, btf_record in program BTF is used instead
of btf_record in map to search for the btf_field of the local kptr.
The local kptr specific checks in check_reg_type() only apply when the
source argument of bpf_kptr_xchg() is local kptr. Therefore, we make the
scope of the check explicit as the destination now can also be local kptr.
Acked-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Dave Marchevsky <[email protected]>
Signed-off-by: Amery Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
ARG_PTR_TO_KPTR is currently only used by the bpf_kptr_xchg helper.
Although it limits reg types for that helper's first arg to
PTR_TO_MAP_VALUE, any arbitrary mapval won't do: further custom
verification logic ensures that the mapval reg being xchgd-into is
pointing to a kptr field. If this is not the case, it's not safe to xchg
into that reg's pointee.
Let's rename the bpf_arg_type to more accurately describe the fairly
specific expectations that this arg type encodes.
This is a nonfunctional change.
Acked-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Dave Marchevsky <[email protected]>
Signed-off-by: Amery Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Currently btf_parse_fields is used in two places to create struct
btf_record's for structs: when looking at mapval type, and when looking
at any struct in program BTF. The former looks for kptr fields while the
latter does not. This patch modifies the btf_parse_fields call made when
looking at prog BTF struct types to search for kptrs as well.
Before this series there was no reason to search for kptrs in non-mapval
types: a referenced kptr needs some owner to guarantee resource cleanup,
and map values were the only owner that supported this. If a struct with
a kptr field were to have some non-kptr-aware owner, the kptr field
might not be properly cleaned up and result in resources leaking. Only
searching for kptr fields in mapval was a simple way to avoid this
problem.
In practice, though, searching for BPF_KPTR when populating
struct_meta_tab does not expose us to this risk, as struct_meta_tab is
only accessed through btf_find_struct_meta helper, and that helper is
only called in contexts where recognizing the kptr field is safe:
* PTR_TO_BTF_ID reg w/ MEM_ALLOC flag
* Such a reg is a local kptr and must be free'd via bpf_obj_drop,
which will correctly handle kptr field
* When handling specific kfuncs which either expect MEM_ALLOC input or
return MEM_ALLOC output (obj_{new,drop}, percpu_obj_{new,drop},
list+rbtree funcs, refcount_acquire)
* Will correctly handle kptr field for same reasons as above
* When looking at kptr pointee type
* Called by functions which implement "correct kptr resource
handling"
* In btf_check_and_fixup_fields
* Helper that ensures no ownership loops for lists and rbtrees,
doesn't care about kptr field existence
So we should be able to find BPF_KPTR fields in all prog BTF structs
without leaking resources.
Further patches in the series will build on this change to support
kptr_xchg into non-mapval local kptr. Without this change there would be
no kptr field found in such a type.
Acked-by: Martin KaFai Lau <[email protected]>
Acked-by: Hou Tao <[email protected]>
Signed-off-by: Dave Marchevsky <[email protected]>
Signed-off-by: Amery Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
btf_parse_kptr() and btf_record_free() do btf_get() and btf_put()
respectively when working on btf_record in program and map if there are
kptr fields. If the kptr is from program BTF, since both callers have
already tracked the life cycle of program BTF, it is safe to remove the
btf_get() and btf_put().
This change prevents memory leak of program BTF later when we start
searching for kptr fields when building btf_record for program. It can
happen when the btf fd is closed. The btf_put() corresponding to the
btf_get() in btf_parse_kptr() was supposed to be called by
btf_record_free() in btf_free_struct_meta_tab() in btf_free(). However,
it will never happen since the invocation of btf_free() depends on the
refcount of the btf to become 0 in the first place.
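The circular dependency described above can be illustrated with a toy refcount model. This is a simplified sketch, not kernel code: `struct toy_btf`, `toy_put`, and `toy_leaks` are hypothetical names, and the model only captures the fact that the extra reference is dropped from inside the free path.

```c
#include <stdbool.h>

/* Toy model: an object whose teardown releases a reference to itself. */
struct toy_btf {
	int refcnt;
	bool holds_self_ref;	/* models a record pinning its own BTF */
	bool freed;
};

static void toy_free(struct toy_btf *b)
{
	/* The self-reference is only dropped here, after refcnt hit 0. */
	b->holds_self_ref = false;
	b->freed = true;
}

void toy_put(struct toy_btf *b)
{
	if (--b->refcnt == 0)
		toy_free(b);
}

/* Simulate "parse kptr fields, then close the last fd". */
bool toy_leaks(bool take_self_ref)
{
	struct toy_btf b = { .refcnt = 1 };	/* reference held by the fd */

	if (take_self_ref) {
		b.refcnt++;			/* models the get in the parser */
		b.holds_self_ref = true;
	}
	toy_put(&b);				/* fd closed */
	return !b.freed;
}
```

In the model, taking the self-reference leaves the count at 1 after the fd is closed, so the free path that would drop it never runs; skipping the get avoids the cycle.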
Acked-by: Martin KaFai Lau <[email protected]>
Acked-by: Hou Tao <[email protected]>
Signed-off-by: Amery Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Add multi-uprobe and multi-uretprobe benchmarks to bench tool.
Multi- and classic uprobes/uretprobes have different low-level
triggering code paths, so it's sometimes important to be able to
benchmark both flavors of uprobes/uretprobes.
Sample examples from my dev machine below. Single-threaded performance
almost doesn't differ, but with more parallel CPUs triggering the same
uprobe/uretprobe the difference grows. This might be due to [0], but
given the code is slightly different, there could be other sources of
slowdown.
Note, all these numbers will change due to ongoing work to improve
uprobe/uretprobe scalability (e.g., [1]), but having benchmark like this
is useful for measurements and debugging nevertheless.
#!/bin/bash
set -eufo pipefail
for p in 1 8 16 32; do
    for i in uprobe-nop uretprobe-nop uprobe-multi-nop uretprobe-multi-nop; do
        summary=$(sudo ./bench -w1 -d3 -p$p -a trig-$i | tail -n1)
        total=$(echo "$summary" | cut -d'(' -f1 | cut -d' ' -f3-)
        percpu=$(echo "$summary" | cut -d'(' -f2 | cut -d')' -f1 | cut -d'/' -f1)
        printf "%-21s (%2d cpus): %s (%s/s/cpu)\n" $i $p "$total" "$percpu"
    done
    echo
done
uprobe-nop ( 1 cpus): 1.020 ± 0.005M/s ( 1.020M/s/cpu)
uretprobe-nop ( 1 cpus): 0.515 ± 0.009M/s ( 0.515M/s/cpu)
uprobe-multi-nop ( 1 cpus): 1.036 ± 0.004M/s ( 1.036M/s/cpu)
uretprobe-multi-nop ( 1 cpus): 0.512 ± 0.005M/s ( 0.512M/s/cpu)
uprobe-nop ( 8 cpus): 3.481 ± 0.030M/s ( 0.435M/s/cpu)
uretprobe-nop ( 8 cpus): 2.222 ± 0.008M/s ( 0.278M/s/cpu)
uprobe-multi-nop ( 8 cpus): 3.769 ± 0.094M/s ( 0.471M/s/cpu)
uretprobe-multi-nop ( 8 cpus): 2.482 ± 0.007M/s ( 0.310M/s/cpu)
uprobe-nop (16 cpus): 2.968 ± 0.011M/s ( 0.185M/s/cpu)
uretprobe-nop (16 cpus): 1.870 ± 0.002M/s ( 0.117M/s/cpu)
uprobe-multi-nop (16 cpus): 3.541 ± 0.037M/s ( 0.221M/s/cpu)
uretprobe-multi-nop (16 cpus): 2.123 ± 0.026M/s ( 0.133M/s/cpu)
uprobe-nop (32 cpus): 2.524 ± 0.026M/s ( 0.079M/s/cpu)
uretprobe-nop (32 cpus): 1.572 ± 0.003M/s ( 0.049M/s/cpu)
uprobe-multi-nop (32 cpus): 2.717 ± 0.003M/s ( 0.085M/s/cpu)
uretprobe-multi-nop (32 cpus): 1.687 ± 0.007M/s ( 0.053M/s/cpu)
[0] https://lore.kernel.org/linux-trace-kernel/[email protected]/
[1] https://lore.kernel.org/linux-trace-kernel/[email protected]/
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Instead of parsing text-based /proc/<pid>/maps file, try to use
PROCMAP_QUERY ioctl() to simplify and speed up data fetching.
This logic is used to do uprobe file offset calculation, so any bugs in
this logic would manifest as failing uprobe BPF selftests.
This also serves as a simple demonstration of one of the intended uses.
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Eduard Zingerman says:
====================
follow up for __jited test tag
This patch-set is a collection of follow-ups for
"__jited test tag to check disassembly after jit" series (see [1]).
First patch is most important:
as it turns out, I broke all test_loader based tests for s390 CI.
E.g. see log [2] for s390 execution of test_progs,
note all 'verifier_*' tests being skipped.
This happens because of incorrect handling of corner case when
get_current_arch() does not know which architecture to return.
Second patch makes matching of function return sequence in
verifier_tailcall_jit more flexible:
-__jited(" retq")
+__jited(" {{(retq|jmp 0x)}}")
The difference could be seen with and w/o mitigations=off boot
parameter for test VM (CI runs with mitigations=off, hence it
generates retq).
Third patch addresses Alexei's request to add #define and a comment in
jit_disasm_helpers.c.
[1] https://lore.kernel.org/bpf/[email protected]/
[2] https://github.com/kernel-patches/bpf/actions/runs/10518445973/job/29144511595
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Extract local label length as a #define directive and
elaborate why 'i % MAX_LOCAL_LABELS' expression is needed
for local labels array initialization.
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Depending on kernel parameters, x86 jit generates either retq or jump
to rethunk for 'exit' instruction. The difference could be seen when
kernel is booted with and without mitigations=off parameter.
Relax the verifier_tailcall_jit test case to match both variants.
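The relaxed pattern can be tried out with POSIX regular expressions. This is a minimal sketch that models only the part between "{{" and "}}"; the function name `matches_return_insn` is hypothetical, and the sample disassembly lines used with it are made up for illustration.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if a disassembly line matches the relaxed return pattern.
 * Only the regular-expression part between "{{" and "}}" is modeled here.
 */
bool matches_return_insn(const char *line)
{
	regex_t re;
	bool ok;

	if (regcomp(&re, "(retq|jmp 0x)", REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	ok = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}
```

Both a plain retq and a jump to an absolute address are accepted, while unrelated instructions are still rejected.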
Fixes: e5bdd6a8be78 ("selftests/bpf: validate jit behaviour for tail calls")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
At the moment, when test_loader.c:get_current_arch() can't determine
the arch, it returns 0. The arch check in run_subtest() looks as
follows:
    if ((get_current_arch() & spec->arch_mask) == 0) {
            test__skip();
            return;
    }
Which means that all test_loader based tests would be skipped if arch
could not be determined. get_current_arch() recognizes x86_64, arm64
and riscv64. Which means that CI skips test_loader tests for s390.
Fix this by making sure that get_current_arch() always returns
non-zero value. In combination with default spec->arch_mask == -1 this
should cover all possibilities.
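The effect of the fix can be sketched with a small bitmask model. The enum names and bit values below are assumptions chosen for illustration, and `should_run` is a hypothetical stand-in for the check in run_subtest().

```c
#include <stdbool.h>

/* Illustrative bit values; the names and numbers are assumptions. */
enum arch {
	ARCH_UNKNOWN = 0x1,
	ARCH_X86_64  = 0x2,
	ARCH_ARM64   = 0x4,
	ARCH_RISCV64 = 0x8,
};

/* A test runs when the current arch bit is present in its mask. */
bool should_run(int current_arch, int arch_mask)
{
	return (current_arch & arch_mask) != 0;
}
```

With a current arch of 0, even the default mask of -1 yields 0 and the test is skipped; giving the unknown arch its own non-zero bit lets the default mask match it, while tests restricted to a specific arch are still skipped there.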
Fixes: f406026fefa7 ("selftests/bpf: by default use arch mask allowing all archs")
Fixes: 7d743e4c759c ("selftests/bpf: __jited test tag to check disassembly after jit")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
prog_array map
Add a selftest to confirm that the issue, in which updating a prog_array
map with an attached freplace prog fails with -EINVAL, has been fixed.
cd tools/testing/selftests/bpf; ./test_progs -t tailcalls
328/25 tailcalls/tailcall_freplace:OK
328 tailcalls:OK
Summary: 1/25 PASSED, 0 SKIPPED, 0 FAILED
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Leon Hwang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Cross-merge bpf fixes after downstream PR including
important fixes (from bpf-next point of view):
commit 41c24102af7b ("selftests/bpf: Filter out _GNU_SOURCE when compiling test_cpp")
commit fdad456cbcca ("bpf: Fix updating attached freplace prog in prog_array map")
No conflicts.
Adjacent changes in:
include/linux/bpf_verifier.h
kernel/bpf/verifier.c
tools/testing/selftests/bpf/Makefile
Link: https://lore.kernel.org/bpf/[email protected]/
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Eduard Zingerman says:
====================
support bpf_fastcall patterns for calls to kfuncs
As an extension of [1], allow bpf_fastcall patterns for kfuncs:
- pattern rules are the same as for helpers;
- spill/fill removal is allowed only for kfuncs listed in the
is_fastcall_kfunc_call (under assumption that such kfuncs would
always be members of special_kfunc_list).
Allow bpf_fastcall rewrite for bpf_cast_to_kern_ctx() and
bpf_rdonly_cast() in order to conjure selftests for this feature.
After this patch-set verifier would rewrite the program below:
r2 = 1
*(u64 *)(r10 - 32) = r2
call %[bpf_cast_to_kern_ctx]
r2 = *(u64 *)(r10 - 32)
r0 = r2
As follows:
r2 = 1 /* spill/fill at r10[-32] is removed */
r0 = r1 /* replacement for bpf_cast_to_kern_ctx() */
r0 = r2
exit
Also, attribute used by LLVM implementation of the feature had been
changed from no_caller_saved_registers to bpf_fastcall (see [2]).
This patch-set replaces references to nocsr by references to
bpf_fastcall to keep LLVM and Kernel parts in sync.
[1] no_caller_saved_registers attribute for helper calls
https://lore.kernel.org/bpf/[email protected]/
[2] [BPF] introduce __attribute__((bpf_fastcall))
https://github.com/llvm/llvm-project/pull/105417
Changes v2->v3:
- added a patch fixing arch_mask handling in test_loader,
otherwise newly added tests for the feature were skipped
(a fix for regression introduced by a recent commit);
- fixed warning regarding unused 'params' variable;
- applied stylistical fixes suggested by Yonghong;
- added acks from Yonghong;
Changes v1->v2:
- added two patches replacing all mentions of nocsr by bpf_fastcall
(suggested by Andrii);
- removed KF_NOCSR flag (suggested by Yonghong).
v1: https://lore.kernel.org/bpf/[email protected]/
v2: https://lore.kernel.org/bpf/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Use kfunc_bpf_cast_to_kern_ctx() and kfunc_bpf_rdonly_cast() to verify
that bpf_fastcall pattern is recognized for kfunc calls.
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
If test case does not specify architecture via __arch_* macro consider
that it should be run for all architectures.
Fixes: 7d743e4c759c ("selftests/bpf: __jited test tag to check disassembly after jit")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
do_misc_fixups() replaces bpf_cast_to_kern_ctx() and bpf_rdonly_cast()
by a single instruction "r0 = r1". This follows bpf_fastcall contract.
This commit allows bpf_fastcall pattern rewrite for these two
functions in order to use them in bpf_fastcall selftests.
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Recognize bpf_fastcall patterns around kfunc calls.
For example, suppose bpf_cast_to_kern_ctx() follows bpf_fastcall
contract (which it does), in such a case allow verifier to rewrite BPF
program below:
r2 = 1;
*(u64 *)(r10 - 32) = r2;
call %[bpf_cast_to_kern_ctx];
r2 = *(u64 *)(r10 - 32);
r0 = r2;
By removing the spill/fill pair:
r2 = 1;
call %[bpf_cast_to_kern_ctx];
r0 = r2;
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Attribute used by LLVM implementation of the feature had been changed
from no_caller_saved_registers to bpf_fastcall (see [1]).
This commit replaces references to nocsr by references to bpf_fastcall
to keep LLVM and selftests parts in sync.
[1] https://github.com/llvm/llvm-project/pull/105417
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Attribute used by LLVM implementation of the feature had been changed
from no_caller_saved_registers to bpf_fastcall (see [1]).
This commit replaces references to nocsr by references to bpf_fastcall
to keep LLVM and Kernel parts in sync.
[1] https://github.com/llvm/llvm-project/pull/105417
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
In arraymap.c:
In bpf_array_map_seq_start() and bpf_array_map_seq_next()
cast return values from the __percpu address space to
the generic address space via uintptr_t [1].
Correct the declaration of pptr pointer in __bpf_array_map_seq_show()
to void __percpu * and cast the value from the generic address
space to the __percpu address space via uintptr_t [1].
In hashtab.c:
Assign the return value from bpf_mem_cache_alloc() to void pointer
and cast the value to void __percpu ** (void pointer to percpu void
pointer) before dereferencing.
In memalloc.c:
Explicitly declare __percpu variables.
Cast obj to void __percpu **.
In helpers.c:
Cast ptr in BPF_CALL_1 and BPF_CALL_2 from generic address space
to __percpu address space via const uintptr_t [1].
Found by GCC's named address space checks.
There were no changes in the resulting object files.
[1] https://sparse.docs.kernel.org/en/latest/annotations.html#address-space-name
Signed-off-by: Uros Bizjak <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Eduard Zingerman <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
'bpf-fix-null-pointer-access-for-malformed-bpf_core_type_id_local-relos'
Eduard Zingerman says:
====================
bpf: fix null pointer access for malformed BPF_CORE_TYPE_ID_LOCAL relos
Liu RuiTong reported an in-kernel null pointer dereference when
processing BPF_CORE_TYPE_ID_LOCAL relocations referencing non-existing
BTF types. Fix this by adding proper id checks.
Changes v2->v3:
- selftest update suggested by Andrii:
avoid memset(0) for log buffer and do memset(0) for bpf_attr.
Changes v1->v2:
- moved check from bpf_core_calc_relo_insn() to bpf_core_apply()
now both in kernel and in libbpf relocation type id is guaranteed
to exist when bpf_core_calc_relo_insn() is called;
- added a test case.
v1: https://lore.kernel.org/bpf/[email protected]/
v2: https://lore.kernel.org/bpf/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Check that verifier rejects BPF program containing relocation
pointing to non-existent BTF type.
To force relocation resolution on kernel side test case uses
bpf_attr->core_relos field. This field is not exposed by libbpf,
so directly do BPF system call in the test.
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
In case of a malformed relocation record of kind BPF_CORE_TYPE_ID_LOCAL
referencing a non-existing BTF type, function bpf_core_calc_relo_insn
would cause a null pointer dereference.
Fix this by adding a proper check higher up in the call stack, as
malformed relocation records could be passed from user space.
Simplest reproducer is a program:
r0 = 0
exit
With a single relocation record:
.insn_off = 0, /* patch first instruction */
.type_id = 100500, /* this type id does not exist */
.access_str_off = 6, /* offset of string "0" */
.kind = BPF_CORE_TYPE_ID_LOCAL,
See the link for original reproducer or next commit for a test case.
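The kind of check involved can be sketched as a bounds-checked type lookup. This is a toy model under stated assumptions: `struct relo_btf`, `relo_type_by_id`, and `relo_apply` are hypothetical names, and the real BTF structures are considerably richer.

```c
#include <errno.h>
#include <stddef.h>

struct relo_type { int kind; };

struct relo_btf {
	const struct relo_type *types;
	unsigned int nr_types;
};

/* Returns NULL for ids that do not exist instead of indexing blindly. */
const struct relo_type *relo_type_by_id(const struct relo_btf *btf,
					unsigned int id)
{
	if (id >= btf->nr_types)
		return NULL;
	return &btf->types[id];
}

/* Validate the relocation's type id before any code dereferences it. */
int relo_apply(const struct relo_btf *btf, unsigned int type_id)
{
	const struct relo_type *t = relo_type_by_id(btf, type_id);

	if (!t)
		return -EINVAL;
	return t->kind;	/* safe: t is known to be valid here */
}
```

Doing the check in the caller means that everything further down the call chain can rely on the type id being valid.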
Fixes: 74753e1462e7 ("libbpf: Replace btf__type_by_id() with btf_type_by_id().")
Reported-by: Liu RuiTong <[email protected]>
Closes: https://lore.kernel.org/bpf/CAK55_s6do7C+DVwbwY_7nKfUz0YLDoiA1v6X3Y9+p0sWzipFSA@mail.gmail.com/
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Let kmemdup_array() take care of the multiplication and possible
overflows.
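The benefit of an array-aware duplication helper can be sketched in user space. This is not the kernel function: `memdup_array_sketch` is a hypothetical name, its argument order is an assumption, and it relies on the GCC/Clang builtin `__builtin_mul_overflow` to detect a wrapping size computation.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * User-space sketch: duplicate an array of `count` elements of `size`
 * bytes each. Returns NULL if count * size overflows or if the
 * allocation fails. __builtin_mul_overflow is a GCC/Clang builtin.
 */
void *memdup_array_sketch(const void *src, size_t count, size_t size)
{
	size_t bytes;
	void *dst;

	if (__builtin_mul_overflow(count, size, &bytes))
		return NULL;
	dst = malloc(bytes ? bytes : 1);
	if (dst)
		memcpy(dst, src, bytes);
	return dst;
}
```

An open-coded `count * size` passed to a plain duplication helper would silently wrap for large counts and copy too little; keeping the multiplication inside the helper puts the overflow check in one place.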
Signed-off-by: Yu Jiaoliang <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
- ISST: Fix an error-handling corner case
- platform/surface: aggregator: Minor corner case fix and new HW
support
* tag 'platform-drivers-x86-v6.11-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: ISST: Fix return value on last invalid resource
platform/surface: aggregator: Fix warning when controller is destroyed in probe
platform/surface: aggregator_registry: Add support for Surface Laptop 6
platform/surface: aggregator_registry: Add fan and thermal sensor support for Surface Laptop 5
platform/surface: aggregator_registry: Add support for Surface Laptop Studio 2
platform/surface: aggregator_registry: Add support for Surface Laptop Go 3
platform/surface: aggregator_registry: Add Support for Surface Pro 10
platform/x86: asus-wmi: Add quirk for ROG Ally X
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
"As I mentioned in the merge window pull request, there is a regression
which could cause system hang due to page migration. The corresponding
fix landed upstream through MM tree last week (commit 2e6506e1c4ee:
"mm/migrate: fix deadlock in migrate_pages_batch() on large folios"),
therefore large folios can be safely allowed for compressed inodes and
stress tests have been running on my fleet for over 20 days without
any regression. Users have explicitly requested this for months, so
let's allow large folios for EROFS full cases now for wider testing.
Additionally, there is a fix which addresses invalid memory accesses
on a failure path triggered by fault injection and two minor cleanups
to simplify the codebase.
Summary:
- Allow large folios on compressed inodes
- Fix invalid memory accesses if z_erofs_gbuf_growsize() partially
fails
- Two minor cleanups"
* tag 'erofs-for-6.11-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix out-of-bound access when z_erofs_gbuf_growsize() partially fails
erofs: allow large folios for compressed files
erofs: get rid of check_layout_compatibility()
erofs: simplify readdir operation
|
|
Eduard Zingerman says:
====================
__jited test tag to check disassembly after jit
Some of the logic in the BPF jits might be non-trivial.
It might be useful to allow testing this logic by comparing
generated native code with expected code template.
This patch set adds a macro __jited() that could be used for
test_loader based tests in a following manner:
SEC("tp")
__arch_x86_64
__jited(" endbr64")
__jited(" nopl (%rax,%rax)")
__jited(" xorq %rax, %rax")
...
__naked void some_test(void) { ... }
Also add a test for jit code generated for tail calls handling to
demonstrate the feature.
The feature uses LLVM libraries to do the disassembly.
At selftests compilation time Makefile detects if these libraries are
available. When libraries are not available tests using __jited()
are skipped.
Current CI environment does not include llvm development libraries,
but changes to add these are trivial.
This was previously discussed here:
https://lore.kernel.org/bpf/[email protected]/
Patch-set includes a few auxiliary steps:
- patches #2 and #3 fix a few bugs in test_loader behaviour;
- patch #4 replaces __regex macro with ability to specify regular
expressions in __msg and __xlated using "{{" "}}" escapes;
- patch #8 updates __xlated to match disassembly lines consecutively,
same way as __jited does.
Changes v2->v3:
- changed macro name from __jit_x86 to __jited with __arch_* to
specify disassembly arch (Yonghong);
- __jited matches disassembly lines consecutively with "..."
allowing to skip some number of lines (Andrii);
- __xlated matches disassembly lines consecutively, same as __jited;
- "{{...}}" regex brackets instead of __regex macro;
- bug fixes for old commits.
Changes v1->v2:
- stylistic changes suggested by Yonghong;
- fix for -Wformat-truncation related warning when compiled with
llvm15 (Yonghong).
v1: https://lore.kernel.org/bpf/[email protected]/
v2: https://lore.kernel.org/bpf/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Both __xlated and __jited work with disassembly.
It is logical to have both work in a similar manner.
This commit updates __xlated macro handling in test_loader.c by making
it expect matches on sequential lines, same way as __jited operates.
For example:
__xlated("1: *(u64 *)(r10 -16) = r1") ;; matched on line N
__xlated("3: r0 = &(void __percpu *)(r0)") ;; matched on line N+1
Also:
__xlated("1: *(u64 *)(r10 -16) = r1") ;; matched on line N
__xlated("...") ;; not matched
__xlated("3: r0 = &(void __percpu *)(r0)") ;; matched on any
;; line >= N
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
A program calling sub-program which does a tail call.
The idea is to verify instructions generated by jit for tail calls:
- in program and sub-program prologues;
- for subprogram call instruction;
- for tail call itself.
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Allow verifying jit behaviour by writing tests as below:
SEC("tp")
__arch_x86_64
__jited(" endbr64")
__jited(" nopl (%rax,%rax)")
__jited(" xorq %rax, %rax")
...
__naked void some_test(void)
{
asm volatile (... ::: __clobber_all);
}
Allow regular expressions in patterns, same way as in __msg.
By default assume that each __jited pattern has to be matched on the
next consecutive line of the disassembly, e.g.:
__jited(" endbr64") # matched on line N
__jited(" nopl (%rax,%rax)") # matched on line N+1
If match occurs on a wrong line an error is reported.
To override this behaviour use __jited("..."), e.g.:
__jited(" endbr64") # matched on line N
__jited("...") # not matched
__jited(" nopl (%rax,%rax)") # matched on any line >= N
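The matching rule described above can be sketched in plain C. This is a minimal illustration, not the test_loader.c implementation: the helper name match_sequential is hypothetical, and it uses substring search where the real code supports regular expressions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * lines[] holds disassembly lines, pats[] holds expected patterns.
 * The literal pattern "..." lifts the "next consecutive line"
 * requirement for the pattern that follows it.
 * Returns 0 when every pattern matches, -1 otherwise.
 */
static int match_sequential(const char **lines, size_t nlines,
			    const char **pats, size_t npats)
{
	size_t cur = 0;   /* next line to examine */
	int anchored = 0; /* 1: next pattern must match lines[cur] exactly */
	size_t i;

	for (i = 0; i < npats; i++) {
		if (strcmp(pats[i], "...") == 0) {
			anchored = 0;
			continue;
		}
		if (anchored) {
			/* Must match on the very next line. */
			if (cur >= nlines || !strstr(lines[cur], pats[i]))
				return -1;
			cur++;
		} else {
			/* Free search: skip lines until one matches. */
			while (cur < nlines && !strstr(lines[cur], pats[i]))
				cur++;
			if (cur == nlines)
				return -1;
			cur++;
			anchored = 1;
		}
	}
	return 0;
}
```

With the lines " endbr64", " nopl (%rax,%rax)", " xorq %rax, %rax", the patterns {"endbr64", "xorq"} fail because "xorq" is not on the line right after "endbr64", while {"endbr64", "...", "xorq"} succeed.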
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This commit adds a utility function to get disassembled text for jited
representation of a BPF program designated by file descriptor.
Function prototype looks as follows:
int get_jited_program_text(int fd, char *text, size_t text_sz)
Where 'fd' is a file descriptor for the program, 'text' and 'text_sz'
refer to a destination buffer for disassembled text.
Output format looks as follows:
18: 77 06 ja L0
1a: 50 pushq %rax
1b: 48 89 e0 movq %rsp, %rax
1e: eb 01 jmp L1
20: 50 L0: pushq %rax
21: 50 L1: pushq %rax
^ ^^^^^^^^ ^ ^^^^^^^^^^^^^^^^^^
| binary insn | textual insn
| representation | representation
| |
instruction offset inferred local label name
The code and makefile changes are inspired by jit_disasm.c from bpftool.
Use llvm libraries to disassemble BPF program instead of libbfd to avoid
issues with disassembly output stability pointed out in [1].
Selftests makefile uses Makefile.feature to detect if LLVM libraries
are available. If that is not the case selftests build proceeds but
the function returns -EOPNOTSUPP at runtime.
[1] commit eb9d1acf634b ("bpftool: Add LLVM as default library for disassembling JIT-ed programs")
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Upcoming changes require a notation to specify regular expression
matches for regular verifier log messages, disassembly of BPF
instructions, disassembly of jited instructions.
Neither basic nor extended POSIX regular expressions w/o additional
escaping are good for this role because of wide use of special
characters in disassembly, for example:
movq -0x10(%rbp), %rax ;; () are special characters
cmpq $0x21, %rax ;; $ is a special character
*(u64 *)(r10 -16) = r1 ;; * and () are special characters
This commit borrows syntax from LLVM's FileCheck utility.
It replaces __regex macro with ability to embed regular expressions
in __msg patterns using "{{" "}}" pairs for escaping.
Syntax for __msg patterns:
pattern := (<verbatim text> | regex)*
regex := "{{" <posix extended regular expression> "}}"
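One way to implement this grammar is to translate a pattern into a single POSIX extended regular expression: verbatim text is backslash-escaped, text between "{{" and "}}" is copied as-is. The sketch below assumes that approach; the helper name pattern_matches is hypothetical and the code is not taken from test_loader.c.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * Returns 1 if 'str' contains a match for 'pattern', 0 if it does
 * not, and -1 on a malformed pattern or allocation failure.
 * Limitation: a regex that itself contains "}}" is not supported.
 */
static int pattern_matches(const char *pattern, const char *str)
{
	/* Characters that are special in POSIX ERE outside brackets. */
	static const char special[] = "^.[$()|*+?{\\";
	size_t len = strlen(pattern);
	char *ere = malloc(2 * len + 1); /* worst case: every char escaped */
	char *out = ere;
	const char *p = pattern;
	int in_regex = 0, rc;
	regex_t re;

	if (!ere)
		return -1;
	while (*p) {
		if (!in_regex && p[0] == '{' && p[1] == '{') {
			in_regex = 1;
			p += 2;
		} else if (in_regex && p[0] == '}' && p[1] == '}') {
			in_regex = 0;
			p += 2;
		} else {
			/* Escape special characters in verbatim text only. */
			if (!in_regex && strchr(special, *p))
				*out++ = '\\';
			*out++ = *p++;
		}
	}
	*out = '\0';
	if (in_regex) { /* unterminated "{{" */
		free(ere);
		return -1;
	}
	rc = regcomp(&re, ere, REG_EXTENDED | REG_NOSUB);
	free(ere);
	if (rc)
		return -1;
	rc = regexec(&re, str, 0, NULL, 0);
	regfree(&re);
	return rc == 0 ? 1 : 0;
}
```

With this translation a disassembly line such as "*(u64 *)(r10 -16) = r1" can be used as a pattern without any manual escaping.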
For example, pattern "foo{{[0-9]+}}" matches strings like
"foo0", "foo007", etc.
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
__msg, __regex and __xlated tags are based on
__attribute__((btf_decl_tag("..."))) annotations.
Clang de-duplicates such annotations, e.g. the following
two sequences of tags are identical in final BTF:
/* seq A */ /* seq B */
__tag("foo") __tag("foo")
__tag("bar") __tag("bar")
__tag("foo")
Fix this by adding a unique suffix for each tag using __COUNTER__
pre-processor macro. E.g. here is a new definition for __msg:
#define __msg(msg) \
__attribute__((btf_decl_tag("comment:test_expect_msg=" XSTR(__COUNTER__) "=" msg)))
Using this definition the "seq A" from example above is translated to
BTF as follows:
[..] DECL_TAG 'comment:test_expect_msg=0=foo' type_id=X component_idx=-1
[..] DECL_TAG 'comment:test_expect_msg=1=bar' type_id=X component_idx=-1
[..] DECL_TAG 'comment:test_expect_msg=2=foo' type_id=X component_idx=-1
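The effect of __COUNTER__ can be observed without BTF by building only the tag strings. In the sketch below, TAG_TEXT is a hypothetical stand-in for the string passed to btf_decl_tag, and the two-level STR/XSTR definition is an assumption about how XSTR is defined. __COUNTER__ itself is a GCC/Clang extension.

```c
#include <string.h>

#define STR(x) #x
#define XSTR(x) STR(x) /* expand x first, then stringify the result */

/* Hypothetical stand-in: only the tag text, the attribute is omitted. */
#define TAG_TEXT(msg) "comment:test_expect_msg=" XSTR(__COUNTER__) "=" msg

/* "seq A" from the example: foo, bar, foo. */
static const char *const seq_a[] = {
	TAG_TEXT("foo"),
	TAG_TEXT("bar"),
	TAG_TEXT("foo"),
};
```

Each expansion embeds a different counter value, so the first and third strings differ even though both end in "=foo", and de-duplication no longer collapses them.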
Surprisingly, this bug affects a single existing test:
verifier_spill_fill/old_stack_misc_vs_cur_ctx_ptr,
where a sequence of identical messages was expected in the log.
Fixes: 537c3f66eac1 ("selftests/bpf: add generic BPF program tester-loader")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Suppose log="foo bar buz" and msg->substr="bar".
In such case current match processing logic would update 'log' as
follows: log += strlen(msg->substr); -> log += 3 -> log=" bar buz".
However, the intent behind the 'log' update is to make it point after
the successful match, e.g. to make log=" buz" in the example above.
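The intended update can be sketched as follows. The helper name skip_past_match is hypothetical; the point is that the new position is computed from the start of the match, not from the old value of 'log'.

```c
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * Returns a pointer just past the first occurrence of 'substr'
 * in 'log', or NULL when there is no match.
 */
static const char *skip_past_match(const char *log, const char *substr)
{
	const char *match = strstr(log, substr);

	if (!match)
		return NULL;
	/* Advance relative to the match, not relative to 'log'. */
	return match + strlen(substr);
}
```

For log="foo bar buz" and substr="bar" this yields " buz".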
Fixes: 4ef5d6af4935 ("selftests/bpf: no need to track next_match_pos in struct test_loader")
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
When running test_loader based tests in the verbose mode each matched
message leaves a trace in the stderr, e.g.:
./test_progs -vvv -t ...
validate_msgs:PASS:expect_msg 0 nsec
validate_msgs:PASS:expect_msg 0 nsec
validate_msgs:PASS:expect_msg 0 nsec
validate_msgs:PASS:expect_msg 0 nsec
validate_msgs:PASS:expect_msg 0 nsec
This is not very helpful when debugging such tests and clutters the
log a lot.
Signed-off-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Andrii Nakryiko says:
====================
Support passing BPF iterator to kfuncs
Add support for passing BPF iterator state to any kfunc. Such kfunc has to
declare such argument with valid `struct bpf_iter_<type> *` type and should
use "__iter" suffix in argument name, following the established suffix-based
convention. We add a simple test/demo iterator getter in bpf_testmod.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Define BPF iterator "getter" kfunc, which accepts iterator pointer as
one of the arguments. Make sure that argument passed doesn't have to be
the very first argument (unlike new-next-destroy combo).
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
There are potentially useful cases where a specific iterator type might
need to be passed into some kfunc. So, in addition to existing
bpf_iter_<type>_{new,next,destroy}() kfuncs, allow passing an iterator
pointer to any kfunc.
We employ "__iter" naming suffix for arguments that are meant to accept
iterators. We also enforce that they accept PTR -> STRUCT bpf_iter_<type>
type chain and point to a valid initialized on-the-stack iterator state.
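The naming convention can be illustrated with a user-space mock. Everything in the sketch below is hypothetical: struct bpf_iter_demo and demo_iter_remaining() are not kernel APIs, and a real kfunc would additionally need the kernel's kfunc annotations and registration, which are not shown here.

```c
/* Hypothetical iterator state, named per the bpf_iter_<type> convention. */
struct bpf_iter_demo {
	long cur;
	long end;
};

/*
 * Hypothetical getter. The iterator is not the first argument, and the
 * argument name carries the "__iter" suffix that marks it as one that
 * accepts initialized iterator state.
 */
static long demo_iter_remaining(long scale, struct bpf_iter_demo *it__iter)
{
	return (it__iter->end - it__iter->cur) * scale;
}
```

The parameter type is a pointer to a struct whose name starts with bpf_iter_, which is the PTR -> STRUCT chain mentioned above.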
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Verifier enforces that all iterator structs are named `bpf_iter_<name>`
and that whenever iterator is passed to a kfunc it's passed as a valid PTR ->
STRUCT chain (with potentially const modifiers in between).
We'll need this check for upcoming changes, so instead of duplicating
the logic, extract it into a helper function.
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Pull smb server fixes from Steve French:
- important reconnect fix
- fix for memcpy issues on mount
- two minor cleanup patches
* tag '6.11-rc4-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: Replace one-element arrays with flexible-array members
ksmbd: fix spelling mistakes in documentation
ksmbd: fix race condition between destroy_previous_session() and smb2 operations()
ksmbd: Use unsafe_memcpy() for ntlm_negotiate
|
|
If z_erofs_gbuf_growsize() partially fails on a global buffer due to
memory allocation failure or fault injection (as reported by syzbot [1]),
new pages need to be freed by comparing to the existing pages to avoid
memory leaks.
However, the old gbuf->pages[] array may not be large enough, which can
lead to null-ptr-deref or out-of-bound access.
Fix this by checking against gbuf->nrpages in advance.
[1] https://lore.kernel.org/r/[email protected]
Reported-by: [email protected]
Fixes: d6db47e571dc ("erofs: do not use pagepool in z_erofs_gbuf_growsize()")
Cc: <[email protected]> # 6.10+
Reviewed-by: Chunhai Guo <[email protected]>
Reviewed-by: Sandeep Dhavale <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
- Incorrect error unwind in iommufd_device_do_replace()
- Correct a sparse warning missing static
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd/selftest: Make dirty_ops static
iommufd/device: Fix hwpt at err_unresv in iommufd_device_do_replace()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl fixes from Dave Jiang:
"Check for RCH dport before accessing pci_host_bridge and a fix to
address a KASAN warning for the cxl regression test suite cxl-test"
* tag 'cxl-fixes-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/test: Skip cxl_setup_parent_dport() for emulated dports
cxl/pci: Get AER capability address from RCRB only for RCH dport
|