Age | Commit message (Collapse) | Author | Files | Lines |
|
On powerpc, the errno is not inverted, and depends on ccr.so being
set. Add this to a powerpc definition of SYSCALL_RET_SET().
Co-developed-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Link: https://lore.kernel.org/linux-kselftest/[email protected]/
Fixes: 5d83c2b37d43 ("selftests/seccomp: Add powerpc support")
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Reviewed-by: Michael Ellerman <[email protected]>
|
|
Instead of special-casing the specific case of shared registers, create
a default SYSCALL_RET_SET() macro (mirroring SYSCALL_NUM_SET()), that
writes to the SYSCALL_RET register. For architectures that can't set the
return value (for whatever reason), they can define SYSCALL_RET_SET()
without an associated SYSCALL_RET() macro. This also paves the way for
architectures that need to do special things to set the return value
(e.g. powerpc).
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
When none of the registers have changed, don't flush them back. This can
happen if the architecture uses a non-register way to change the syscall
(e.g. arm64) , and a return value hasn't been written.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
Consolidate the REGSET logic into the new ARCH_GETREG() and
ARCH_SETREG() macros, avoiding more #ifdef code in function bodies.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
Instead of special-casing the get/set-registers routines, move the
HAVE_GETREG logic into the new ARCH_GETREG() and ARCH_SETREG() macros.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
With all architectures now using the common SYSCALL_NUM_SET() macro, the
arch-specific #ifdef can be removed from change_syscall() itself.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
Instead of having the mips O32 macro special-cased, pull the logic into
the SYSCALL_NUM() macro. Additionally include the ABI headers, since
these appear to have been missing, leaving __NR_O32_Linux undefined.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
Remove the arm64 special-case in change_syscall().
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
Remove the arm special-case in change_syscall().
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
Remove the mips special-case in change_syscall().
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
In order to avoid "#ifdef"s in the main function bodies, create a new
macro, SYSCALL_NUM_SET(), where arch-specific logic can live.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
To avoid an xtensa special-case, refactor all arch register macros to
take the register variable instead of depending on the macro expanding
as a struct member name.
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
The __NR_mknod syscall doesn't exist on arm64 (only __NR_mknodat).
Switch to the modern syscall.
Fixes: ad5682184a81 ("selftests/seccomp: Check for EPOLLHUP for user_notif")
Signed-off-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Acked-by: Christian Brauner <[email protected]>
|
|
getsetsockopt() calls getsockopt() with optlen == 1, but then checks
the resulting int. It is ok on little endian, but not on big endian.
Fix by checking char instead.
Fixes: 8a027dc0d8f5 ("selftests/bpf: add sockopt test that exercises sk helpers")
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
server_map's value size is 8, but the test tries to put an int there.
This sort of works on x86 (unless followed by non-0), but hard fails on
s390.
Fix by using __s64 instead of int.
Fixes: 2d7824ffd25c ("selftests: bpf: Add test for sk_assign")
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.9:
- Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing
crashes.
- Fix a long standing bug in our DMA mask handling that was hidden
until recently, and which caused problems with some drivers.
- Fix a boot failure on systems with large amounts of RAM, and no
hugepage support and using Radix MMU, only seen in the lab.
- A few other minor fixes.
Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy,
Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav
Jain, and Vaidyanathan Srinivasan"
* tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute
cpuidle: pseries: Fix CEDE latency conversion from tb to us
powerpc/dma: Fix dma_map_ops::get_required_mask
Revert "powerpc/build: vdso linker warning for orphan sections"
powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc
selftests/powerpc: Skip PROT_SAO test in guests/LPARS
powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
|
|
Integrate the FP tests with the build system and add some documentation
for the ones run outside the kselftest infrastructure. The content in
the README was largely written by Dave Martin with edits by me.
Signed-off-by: Mark Brown <[email protected]>
Acked-by: Dave Martin <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Add wrapper scripts which invoke fpsimd-test and sve-test with several
copies per CPU such that the context switch code will be appropriately
exercised.
Signed-off-by: Mark Brown <[email protected]>
Acked-by: Dave Martin <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
vlset is a small utility for use in conjunction with tests like the sve-test
stress test which allows another executable to be invoked with a configured
SVE vector length.
Signed-off-by: Mark Brown <[email protected]>
Acked-by: Dave Martin <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Add programs sve-test and fpsimd-test which spin reading and writing to
the SVE and FPSIMD registers, verifying the operations they perform. The
intended use is to leave them running to stress the context switch code's
handling of these registers which isn't compatible with what kselftest
does so they're not integrated into the framework but there's no other
obvious testsuite where they fit so let's store them here.
These tests were written by Dave Martin and lightly adapted by me.
Signed-off-by: Mark Brown <[email protected]>
Acked-by: Dave Martin <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Add a test case that does some basic verification of the SVE ptrace
interface, forking off a child with known values in the registers and
then using ptrace to inspect and manipulate the SVE registers of the
child, including in FPSIMD mode to account for sharing between the SVE
and FPSIMD registers.
This program was written by Dave Martin and modified for kselftest by
me.
Signed-off-by: Mark Brown <[email protected]>
Acked-by: Dave Martin <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Add a test case that verifies that we can enumerate the SVE vector lengths
on systems where we detect SVE, and that those SVE vector lengths are
valid. This program was written by Dave Martin and adapted to kselftest by
me.
Signed-off-by: Mark Brown <[email protected]>
Acked-by: Dave Martin <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
differently initialized keys
PAuth adds 5 different keys that can be used to sign addresses.
Add a test that verifies that the kernel initializes them to different
values and preserves them across context switches.
Signed-off-by: Boyan Karatotev <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Amit Daniel Kachhap <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Kernel documentation states that it will change PAuth keys on exec() calls.
Verify that all keys are correctly switched to new ones.
Signed-off-by: Boyan Karatotev <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Amit Daniel Kachhap <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
PAuth adds sign/verify controls to enable and disable groups of
instructions in hardware for compatibility with libraries that do not
implement PAuth. The kernel always enables them if it detects PAuth.
Add a test that checks that each group of instructions is enabled, if the
kernel reports PAuth as detected.
Note: For groups, for the purpose of this patch, we intend instructions
that use a certain key.
Signed-off-by: Boyan Karatotev <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Amit Daniel Kachhap <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
PAuth signs and verifies return addresses on the stack. It does so by
inserting a Pointer Authentication code (PAC) into some of the unused top
bits of an address. This is achieved by adding paciasp/autiasp instructions
at the beginning and end of a function.
This feature is partially backwards compatible with earlier versions of the
ARM architecture. To coerce the compiler into emitting fully backwards
compatible code the main file is compiled to target an earlier ARM version.
This allows the tests to check for the feature and print meaningful error
messages instead of crashing.
Add a test to verify that corrupting the return address results in a
SIGSEGV on return.
Signed-off-by: Boyan Karatotev <[email protected]>
Reviewed-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Amit Daniel Kachhap <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Add four tests to tailcalls selftest explicitly named
"tailcall_bpf2bpf_X" as their purpose is to validate that combination
of tailcalls with bpf2bpf calls are working properly.
These tests also validate LD_ABS from subprograms.
Signed-off-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
LD_[ABS|IND] instructions may return from the function early. bpf_tail_call
pseudo instruction is either fallthrough or return. Allow them in the
subprograms only when subprograms are BTF annotated and have scalar return
types. Allow ld_abs and tail_call in the main program even if it calls into
subprograms. In the past that was not ok to do for ld_abs, since it was JITed
with special exit sequence. Since bpf_gen_ld_abs() was introduced the ld_abs
looks like normal exit insn from JIT point of view, so it's safe to allow them
in the main program.
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
IPPROTO_IP (0) is not valid for raw sockets. Default the protocol for
raw sockets to IPPROTO_RAW if the protocol has not been set via the -P
option.
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The test harness forks() a child to run each test. Both the parent and
the child print to stdout using libc functions. That can lead to
duplicated (or more) output if the libc buffers are not flushed before
forking.
It's generally not seen when running programs directly, because stdout
will usually be line buffered when it's pointing to a terminal.
This was noticed when running the seccomp_bpf test, eg:
$ ./seccomp_bpf | tee test.log
$ grep -c "TAP version 13" test.log
2
But we only expect the TAP header to appear once.
It can be exacerbated using stdbuf to increase the buffer size:
$ stdbuf -o 1MB ./seccomp_bpf > test.log
$ grep -c "TAP version 13" test.log
13
The fix is simple, we just flush stdout & stderr before fork. Usually
stderr is unbuffered, but that can be changed, so flush it as well
just to be safe.
Signed-off-by: Michael Ellerman <[email protected]>
Tested-by: Max Filippov <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
In case of errors, this message was printed:
(...)
# read: Resource temporarily unavailable
# client exit code 0, server 3
# \nnetns ns1-0-BJlt5D socket stat for 10003:
(...)
Obviously, the idea was to add a new line before the socket stat and not
print "\nnetns".
Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN")
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Signed-off-by: Matthieu Baerts <[email protected]>
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2020-09-15
The following pull-request contains BPF updates for your *net* tree.
We've added 12 non-merge commits during the last 19 day(s) which contain
a total of 10 files changed, 47 insertions(+), 38 deletions(-).
The main changes are:
1) docs/bpf fixes, from Andrii.
2) ld_abs fix, from Daniel.
3) socket casting helpers fix, from Martin.
4) hash iterator fixes, from Yonghong.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Merge 183 tests from test_btf into test_progs framework to be exercised
regularly. All the test_btf tests that were moved are modeled as proper
sub-tests in test_progs framework for ease of debugging and reporting.
No functional or behavioral changes were intended, I tried to preserve
original behavior as much as possible. E.g., `test_progs -v` will activate
"always_log" flag to emit BTF validation log.
The only difference is in reducing the max_entries limit for pretty-printing
tests from (128 * 1024) to just 128 to reduce tests running time without
reducing the coverage.
Example test run:
$ sudo ./test_progs -n 8
...
#8 btf:OK
Summary: 1/183 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This is a simple test to check that loading and dumping metadata
in btftool works, whether or not metadata contents are used by the
program.
A C test is also added to make sure the skeleton code can read the
metadata values.
Signed-off-by: YiFei Zhu <[email protected]>
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Cc: YiFei Zhu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Commit c7cdbe2efc40 ("vxlan: support for nexthop notifiers") registered
a listener in the VXLAN driver to the nexthop notification chain. Its
purpose is to cleanup FDB entries that use a nexthop that is being
deleted.
Test that such FDB entries are removed when the nexthop group that they
use is deleted. Test that entries are not deleted when a single nexthop
in the group is deleted.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Make sure the empty nest is reported even without stats.
Make sure reporting only selected stats works fine.
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Introduce tests to cover simple scenarios where user is watching
memory which can be accessed by kernel as well. We also support
_MODE_EXACT with _SETHWDEBUG interface. Move those testcases outside
of _BP_RANGE condition. This will help to test _MODE_EXACT scenarios
when CONFIG_HAVE_HW_BREAKPOINT is not set, eg:
$ ./ptrace-hwbreak
...
PTRACE_SET_DEBUGREG, Kernel Access Userspace, len: 8: Ok
PPC_PTRACE_SETHWDEBUG, MODE_EXACT, WO, len: 1: Ok
PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RO, len: 1: Ok
PPC_PTRACE_SETHWDEBUG, MODE_EXACT, RW, len: 1: Ok
PPC_PTRACE_SETHWDEBUG, MODE_EXACT, Kernel Access Userspace, len: 1: Ok
success: ptrace-hwbreak
Suggested-by: Pedro Miraglia Franco de Carvalho <[email protected]>
Signed-off-by: Ravi Bangoria <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add a bunch of test-cases for multiple subflow xmit:
create multiple subflows simulating different links
condition via netem and verify that the msk is able
to use completely the aggregated bandwidth.
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Bring in our fixes branch for this cycle which avoids some small
conflicts with upcoming commits.
|
|
We want the char/misc fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for Linux 5.9, take #1
- Multiple stolen time fixes, with a new capability to match x86
- Fix for hugetlbfs mappings when PUD and PMD are the same level
- Fix for hugetlbfs mappings when PTE mappings are enforced
(dirty logging, for example)
- Fix tracing output of 64bit values
|
|
When tweaking llvm optimizations, I found that selftest build failed
with the following error:
libbpf: elf: skipping unrecognized data section(6) .rodata.str1.1
libbpf: prog 'sysctl_tcp_mem': bad map relo against '.L__const.is_tcp_mem.tcp_mem_name'
in section '.rodata.str1.1'
Error: failed to open BPF object file: Relocation failed
make: *** [/work/net-next/tools/testing/selftests/bpf/test_sysctl_prog.skel.h] Error 255
make: *** Deleting file `/work/net-next/tools/testing/selftests/bpf/test_sysctl_prog.skel.h'
The local string constant "tcp_mem_name" is put into '.rodata.str1.1' section
which libbpf cannot handle. Using untweaked upstream llvm, "tcp_mem_name"
is completely inlined after loop unrolling.
Commit 7fb5eefd7639 ("selftests/bpf: Fix test_sysctl_loop{1, 2}
failure due to clang change") solved a similar problem by defining
the string const as a global. Let us do the same here
for test_sysctl_prog.c so it can weather future potential llvm changes.
Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
On non-SMP kernels __per_cpu_start is not 0, so look it up in kallsyms.
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Test that an upper device of netdevs with different parent IDs can be
enslaved to a bridge.
The test fails without previous commit.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The delay was intended to be configured to "simulate" a high(er) BDP
link. As such, it needs to be set as part of the loss-configuration and
not as part of the netem reordering configuration.
The reordering-config also requires a delay but that delay is the
reordering-extend. So, a good approach is to set the reordering-extend
as a function of the configured latency. E.g., 25% of the overall
latency.
To speed up the selftests, we limit the delay to 50ms maximum to avoid
having the selftests run for too long.
Finally, the intention of tc_reorder was that when it is unset, the test
picks a random configuration. However, currently it is always initialized
and thus the random config won't be picked up.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/6
Reported-and-reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Christoph Paasch <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a test that exercises a basic sockmap / sockhash iteration. For
now we simply count the number of elements seen. Once sockmap update
from iterators works we can extend this to perform a full copy.
Signed-off-by: Lorenz Bauer <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
eBPF selftests include a script to check that bpftool builds correctly
with different command lines. Let's add one build for bpftool's
documentation so as to detect errors or warning reported by rst2man when
compiling the man pages. Also add a build to the selftests Makefile to
make sure we build bpftool documentation along with bpftool when
building the selftests.
This also builds and checks warnings for the man page for eBPF helpers,
which is built along bpftool's documentation.
This change adds rst2man as a dependency for selftests (it comes with
Python's "docutils").
v2:
- Use "--exit-status=1" option for rst2man instead of counting lines
from stderr.
- Also build bpftool as part as the selftests build (and not only when
the tests are actually run).
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Instead of full GNU diff (which smaller boot environments may not have),
use "comm" which is more available.
Reported-by: Naresh Kamboju <[email protected]>
Cc: [email protected]
Fixes: f131d9edc29d ("selftests/lkdtm: Don't clear dmesg when running tests")
Signed-off-by: Kees Cook <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Andrii reported that with latest clang, when building selftests, we have
error likes:
error: progs/test_sysctl_loop1.c:23:16: in function sysctl_tcp_mem i32 (%struct.bpf_sysctl*):
Looks like the BPF stack limit of 512 bytes is exceeded.
Please move large on stack variables into BPF per-cpu array map.
The error is triggered by the following LLVM patch:
https://reviews.llvm.org/D87134
For example, the following code is from test_sysctl_loop1.c:
static __always_inline int is_tcp_mem(struct bpf_sysctl *ctx)
{
volatile char tcp_mem_name[] = "net/ipv4/tcp_mem/very_very_very_very_long_pointless_string";
...
}
Without the above LLVM patch, the compiler did optimization to load the string
(59 bytes long) with 7 64bit loads, 1 8bit load and 1 16bit load,
occupying 64 byte stack size.
With the above LLVM patch, the compiler only uses 8bit loads, but subregister is 32bit.
So stack requirements become 4 * 59 = 236 bytes. Together with other stuff on
the stack, total stack size exceeds 512 bytes, hence compiler complains and quits.
To fix the issue, removing "volatile" key word or changing "volatile" to
"const"/"static const" does not work, the string is put in .rodata.str1.1 section,
which libbpf did not process it and errors out with
libbpf: elf: skipping unrecognized data section(6) .rodata.str1.1
libbpf: prog 'sysctl_tcp_mem': bad map relo against '.L__const.is_tcp_mem.tcp_mem_name'
in section '.rodata.str1.1'
Defining the string const as global variable can fix the issue as it puts the string constant
in '.rodata' section which is recognized by libbpf. In the future, when libbpf can process
'.rodata.str*.*' properly, the global definition can be changed back to local definition.
Defining tcp_mem_name as a global, however, triggered a verifier failure.
./test_progs -n 7/21
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
invalid stack off=0 size=1
verification time 6975 usec
stack depth 160+64
processed 889 insns (limit 1000000) max_states_per_insn 4 total_states
14 peak_states 14 mark_read 10
libbpf: -- END LOG --
libbpf: failed to load program 'sysctl_tcp_mem'
libbpf: failed to load object 'test_sysctl_loop2.o'
test_bpf_verif_scale:FAIL:114
#7/21 test_sysctl_loop2.o:FAIL
This actually exposed a bpf program bug. In test_sysctl_loop{1,2}, we have code
like
const char tcp_mem_name[] = "<...long string...>";
...
char name[64];
...
for (i = 0; i < sizeof(tcp_mem_name); ++i)
if (name[i] != tcp_mem_name[i])
return 0;
In the above code, if sizeof(tcp_mem_name) > 64, name[i] access may be
out of bound. The sizeof(tcp_mem_name) is 59 for test_sysctl_loop1.c and
79 for test_sysctl_loop2.c.
Without promotion-to-global change, old compiler generates code where
the overflowed stack access is actually filled with valid value, so hiding
the bpf program bug. With promotion-to-global change, the code is different,
more specifically, the previous loading constants to stack is gone, and
"name" occupies stack[-64:0] and overflow access triggers a verifier error.
To fix the issue, adjust "name" buffer size properly.
Reported-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Rewrite inner header IPv6 in ICMPv6 messages in ip6t_NPT,
from Michael Zhou.
2) do_ip_vs_set_ctl() dereferences uninitialized value,
from Peilin Ye.
3) Support for userdata in tables, from Jose M. Guisado.
4) Do not increment ct error and invalid stats at the same time,
from Florian Westphal.
5) Remove ct ignore stats, also from Florian.
6) Add ct stats for clash resolution, from Florian Westphal.
7) Bump reference counter bump on ct clash resolution only,
this is safe because bucket lock is held, again from Florian.
8) Use ip_is_fragment() in xt_HMARK, from YueHaibing.
9) Add wildcard support for nft_socket, from Balazs Scheidler.
10) Remove superfluous IPVS dependency on iptables, from
Yaroslav Bolyukin.
11) Remove unused definition in ebt_stp, from Wang Hai.
12) Replace CONFIG_NFT_CHAIN_NAT_{IPV4,IPV6} by CONFIG_NFT_NAT
in selftests/net, from Fabian Frederick.
13) Add userdata support for nft_object, from Jose M. Guisado.
====================
Signed-off-by: David S. Miller <[email protected]>
|