aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-05-13landlock: Add IOCTL access right for character and block devicesGünther Noack6-16/+258
Introduces the LANDLOCK_ACCESS_FS_IOCTL_DEV right and increments the Landlock ABI version to 5. This access right applies to device-custom IOCTL commands when they are invoked on block or character device files. Like the truncate right, this right is associated with a file descriptor at the time of open(2), and gets respected even when the file descriptor is used outside of the thread which it was originally opened in. Therefore, a newly enabled Landlock policy does not apply to file descriptors which are already open. If the LANDLOCK_ACCESS_FS_IOCTL_DEV right is handled, only a small number of safe IOCTL commands will be permitted on newly opened device files. These include FIOCLEX, FIONCLEX, FIONBIO and FIOASYNC, as well as other IOCTL commands for regular files which are implemented in fs/ioctl.c. Noteworthy scenarios which require special attention: TTY devices are often passed into a process from the parent process, and so a newly enabled Landlock policy does not retroactively apply to them automatically. In the past, TTY devices have often supported IOCTL commands like TIOCSTI and some TIOCLINUX subcommands, which were letting callers control the TTY input buffer (and simulate keypresses). This should be restricted to CAP_SYS_ADMIN programs on modern kernels though. Known limitations: The LANDLOCK_ACCESS_FS_IOCTL_DEV access right is a coarse-grained control over IOCTL commands. Landlock users may use path-based restrictions in combination with their knowledge about the file system layout to control what IOCTLs can be done. Cc: Paul Moore <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Günther Noack <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-13samples/landlock: Fix incorrect free in populate_ruleset_netIvanov Mikhail1-2/+3
Pointer env_port_name changes after strsep(). Memory allocated via strdup() will not be freed if landlock_add_rule() returns non-zero value. Fixes: 5e990dcef12e ("samples/landlock: Support TCP restrictions") Signed-off-by: Ivanov Mikhail <[email protected]> Reviewed-by: Konstantin Meskhidze <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mickaël Salaün <[email protected]>
2024-05-12bpf: make list_for_each_entry portableJose E. Marchesi4-10/+38
[Changes from V1: - The __compat_break has been abandoned in favor of a more readable can_loop macro that can be used anywhere, including loop conditions.] The macro list_for_each_entry is defined in bpf_arena_list.h as follows: #define list_for_each_entry(pos, head, member) \ for (void * ___tmp = (pos = list_entry_safe((head)->first, \ typeof(*(pos)), member), \ (void *)0); \ pos && ({ ___tmp = (void *)pos->member.next; 1; }); \ cond_break, \ pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member)) The macro cond_break, in turn, expands to a statement expression that contains a `break' statement. Compound statement expressions, and the subsequent ability of placing statements in the header of a `for' loop, are GNU extensions. Unfortunately, clang implements this GNU extension differently than GCC: - In GCC the `break' statement is bound to the containing "breakable" context in which the defining `for' appears. If there is no such context, GCC emits a warning: break statement without enclosing `for' o `switch' statement. - In clang the `break' statement is bound to the defining `for'. If the defining `for' is itself inside some breakable construct, then clang emits a -Wgcc-compat warning. This patch adds a new macro can_loop to bpf_experimental, that implements the same logic than cond_break but evaluates to a boolean expression. The patch also changes all the current instances of usage of cond_break withing the header of loop accordingly. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12bpf: ignore expected GCC warning in test_global_func10.cJose E. Marchesi1-0/+4
The BPF selftest global_func10 in progs/test_global_func10.c contains: struct Small { long x; }; struct Big { long x; long y; }; [...] __noinline int foo(const struct Big *big) { if (!big) return 0; return bpf_get_prandom_u32() < big->y; } [...] SEC("cgroup_skb/ingress") __failure __msg("invalid indirect access to stack") int global_func10(struct __sk_buff *skb) { const struct Small small = {.x = skb->len }; return foo((struct Big *)&small) ? 1 : 0; } GCC emits a "maybe uninitialized" warning for the code above, because it knows `foo' accesses `big->y'. Since the purpose of this selftest is to check that the verifier will fail on this sort of invalid memory access, this patch just silences the compiler warning. Tested in bpf-next master. No regressions. Signed-off-by: Jose E. Marchesi <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Yonghong Song <[email protected]> Cc: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12bpf: disable strict aliasing in test_global_func9.cJose E. Marchesi1-0/+1
The BPF selftest test_global_func9.c performs type punning and breaks srict-aliasing rules. In particular, given: int global_func9(struct __sk_buff *skb) { int result = 0; [...] { const struct C c = {.x = skb->len, .y = skb->family }; result |= foo((const struct S *)&c); } } When building with strict-aliasing enabled (the default) the initialization of `c' gets optimized away in its entirely: [... no initialization of `c' ...] r1 = r10 r1 += -40 call foo w0 |= w6 Since GCC knows that `foo' accesses s->x, we get a "maybe uninitialized" warning. On the other hand, when strict-aliasing is disabled GCC only optimizes away the store to `.y': r1 = *(u32 *) (r6+0) *(u32 *) (r10+-40) = r1 ; This is .x = skb->len in `c' r1 = r10 r1 += -40 call foo w0 |= w6 In this case the warning is not emitted, because s-> is initialized. This patch disables strict aliasing in this test when building with GCC. clang seems to not optimize this particular code even when strict aliasing is enabled. Tested in bpf-next master. Signed-off-by: Jose E. Marchesi <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Yonghong Song <[email protected]> Cc: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Free strdup memory in xdp_hw_metadataGeliang Tang1-0/+2
The strdup() function returns a pointer to a new string which is a duplicate of the string "ifname". Memory for the new string is obtained with malloc(), and need to be freed with free(). This patch adds this missing "free(saved_hwtstamp_ifname)" in cleanup() to avoid a potential memory leak in xdp_hw_metadata.c. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/af9bcccb96655e82de5ce2b4510b88c9c8ed5ed0.1715417367.git.tanggeliang@kylinos.cn Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Fix a few tests for GCC related warnings.Cupertino Miranda4-29/+37
This patch corrects a few warnings to allow selftests to compile for GCC. -- progs/cpumask_failure.c -- progs/bpf_misc.h:136:22: error: ‘cpumask’ is used uninitialized [-Werror=uninitialized] 136 | #define __sink(expr) asm volatile("" : "+g"(expr)) | ^~~ progs/cpumask_failure.c:68:9: note: in expansion of macro ‘__sink’ 68 | __sink(cpumask); The macro __sink(cpumask) with the '+' contraint modifier forces the the compiler to expect a read and write from cpumask. GCC detects that cpumask is never initialized and reports an error. This patch removes the spurious non required definitions of cpumask. -- progs/dynptr_fail.c -- progs/dynptr_fail.c:1444:9: error: ‘ptr1’ may be used uninitialized [-Werror=maybe-uninitialized] 1444 | bpf_dynptr_clone(&ptr1, &ptr2); Many of the tests in the file are related to the detection of uninitialized pointers by the verifier. GCC is able to detect possible uninitialized values, and reports this as an error. The patch initializes all of the previous uninitialized structs. -- progs/test_tunnel_kern.c -- progs/test_tunnel_kern.c:590:9: error: array subscript 1 is outside array bounds of ‘struct geneve_opt[1]’ [-Werror=array-bounds=] 590 | *(int *) &gopt.opt_data = bpf_htonl(0xdeadbeef); | ^~~~~~~~~~~~~~~~~~~~~~~ progs/test_tunnel_kern.c:575:27: note: at offset 4 into object ‘gopt’ of size 4 575 | struct geneve_opt gopt; This tests accesses beyond the defined data for the struct geneve_opt which contains as last field "u8 opt_data[0]" which clearly does not get reserved space (in stack) in the function header. This pattern is repeated in ip6geneve_set_tunnel and geneve_set_tunnel functions. GCC is able to see this and emits a warning. The patch introduces a local struct that allocates enough space to safely allow the write to opt_data field. -- progs/jeq_infer_not_null_fail.c -- progs/jeq_infer_not_null_fail.c:21:40: error: array subscript ‘struct bpf_map[0]’ is partly outside array bounds of ‘struct <anonymous>[1]’ [-Werror=array-bounds=] 21 | struct bpf_map *inner_map = map->inner_map_meta; | ^~ progs/jeq_infer_not_null_fail.c:14:3: note: object ‘m_hash’ of size 32 14 | } m_hash SEC(".maps"); This example defines m_hash in the context of the compilation unit and casts it to struct bpf_map which is much smaller than the size of struct bpf_map. It errors out in GCC when it attempts to access an element that would be defined in struct bpf_map outsize of the defined limits for m_hash. This patch disables the warning through a GCC pragma. This changes were tested in bpf-next master selftests without any regressions. Signed-off-by: Cupertino Miranda <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Yonghong Song <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12bpf: avoid gcc overflow warning in test_xdp_vlan.cDavid Faust1-1/+1
This patch fixes an integer overflow warning raised by GCC in xdp_prognum1 of progs/test_xdp_vlan.c: GCC-BPF [test_maps] test_xdp_vlan.bpf.o progs/test_xdp_vlan.c: In function 'xdp_prognum1': progs/test_xdp_vlan.c:163:25: error: integer overflow in expression '(short int)(((__builtin_constant_p((int)vlan_hdr->h_vlan_TCI)) != 0 ? (int)(short unsigned int)((short int)((int)vlan_hdr->h_vlan_TCI << 8 >> 8) << 8 | (short int)((int)vlan_hdr->h_vlan_TCI << 0 >> 8 << 0)) & 61440 : (int)__builtin_bswap16(vlan_hdr->h_vlan_TCI) & 61440) << 8 >> 8) << 8' of type 'short int' results in '0' [-Werror=overflow] 163 | bpf_htons((bpf_ntohs(vlan_hdr->h_vlan_TCI) & 0xf000) | ^~~~~~~~~ The problem lies with the expansion of the bpf_htons macro and the expression passed into it. The bpf_htons macro (and similarly the bpf_ntohs macro) expand to a ternary operation using either __builtin_bswap16 or ___bpf_swab16 to swap the bytes, depending on whether the expression is constant. For an expression, with 'value' as a u16, like: bpf_htons (value & 0xf000) The entire (value & 0xf000) is 'x' in the expansion of ___bpf_swab16 and we get as one part of the expanded swab16: ((__u16)(value & 0xf000) << 8 >> 8 << 8 This will always evaluate to 0, which is intentional since this subexpression deals with the byte guaranteed to be 0 by the mask. However, GCC warns because the precise reason this always evaluates to 0 is an overflow. Specifically, the plain 0xf000 in the expression is a signed 32-bit integer, which causes 'value' to also be promoted to a signed 32-bit integer, and the combination of the 8-bit left shift and down-cast back to __u16 results in a signed overflow (really a 'warning: overflow in conversion from int to __u16' which is propegated up through the rest of the expression leading to the ultimate overflow warning above), which is a valid warning despite being the intended result of this code. Clang does not warn on this case, likely because it performs constant folding later in the compilation process relative to GCC. It seems that by the time clang does constant folding for this expression, the side of the ternary with this overflow has already been discarded. Fortunately, this warning is easily silenced by simply making the 0xf000 mask explicitly unsigned. This has no impact on the result. Signed-off-by: David Faust <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Eduard Zingerman <[email protected]> Cc: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12tools: remove redundant ethtool.h from tooling infraTushar Vyavahare1-2271/+0
Remove the redundant ethtool.h header file from tools/include/uapi/linux. The file is unnecessary as the system uses the kernel's include/uapi/linux/ethtool.h directly. Signed-off-by: Tushar Vyavahare <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12Merge branch 'retire-progs-test_sock_addr'Alexei Starovoitov19-1274/+1992
Jordan Rife says: ==================== Retire progs/test_sock_addr.c This patch series migrates remaining tests from bpf/test_sock_addr.c to prog_tests/sock_addr.c and progs/verifier_sock_addr.c in order to fully retire the old-style test program and expands test coverage to test previously untested scenarios related to sockaddr hooks. This is a continuation of the work started recently during the expansion of prog_tests/sock_addr.c. Link: https://lore.kernel.org/bpf/[email protected]/T/#u ======= Patches ======= * Patch 1 moves tests that check valid return values for recvmsg hooks into progs/verifier_sock_addr.c, a new addition to the verifier test suite. * Patches 2-5 lay the groundwork for test migration, enabling prog_tests/sock_addr.c to handle more test dimensions. * Patches 6-11 move existing tests to prog_tests/sock_addr.c. * Patch 12 removes some redundant test cases. * Patches 14-17 expand on existing test coverage. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Expand ATTACH_REJECT testsJordan Rife1-0/+187
This expands coverage for ATTACH_REJECT tests to include connect_unix, sendmsg_unix, recvmsg*, getsockname*, and getpeername*. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Expand getsockname and getpeername testsJordan Rife5-2/+412
This expands coverage for getsockname and getpeername hooks to include getsockname4, getsockname6, getpeername4, and getpeername6. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12sefltests/bpf: Expand sockaddr hook deny testsJordan Rife7-0/+378
This patch expands test coverage for EPERM tests to include connect and bind calls and rounds out the coverage for sendmsg by adding tests for sendmsg_unix. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Expand sockaddr program return value testsJordan Rife1-0/+294
This patch expands verifier coverage for program return values to cover bind, connect, sendmsg, getsockname, and getpeername hooks. It also rounds out the recvmsg coverage by adding test cases for recvmsg_unix hooks. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Retire test_sock_addr.(c|sh)Jordan Rife4-636/+1
Fully remove test_sock_addr.c and test_sock_addr.sh, as test coverage has been fully moved to prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Remove redundant sendmsg test casesJordan Rife1-161/+0
Remove these test cases completely, as the same behavior is already covered by other sendmsg* test cases in prog_tests/sock_addr.c. This just rewrites the destination address similar to sendmsg_v4_prog and sendmsg_v6_prog. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate ATTACH_REJECT test casesJordan Rife2-146/+102
Migrate test case from bpf/test_sock_addr.c ensuring that program attachment fails when using an inappropriate attach type. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate expected_attach_type testsJordan Rife2-84/+96
Migrates tests from progs/test_sock_addr.c ensuring that programs fail to load when the expected attach type does not match. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate wildcard destination rewrite testJordan Rife3-20/+37
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg respects when sendmsg6 hooks rewrite the destination IP with the IPv6 wildcard IP, [::]. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate sendmsg6 v4 mapped address testsJordan Rife3-20/+42
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg returns -ENOTSUPP when sending to an IPv4-mapped IPv6 address to prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate sendmsg deny test casesJordan Rife4-45/+110
This set of tests checks that sendmsg calls are rejected (return -EPERM) when the sendmsg* hook returns 0. Replace those in bpf/test_sock_addr.c with corresponding tests in prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate WILDCARD_IP testJordan Rife3-20/+56
Move wildcard IP sendmsg test case out of bpf/test_sock_addr.c into prog_tests/sock_addr.c. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test casesJordan Rife1-20/+58
In preparation to move test cases from bpf/test_sock_addr.c that expect system calls to return ENOTSUPP or EPERM, this patch propagates errno from relevant system calls up to test_sock_addr() where the result can be checked. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Handle ATTACH_REJECT test casesJordan Rife1-1/+34
In preparation to move test cases from bpf/test_sock_addr.c that expect ATTACH_REJECT, this patch adds BPF_SKEL_FUNCS_RAW to generate load and destroy functions that use bpf_prog_attach() to control the attach_type. The normal load functions use bpf_program__attach_cgroup which does not have the same degree of control over the attach type, as bpf_program_attach_fd() calls bpf_link_create() with the attach type extracted from prog using bpf_program__expected_attach_type(). It is currently not possible to modify the attach type before bpf_program__attach_cgroup() is called, since bpf_program__set_expected_attach_type() has no effect after the program is loaded. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Handle LOAD_REJECT test casesJordan Rife1-5/+98
In preparation to move test cases from bpf/test_sock_addr.c that expect LOAD_REJECT, this patch adds expected_attach_type and extends load_fn to accept an expected attach type and a flag indicating whether or not rejection is expected. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Use program name for skel load/destroy functionsJordan Rife1-46/+50
In preparation to migrate tests from bpf/test_sock_addr.c to sock_addr.c, update BPF_SKEL_FUNCS so that it generates functions based on prog_name instead of skel_name. This allows us to differentiate between programs in the same skeleton. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12selftests/bpf: Migrate recvmsg* return code tests to verifier_sock_addr.cJordan Rife3-70/+39
This set of tests check that the BPF verifier rejects programs with invalid return codes (recvmsg4 and recvmsg6 hooks can only return 1). This patch replaces the tests in test_sock_addr.c with verifier_sock_addr.c, a new verifier prog_tests for sockaddr hooks, in a step towards fully retiring test_sock_addr.c. Signed-off-by: Jordan Rife <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12tools lib subcmd: Show parent options in helpNamhyung Kim1-12/+24
I've just realized that help message in a subcommand didn't show one in the parent command. Since the option parser understands the parent, display code should do the same. For example, `perf ftrace latency -h` should show options in the `perf ftrace` command too. Before: $ perf ftrace latency -h Usage: perf ftrace [<options>] [<command>] or: perf ftrace [<options>] -- [<command>] [<options>] or: perf ftrace {trace|latency} [<options>] [<command>] or: perf ftrace {trace|latency} [<options>] -- [<command>] [<options>] -b, --use-bpf Use BPF to measure function latency -n, --use-nsec Use nano-second histogram -T, --trace-funcs <func> Show latency of given function After: $ perf ftrace latency -h Usage: perf ftrace [<options>] [<command>] or: perf ftrace [<options>] -- [<command>] [<options>] or: perf ftrace {trace|latency} [<options>] [<command>] or: perf ftrace {trace|latency} [<options>] -- [<command>] [<options>] -a, --all-cpus System-wide collection from all CPUs -b, --use-bpf Use BPF to measure function latency -C, --cpu <cpu> List of cpus to monitor -n, --use-nsec Use nano-second histogram -p, --pid <pid> Trace on existing process id -T, --trace-funcs <func> Show latency of given function -v, --verbose Be more verbose --tid <tid> Trace on existing thread id (exclusive to --pid) Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-05-12riscv, bpf: make some atomic operations fully orderedPuranjay Mohan1-10/+10
The BPF atomic operations with the BPF_FETCH modifier along with BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements all atomic operations except BPF_CMPXCHG with relaxed ordering. Section 8.1 of the "The RISC-V Instruction Set Manual Volume I: Unprivileged ISA" [1], titled, "Specifying Ordering of Atomic Instructions" says: | To provide more efficient support for release consistency [5], each | atomic instruction has two bits, aq and rl, used to specify additional | memory ordering constraints as viewed by other RISC-V harts. and | If only the aq bit is set, the atomic memory operation is treated as | an acquire access. | If only the rl bit is set, the atomic memory operation is treated as a | release access. | | If both the aq and rl bits are set, the atomic memory operation is | sequentially consistent. Fix this by setting both aq and rl bits as 1 for operations with BPF_FETCH and BPF_XCHG. [1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64") Signed-off-by: Puranjay Mohan <[email protected]> Reviewed-by: Pu Lehui <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12riscv, bpf: Fix typo in commentXiao Wang1-2/+2
We can use either "instruction" or "insn" in the comment. Signed-off-by: Xiao Wang <[email protected]> Reviewed-by: Pu Lehui <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12s390/bpf: Emit a barrier for BPF_FETCH instructionsIlya Leoshkevich1-2/+6
BPF_ATOMIC_OP() macro documentation states that "BPF_ADD | BPF_FETCH" should be the same as atomic_fetch_add(), which is currently not the case on s390x: the serialization instruction "bcr 14,0" is missing. This applies to "and", "or" and "xor" variants too. s390x is allowed to reorder stores with subsequent fetches from different addresses, so code relying on BPF_FETCH acting as a barrier, for example: stw [%r0], 1 afadd [%r1], %r2 ldxw %r3, [%r4] may be broken. Fix it by emitting "bcr 14,0". Note that a separate serialization instruction is not needed for BPF_XCHG and BPF_CMPXCHG, because COMPARE AND SWAP performs serialization itself. Fixes: ba3b86b9cef0 ("s390/bpf: Implement new atomic ops") Reported-by: Puranjay Mohan <[email protected]> Closes: https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Ilya Leoshkevich <[email protected]> Reviewed-by: Puranjay Mohan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12Merge branch 'bpf-inline-helpers-in-arm64-and-riscv-jits'Alexei Starovoitov8-0/+132
Puranjay Mohan says: ==================== bpf: Inline helpers in arm64 and riscv JITs Changes in v5 -> v6: arm64 v5: https://lore.kernel.org/all/[email protected]/ riscv v2: https://lore.kernel.org/all/[email protected]/ - Combine riscv and arm64 changes in single series - Some coding style fixes Changes in v4 -> v5: v4: https://lore.kernel.org/all/[email protected]/ - Implement the inlining of the bpf_get_smp_processor_id() in the JIT. NOTE: This needs to be based on: https://lore.kernel.org/all/[email protected]/ to be built. Manual run of bpf-ci with this series rebased on above: https://github.com/kernel-patches/bpf/pull/6929 Changes in v3 -> v4: v3: https://lore.kernel.org/all/[email protected]/ - Fix coding style issue related to C89 standards. Changes in v2 -> v3: v2: https://lore.kernel.org/all/[email protected]/ - Fixed the xlated dump of percpu mov to "r0 = &(void __percpu *)(r0)" - Made ARM64 and x86-64 use the same code for inlining. The only difference that remains is the per-cpu address of the cpu_number. Changes in v1 -> v2: v1: https://lore.kernel.org/all/[email protected]/ - Add a patch to inline bpf_get_smp_processor_id() - Fix an issue in MRS instruction encoding as pointed out by Will - Remove CONFIG_SMP check because arm64 kernel always compiles with CONFIG_SMP This series adds the support of internal only per-CPU instructions and inlines the bpf_get_smp_processor_id() helper call for ARM64 and RISC-V BPF JITs. Here is an example of calls to bpf_get_smp_processor_id() and percpu_array_map_lookup_elem() before and after this series on ARM64. BPF ===== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); (85) call bpf_get_smp_processor_id#229032 (85) call bpf_get_smp_processor_id#8 p = bpf_map_lookup_elem(map, &zero); p = bpf_map_lookup_elem(map, &zero); (18) r1 = map[id:78] (18) r1 = map[id:153] (18) r2 = map[id:82][0]+65536 (18) r2 = map[id:157][0]+65536 (85) call percpu_array_map_lookup_elem#313512 (07) r1 += 496 (61) r0 = *(u32 *)(r2 +0) (35) if r0 >= 0x1 goto pc+5 (67) r0 <<= 3 (0f) r0 += r1 (79) r0 = *(u64 *)(r0 +0) (bf) r0 = &(void __percpu *)(r0) (05) goto pc+1 (b7) r0 = 0 ARM64 JIT =========== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); mov x10, #0xfffffffffffff4d0 mrs x10, sp_el0 movk x10, #0x802b, lsl #16 ldr w7, [x10, #24] movk x10, #0x8000, lsl #32 blr x10 add x7, x0, #0x0 p = bpf_map_lookup_elem(map, &zero); p = bpf_map_lookup_elem(map, &zero); mov x0, #0xffff0003ffffffff mov x0, #0xffff0003ffffffff movk x0, #0xce5c, lsl #16 movk x0, #0xe0f3, lsl #16 movk x0, #0xca00 movk x0, #0x7c00 mov x1, #0xffff8000ffffffff mov x1, #0xffff8000ffffffff movk x1, #0x8bdb, lsl #16 movk x1, #0xb0c7, lsl #16 movk x1, #0x6000 movk x1, #0xe000 mov x10, #0xffffffffffff3ed0 add x0, x0, #0x1f0 movk x10, #0x802d, lsl #16 ldr w7, [x1] movk x10, #0x8000, lsl #32 cmp x7, #0x1 blr x10 b.cs 0x0000000000000090 add x7, x0, #0x0 lsl x7, x7, #3 add x7, x7, x0 ldr x7, [x7] mrs x10, tpidr_el1 add x7, x7, x10 b 0x0000000000000094 mov x7, #0x0 Performance improvement found using benchmark[1] ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+-------------------+-------------------+--------------+ | Name | Before | After | % change | |---------------+-------------------+-------------------+--------------| | glob-arr-inc | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s | + 10.74% | | arr-inc | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s | + 5.37% | | hash-inc | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s | + 2.08% | +---------------+-------------------+-------------------+--------------+ [1] https://github.com/anakryiko/linux/commit/8dec900975ef RISCV64 JIT output for `call bpf_get_smp_processor_id` ======================================================= Before After -------- ------- auipc t1,0x848c ld a5,32(tp) jalr 604(t1) mv a5,a0 Benchmark using [1] on Qemu. ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+------------------+------------------+--------------+ | Name | Before | After | % change | |---------------+------------------+------------------+--------------| | glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% | | arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% | | hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% | +---------------+------------------+------------------+--------------+ ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12bpf, arm64: inline bpf_get_smp_processor_id() helperPuranjay Mohan3-0/+28
Inline calls to bpf_get_smp_processor_id() helper in the JIT by emitting a read from struct thread_info. The SP_EL0 system register holds the pointer to the task_struct and thread_info is the first member of this struct. We can read the cpu number from the thread_info. Here is how the ARM64 JITed assembly changes after this commit: ARM64 JIT =========== BEFORE AFTER -------- ------- int cpu = bpf_get_smp_processor_id(); int cpu = bpf_get_smp_processor_id(); mov x10, #0xfffffffffffff4d0 mrs x10, sp_el0 movk x10, #0x802b, lsl #16 ldr w7, [x10, #24] movk x10, #0x8000, lsl #32 blr x10 add x7, x0, #0x0 Performance improvement using benchmark[1] ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+-------------------+-------------------+--------------+ | Name | Before | After | % change | |---------------+-------------------+-------------------+--------------| | glob-arr-inc | 23.380 ± 1.675M/s | 25.893 ± 0.026M/s | + 10.74% | | arr-inc | 23.928 ± 0.034M/s | 25.213 ± 0.063M/s | + 5.37% | | hash-inc | 12.352 ± 0.005M/s | 12.609 ± 0.013M/s | + 2.08% | +---------------+-------------------+-------------------+--------------+ [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12arm64, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan4-0/+38
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. Since commit 7158627686f0 ("arm64: percpu: implement optimised pcpu access using tpidr_el1"), the per-cpu offset for the CPU is stored in the tpidr_el1/2 register of that CPU. To support this BPF instruction in the ARM64 JIT, the following ARM64 instructions are emitted: mov dst, src // Move src to dst, if src != dst mrs tmp, tpidr_el1/2 // Move per-cpu offset of the current cpu in tmp. add dst, dst, tmp // Add the per cpu offset to the dst. To measure the performance improvement provided by this change, the benchmark in [1] was used: Before: glob-arr-inc : 23.597 ± 0.012M/s arr-inc : 23.173 ± 0.019M/s hash-inc : 12.186 ± 0.028M/s After: glob-arr-inc : 23.819 ± 0.034M/s arr-inc : 23.285 ± 0.017M/s hash-inc : 12.419 ± 0.011M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12riscv, bpf: inline bpf_get_smp_processor_id()Puranjay Mohan4-0/+42
Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit. RISCV saves the pointer to the CPU's task_struct in the TP (thread pointer) register. This makes it trivial to get the CPU's processor id. As thread_info is the first member of task_struct, we can read the processor id from TP + offsetof(struct thread_info, cpu). RISCV64 JIT output for `call bpf_get_smp_processor_id` ====================================================== Before After -------- ------- auipc t1,0x848c ld a5,32(tp) jalr 604(t1) mv a5,a0 Benchmark using [1] on Qemu. ./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc +---------------+------------------+------------------+--------------+ | Name | Before | After | % change | |---------------+------------------+------------------+--------------| | glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% | | arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% | | hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% | +---------------+------------------+------------------+--------------+ NOTE: This benchmark includes changes from this patch and the previous patch that implemented the per-cpu insn. [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Kumar Kartikeya Dwivedi <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrsPuranjay Mohan1-0/+24
Support an instruction for resolving absolute addresses of per-CPU data from their per-CPU offsets. This instruction is internal-only and users are not allowed to use them directly. They will only be used for internal inlining optimizations for now between BPF verifier and BPF JITs. RISC-V uses generic per-cpu implementation where the offsets for CPUs are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores the address of the task_struct in TP register. The first element in task_struct is struct thread_info, and we can get the cpu number by reading from the TP register + offsetof(struct thread_info, cpu). Once we have the cpu number in a register we read the offset for that cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this offset to the destination register. To measure the improvement from this change, the benchmark in [1] was used on Qemu: Before: glob-arr-inc : 1.127 ± 0.013M/s arr-inc : 1.121 ± 0.004M/s hash-inc : 0.681 ± 0.052M/s After: glob-arr-inc : 1.138 ± 0.011M/s arr-inc : 1.366 ± 0.006M/s hash-inc : 0.676 ± 0.001M/s [1] https://github.com/anakryiko/linux/commit/8dec900975ef Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12ARC: Add eBPF JIT supportShahab Vahedi9-2/+4611
This will add eBPF JIT support to the 32-bit ARCv2 processors. The implementation is qualified by running the BPF tests on a Synopsys HSDK board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU. The test_bpf.ko reports 2-10 fold improvements in execution time of its tests. For instance: test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS test_bpf: #33 tcpdump port 22 jited:1 120 224 260 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 23 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS Deployment and structure ------------------------ The related codes are added to "arch/arc/net": - bpf_jit.h -- The interface that a back-end translator must provide - bpf_jit_core.c -- Knows how to handle the input eBPF byte stream - bpf_jit_arcv2.c -- The back-end code that knows the translation logic The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance to the whole process. Normally, the translation is done in one pass, namely the "normal pass". In case some relocations are not known during this pass, some data (arc_jit_data) is allocated for the next pass to come. This possible next (and last) pass is called the "extra pass". 1. Normal pass # The necessary pass 1a. Dry run # Get the whole JIT length, epilogue offset, etc. 1b. Emit phase # Allocate memory and start emitting instructions 2. Extra pass # Only needed if there are relocations to be fixed 2a. Patch relocations Support status -------------- The JIT compiler supports BPF instructions up to "cpu=v4". However, it does not yet provide support for: - Tail calls - Atomic operations - 64-bit division/remainder - BPF_PROBE_MEM* (exception table) The result of "test_bpf" test suite on an HSDK board is: hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] All the failing test cases are due to the ones that were not JIT'ed. Categorically, they can be represented as: .-----------.------------.-------------. | test type | opcodes | # of cases | |-----------+------------+-------------| | atomic | 0xC3, 0xDB | 149 | | div64 | 0x37, 0x3F | 22 | | mod64 | 0x97, 0x9F | 15 | `-----------^------------+-------------| | (total) 186 | `-------------' Setup: build config ------------------- The following configs must be set to have a working JIT test: CONFIG_BPF_JIT=y CONFIG_BPF_JIT_ALWAYS_ON=y CONFIG_TEST_BPF=m The following options are not necessary for the tests module, but are good to have: CONFIG_DEBUG_INFO=y # prerequisite for below CONFIG_DEBUG_INFO_BTF=y # so bpftool can generate vmlinux.h CONFIG_FTRACE=y # CONFIG_BPF_SYSCALL=y # all these options lead to CONFIG_KPROBE_EVENTS=y # having CONFIG_BPF_EVENTS=y CONFIG_PERF_EVENTS=y # Some BPF programs provide data through /sys/kernel/debug: CONFIG_DEBUG_FS=y arc# mount -t debugfs debugfs /sys/kernel/debug Setup: elfutils --------------- The libdw.{so,a} library that is used by pahole for processing the final binary must come from elfutils 0.189 or newer. The support for ARCv2 [1] has been added since that version. [1] https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7 Setup: pahole ------------- The line below in linux/scripts/Makefile.btf must be commented out: pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats Or else, the build will fail: $ make V=1 ... BTF .btf.vmlinux.bin.o pahole -J --btf_gen_floats \ -j --lang_exclude=rust \ --skip_encoding_btf_inconsistent_proto \ --btf_gen_optimized .tmp_vmlinux.btf Complex, interval and imaginary float types are not supported Encountered error while encoding BTF. ... BTFIDS vmlinux ./tools/bpf/resolve_btfids/resolve_btfids vmlinux libbpf: failed to find '.BTF' ELF section in vmlinux FAILED: load BTF from vmlinux: No data available This is due to the fact that the ARC toolchains generate "complex float" DIE entries in libgcc and at the moment, pahole can't handle such entries. Running the tests ----------------- host$ scp /bld/linux/lib/test_bpf.ko arc: arc # sysctl net.core.bpf_jit_enable=1 arc # insmod test_bpf.ko test_suite=test_bpf ... test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed] Acknowledgments --------------- - Claudiu Zissulescu for his unwavering support - Yuriy Kolerov for testing and troubleshooting - Vladimir Isaev for the pahole workaround - Sergey Matyukevich for paving the road by adding the interpreter support Signed-off-by: Shahab Vahedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-12hwmon: (nzxt-kraken3) Bail out for unsupported device variantsGuenter Roeck1-1/+2
Dan Carpenter reports: Commit cbeb479ff4cd ("hwmon: (nzxt-kraken3) Decouple device names from kinds") from Apr 28, 2024 (linux-next), leads to the following Smatch static checker warning: drivers/hwmon/nzxt-kraken3.c:957 kraken3_probe() error: uninitialized symbol 'device_name'. Indeed, 'device_name' will be uninitizalized if an unknown product is encountered. In practice this should not matter because the driver should not instantiate on unknown products, but lets play safe and bail out if that happens. Reported-by: Dan Carpenter <[email protected]> Closes: https://lore.kernel.org/linux-hwmon/[email protected]/ Cc: Jonas Malaco <[email protected]> Cc: Aleksa Savic <[email protected]> Fixes: cbeb479ff4cd ("hwmon: (nzxt-kraken3) Decouple device names from kinds") Acked-by: Jonas Malaco <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2024-05-12Linux 6.9Linus Torvalds1-1/+1
2024-05-12Merge tag 'kselftest-fix-vfork-2024-05-12' of ↵Linus Torvalds4-67/+147
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull Kselftest fixes from Mickaël Salaün: "Fix Kselftest's vfork() side effects. As reported by Kernel Test Robot and Sean Christopherson, some tests fail since v6.9-rc1 . This is due to the use of vfork() which introduced some side effects. Similarly, while making it more generic, a previous commit made some Landlock file system tests flaky, and subject to the host's file system mount configuration. This fixes all these side effects by replacing vfork() with clone3() and CLONE_VFORK, which is cleaner (no arbitrary shared memory) and makes the Kselftest framework more robust" Link: https://lore.kernel.org/oe-lkp/[email protected] Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] * tag 'kselftest-fix-vfork-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/harness: Handle TEST_F()'s explicit exit codes selftests/harness: Fix vfork() side effects selftests/harness: Share _metadata between forked processes selftests/pidfd: Fix wrong expectation selftests/harness: Constify fixture variants selftests/landlock: Do not allocate memory in fixture data selftests/harness: Fix interleaved scheduling leading to race conditions selftests/harness: Fix fixture teardown selftests/landlock: Fix FS tests when run on a private mount point selftests/pidfd: Fix config for pidfd_setns_test
2024-05-12Merge tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
Pull kvm fix from Paolo Bonzini: - Fix NULL pointer read on s390 in ioctl(KVM_CHECK_EXTENSION) for /dev/kvm * tag 'for-linus-6.9' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: Check kvm pointer when testing KVM_CAP_S390_HPAGE_1M
2024-05-12Merge tag 'edac_urgent_for_v6.9' of ↵Linus Torvalds1-13/+37
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Fix a race condition when clearing error count bits and toggling the error interrupt throug the same register, in synopsys_edac * tag 'edac_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/synopsys: Fix ECC status and IRQ control race condition
2024-05-12hwmon: (emc1403) Add support for EMC1428 and EMC1438.Lars Petter Mostad2-14/+124
EMC1428 and EMC1438 are similar to EMC14xx, but have eight temperature channels, as well as signed data and limit registers. Chips currently supported by this driver have unsigned registers only. Signed-off-by: Lars Petter Mostad <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]>
2024-05-12Merge tag 'x86_urgent_for_v6.9' of ↵Linus Torvalds3-9/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a new PCI ID which belongs to a new AMD CPU family 0x1a - Ensure that that last level cache ID is set in all cases, in the AMD CPU topology parsing code, in order to prevent invalid scheduling domain CPU masks * tag 'x86_urgent_for_v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology/amd: Ensure that LLC ID is initialized x86/amd_nb: Add new PCI IDs for AMD family 0x1a
2024-05-12RDMA/cma: Fix kmemleak in rdma_core observed during blktests nvme/rdma use siwZhu Yanjun1-1/+3
When running blktests nvme/rdma, the following kmemleak issue will appear. kmemleak: Kernel memory leak detector initialized (mempool available:36041) kmemleak: Automatic memory scanning thread started kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 8 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 17 new suspected memory leaks (see /sys/kernel/debug/kmemleak) kmemleak: 4 new suspected memory leaks (see /sys/kernel/debug/kmemleak) unreferenced object 0xffff88855da53400 (size 192): comm "rdma", pid 10630, jiffies 4296575922 hex dump (first 32 bytes): 37 00 00 00 00 00 00 00 c0 ff ff ff 1f 00 00 00 7............... 10 34 a5 5d 85 88 ff ff 10 34 a5 5d 85 88 ff ff .4.].....4.].... backtrace (crc 47f66721): [<ffffffff911251bd>] kmalloc_trace+0x30d/0x3b0 [<ffffffffc2640ff7>] alloc_gid_entry+0x47/0x380 [ib_core] [<ffffffffc2642206>] add_modify_gid+0x166/0x930 [ib_core] [<ffffffffc2643468>] ib_cache_update.part.0+0x6d8/0x910 [ib_core] [<ffffffffc2644e1a>] ib_cache_setup_one+0x24a/0x350 [ib_core] [<ffffffffc263949e>] ib_register_device+0x9e/0x3a0 [ib_core] [<ffffffffc2a3d389>] 0xffffffffc2a3d389 [<ffffffffc2688cd8>] nldev_newlink+0x2b8/0x520 [ib_core] [<ffffffffc2645fe3>] rdma_nl_rcv_msg+0x2c3/0x520 [ib_core] [<ffffffffc264648c>] rdma_nl_rcv_skb.constprop.0.isra.0+0x23c/0x3a0 [ib_core] [<ffffffff9270e7b5>] netlink_unicast+0x445/0x710 [<ffffffff9270f1f1>] netlink_sendmsg+0x761/0xc40 [<ffffffff9249db29>] __sys_sendto+0x3a9/0x420 [<ffffffff9249dc8c>] __x64_sys_sendto+0xdc/0x1b0 [<ffffffff92db0ad3>] do_syscall_64+0x93/0x180 [<ffffffff92e00126>] entry_SYSCALL_64_after_hwframe+0x71/0x79 The root cause: rdma_put_gid_attr is not called when sgid_attr is set to ERR_PTR(-ENODEV). Reported-and-tested-by: Yi Zhang <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/T/ Fixes: f8ef1be816bf ("RDMA/cma: Avoid GID lookups on iWARP devices") Reviewed-by: Chuck Lever <[email protected]> Signed-off-by: Zhu Yanjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2024-05-12ALSA: scarlett2: Increase mixer range to +12dBGeoffrey D. Bennett1-4/+5
The values loaded into the mixer are 16-bit values, with 8192 representing 0dB, going up to a current maximum of 16345 (+6dB). All supported interfaces have no problem going up to 32612 (+12dB), so update SCARLETT2_MIXER_MAX_DB and scarlett2_mixer_values[] to allow for this. Tested with: - Scarlett 2nd Gen 6i6, 18i8, 18i20 - Scarlett 3rd Gen 4i4, 8i6, 18i8, 18i20 - Scarlett 4th Gen Solo, 2i2, 4i4 - Clarett+ 2Pre, 4Pre, 8Pre - Vocaster One and Two Signed-off-by: Geoffrey D. Bennett <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-05-12ALSA: scarlett2: Add S/PDIF source selection controlsGeoffrey D. Bennett1-0/+179
Add S/PDIF Source/Digital I/O Mode selection controls for the Scarlett 3rd Gen 18i8/18i20 and Clarett 4Pre/8Pre interfaces. These models have both coax S/PDIF and optical inputs, and the optical inputs are switchable between being used as S/PDIF and ADAT inputs. The Scarlett 3rd Gen 18i20 also has a "Dual ADAT" mode for 8-channel audio at 88.2/96kHz. Signed-off-by: Geoffrey D. Bennett <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2024-05-12RDMA/IPoIB: Fix format truncation compilation errorsLeon Romanovsky1-2/+6
Truncate the device name to store IPoIB VLAN name. [leonro@5b4e8fba4ddd kernel]$ make -s -j 20 allmodconfig [leonro@5b4e8fba4ddd kernel]$ make -s -j 20 W=1 drivers/infiniband/ulp/ipoib/ drivers/infiniband/ulp/ipoib/ipoib_vlan.c: In function ‘ipoib_vlan_add’: drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:52: error: ‘%04x’ directive output may be truncated writing 4 bytes into a region of size between 0 and 15 [-Werror=format-truncation=] 187 | snprintf(intf_name, sizeof(intf_name), "%s.%04x", | ^~~~ drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:48: note: directive argument in the range [0, 65535] 187 | snprintf(intf_name, sizeof(intf_name), "%s.%04x", | ^~~~~~~~~ drivers/infiniband/ulp/ipoib/ipoib_vlan.c:187:9: note: ‘snprintf’ output between 6 and 21 bytes into a destination of size 16 187 | snprintf(intf_name, sizeof(intf_name), "%s.%04x", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 188 | ppriv->dev->name, pkey); | ~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make[6]: *** [scripts/Makefile.build:244: drivers/infiniband/ulp/ipoib/ipoib_vlan.o] Error 1 make[6]: *** Waiting for unfinished jobs.... Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support") Link: https://lore.kernel.org/r/e9d3e1fef69df4c9beaf402cc3ac342bad680791.1715240029.git.leon@kernel.org Signed-off-by: Leon Romanovsky <[email protected]>
2024-05-12Merge tag 'kvm-x86-misc-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini7-31/+53
KVM x86 misc changes for 6.10: - Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field, which is unused by hardware, so that KVM can communicate its inability to map GPAs that set bits 51:48 due to lack of 5-level paging. Guest firmware is expected to use the information to safely remap BARs in the uppermost GPA space, i.e to avoid placing a BAR at a legal, but unmappable, GPA. - Use vfree() instead of kvfree() for allocations that always use vcalloc() or __vcalloc(). - Don't completely ignore same-value writes to immutable feature MSRs, as doing so results in KVM failing to reject accesses to MSR that aren't supposed to exist given the vCPU model and/or KVM configuration. - Don't mark APICv as being inhibited due to ABSENT if APICv is disabled KVM-wide to avoid confusing debuggers (KVM will never bother clearing the ABSENT inhibit, even if userspace enables in-kernel local APIC).
2024-05-12Merge tag 'kvm-x86-mmu-6.10' of https://github.com/kvm-x86/linux into HEADPaolo Bonzini2-29/+66
KVM x86 MMU changes for 6.10: - Process TDP MMU SPTEs that are are zapped while holding mmu_lock for read after replacing REMOVED_SPTE with '0' and flushing remote TLBs, which allows vCPU tasks to repopulate the zapped region while the zapper finishes tearing down the old, defunct page tables. - Fix a longstanding, likely benign-in-practice race where KVM could fail to detect a write from kvm_mmu_track_write() to a shadowed GPTE if the GPTE is first page table being shadowed.