Age | Commit message (Collapse) | Author | Files | Lines |
|
This adds tests for the various replacement operations using
IFLA_XDP_EXPECTED_FD.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This adds a new function to set the XDP fd while specifying the FD of the
program to replace, using the newly added IFLA_XDP_EXPECTED_FD netlink
parameter. The new function uses the opts struct mechanism to be extendable
in the future.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This adds the IFLA_XDP_EXPECTED_FD netlink attribute definition and the
XDP_FLAGS_REPLACE flag to if_link.h in tools/include.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Fix a sharp edge in xsk_umem__create and xsk_socket__create. Almost all of
the members of the ring buffer structs are initialized, but the "cached_xxx"
variables are not all initialized. The caller is required to zero them.
This is needlessly dangerous. The results if you don't do it can be very bad.
For example, they can cause xsk_prod_nb_free and xsk_cons_nb_avail to return
values greater than the size of the queue. xsk_ring_cons__peek can return an
index that does not refer to an item that has been queued.
I have confirmed that without this change, my program misbehaves unless I
memset the ring buffers to zero before calling the function. Afterwards,
my program works without (or with) the memset.
Signed-off-by: Fletcher Dunn <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Magnus Karlsson <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into locking/core
Pull uaccess futex cleanups for Al Viro:
Consolidate access_ok() usage and the futex uaccess function zoo.
|
|
it's not really different from e.g. __sanitizer_cov_trace_cmp4();
as it is, the switches that generate an array of labels get
rejected by objtool, while slightly different set of cases
that gets compiled into a series of comparisons is accepted.
Signed-off-by: Al Viro <[email protected]>
|
|
Add various tests to make sure the verifier keeps catching them:
# ./test_verifier
[...]
#230/p pass ctx or null check, 1: ctx OK
#231/p pass ctx or null check, 2: null OK
#232/p pass ctx or null check, 3: 1 OK
#233/p pass ctx or null check, 4: ctx - const OK
#234/p pass ctx or null check, 5: null (connect) OK
#235/p pass ctx or null check, 6: null (bind) OK
#236/p pass ctx or null check, 7: ctx (bind) OK
#237/p pass ctx or null check, 8: null (bind) OK
[...]
Summary: 1595 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/c74758d07b1b678036465ef7f068a49e9efd3548.1585323121.git.daniel@iogearbox.net
|
|
Enable the bpf_get_current_cgroup_id() helper for connect(), sendmsg(),
recvmsg() and bind-related hooks in order to retrieve the cgroup v2
context which can then be used as part of the key for BPF map lookups,
for example. Given these hooks operate in process context 'current' is
always valid and pointing to the app that is performing mentioned
syscalls if it's subject to a v2 cgroup. Also with same motivation of
commit 7723628101aa ("bpf: Introduce bpf_skb_ancestor_cgroup_id helper")
enable retrieval of ancestor from current so the cgroup id can be used
for policy lookups which can then forbid connect() / bind(), for example.
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/d2a7ef42530ad299e3cbb245e6c12374b72145ef.1585323121.git.daniel@iogearbox.net
|
|
In Cilium we're mainly using BPF cgroup hooks today in order to implement
kube-proxy free Kubernetes service translation for ClusterIP, NodePort (*),
ExternalIP, and LoadBalancer as well as HostPort mapping [0] for all traffic
between Cilium managed nodes. While this works in its current shape and avoids
packet-level NAT for inter Cilium managed node traffic, there is one major
limitation we're facing today, that is, lack of netns awareness.
In Kubernetes, the concept of Pods (which hold one or multiple containers)
has been built around network namespaces, so while we can use the global scope
of attaching to root BPF cgroup hooks also to our advantage (e.g. for exposing
NodePort ports on loopback addresses), we also have the need to differentiate
between initial network namespaces and non-initial one. For example, ExternalIP
services mandate that non-local service IPs are not to be translated from the
host (initial) network namespace as one example. Right now, we have an ugly
work-around in place where non-local service IPs for ExternalIP services are
not xlated from connect() and friends BPF hooks but instead via less efficient
packet-level NAT on the veth tc ingress hook for Pod traffic.
On top of determining whether we're in initial or non-initial network namespace
we also have a need for a socket-cookie like mechanism for network namespaces
scope. Socket cookies have the nice property that they can be combined as part
of the key structure e.g. for BPF LRU maps without having to worry that the
cookie could be recycled. We are planning to use this for our sessionAffinity
implementation for services. Therefore, add a new bpf_get_netns_cookie() helper
which would resolve both use cases at once: bpf_get_netns_cookie(NULL) would
provide the cookie for the initial network namespace while passing the context
instead of NULL would provide the cookie from the application's network namespace.
We're using a hole, so no size increase; the assignment happens only once.
Therefore this allows for a comparison on initial namespace as well as regular
cookie usage as we have today with socket cookies. We could later on enable
this helper for other program types as well as we would see need.
(*) Both externalTrafficPolicy={Local|Cluster} types
[0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/c47d2346982693a9cf9da0e12690453aded4c788.1585323121.git.daniel@iogearbox.net
|
|
Linux 5.6-rc7
|
|
Commit 0161a94e2d1c7 ("tools: gpio: Correctly add make dependencies for
gpio_utils") added a make rule for gpio-utils-in.o but used $(output)
instead of the correct $(OUTPUT) for the output directory, breaking
out-of-tree build (O=xx) with the following error:
No rule to make target 'out/tools/gpio/gpio-utils-in.o', needed by 'out/tools/gpio/lsgpio-in.o'. Stop.
Fix that.
Fixes: 0161a94e2d1c ("tools: gpio: Correctly add make dependencies for gpio_utils")
Cc: <[email protected]>
Cc: Laura Abbott <[email protected]>
Signed-off-by: Anssi Hannula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
A new file was added to the tracing directory that will allow a user to
place a PID into it and the task associated to that PID will not have its
events traced. If the event-fork option is enabled, then neither will the
children of that task have its events traced.
Cc: [email protected]
Cc: Shuah Khan <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
A new file was added to the tracing directory that will allow a user to
place a PID into it and the task associated to that PID will not be traced
by the function tracer. If the function-fork option is enabled, then neither
will the children of that task be traced by the function tracer.
Cc: [email protected]
Cc: Shuah Khan <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
|
|
For some kind of analysis a deltatime output is more human friendly and
reduce the cognitive load for further analysis.
The following output demonstrate the new option "deltatime": calculate
the time difference in relation to the previous event.
$ perf script --deltatime
test 2525 [001] 0.000000: sdt_libev:ev_add: (5635e72a5ebd)
test 2525 [001] 0.000091: sdt_libev:epoll_wait_enter: (5635e72a76a9)
test 2525 [001] 1.000051: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
test 2525 [001] 0.000685: sdt_libev:ev_add: (5635e72a5ebd)
test 2525 [001] 0.000048: sdt_libev:epoll_wait_enter: (5635e72a76a9)
test 2525 [001] 1.000104: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
test 2525 [001] 0.003895: sdt_libev:epoll_wait_enter: (5635e72a76a9)
test 2525 [001] 0.996034: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
test 2525 [001] 0.000058: sdt_libev:epoll_wait_enter: (5635e72a76a9)
test 2525 [001] 1.000004: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
test 2525 [001] 0.000064: sdt_libev:epoll_wait_enter: (5635e72a76a9)
test 2525 [001] 0.999934: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
test 2525 [001] 0.000056: sdt_libev:epoll_wait_enter: (5635e72a76a9)
test 2525 [001] 0.999930: sdt_libev:epoll_wait_return: (5635e72a772e) arg1=1
Committer testing:
So go from default output to --reltime and then this new --deltatime, to
contrast the various timestamp presentation modes for a random perf.data file I
had laying around:
[root@five ~]# perf script --reltime | head
perf 442394 [000] 0.000000: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000002: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000004: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000006: 128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000009: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000036: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000038: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000040: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000041: 224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000044: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
[root@five ~]# perf script --deltatime | head
perf 442394 [000] 0.000000: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000002: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000001: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000001: 128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 0.000002: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000027: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000002: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000001: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000001: 224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 0.000002: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
[root@five ~]# perf script | head
perf 442394 [000] 7600.157861: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 7600.157864: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 7600.157866: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 7600.157867: 128 cycles: ffffffff972415a1 perf_event_update_userpage+0x1 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [000] 7600.157870: 2597 cycles: ffffffff97463785 cap_task_setscheduler+0x5 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 7600.157897: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 7600.157900: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 7600.157901: 16 cycles: ffffffff9706e544 native_write_msr+0x4 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 7600.157903: 224 cycles: ffffffff9700a53a perf_ibs_handle_irq+0x1da (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
perf 442394 [001] 7600.157906: 4439 cycles: ffffffff97120d85 put_prev_entity+0x45 (/usr/lib/debug/lib/modules/5.5.10-200.fc31.x86_64/vmlinux)
[root@five ~]#
Andi suggested we better implement it as a new field, i.e. -F deltatime, like:
[root@five ~]# perf script -F deltatime
Invalid field requested.
Usage: perf script [<options>]
or: perf script [<options>] record <script> [<record-options>] <command>
or: perf script [<options>] report <script> [script-args]
or: perf script [<options>] <script> [<record-options>] <command>
or: perf script [<options>] <top-script> [script-args]
-F, --fields <str> comma separated output fields prepend with 'type:'. +field to add and -field to remove.Valid types: hw,sw,trace,raw,synth. Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,srcline,period,iregs,uregs,brstack,brstacksym,flags,bpf-output,brstackinsn,brstackoff,callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc
[root@five ~]#
I.e. we have -F for maximum flexibility:
[root@five ~]# perf script -F comm,pid,cpu,time | head
perf 442394 [000] 7600.157861:
perf 442394 [000] 7600.157864:
perf 442394 [000] 7600.157866:
perf 442394 [000] 7600.157867:
perf 442394 [000] 7600.157870:
perf 442394 [001] 7600.157897:
perf 442394 [001] 7600.157900:
perf 442394 [001] 7600.157901:
perf 442394 [001] 7600.157903:
perf 442394 [001] 7600.157906:
[root@five ~]#
But since we already have --reltime, having --deltatime, documented one after
the other is sensible.
Signed-off-by: Hagen Paul Pfeifer <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
[ Added 'perf script' man page entry for --deltatime ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add to the "x86 instruction decoder - new instructions" test the
following instructions:
incsspd
incsspq
rdsspd
rdsspq
saveprevssp
rstorssp
wrssd
wrssq
wrussd
wrussq
setssbsy
clrssbsy
endbr32
endbr64
And the "notrack" prefix for indirect calls and jumps.
For information about the instructions, refer Intel Control-flow
Enforcement Technology Specification May 2019 (334525-003).
Committer testing:
$ perf test instr
67: x86 instruction decoder - new instructions : Ok
$
Then use verbose mode and check one of those new instructions:
$ perf test -v instr |& grep saveprevssp
Decoded ok: f3 0f 01 ea saveprevssp
Decoded ok: f3 0f 01 ea saveprevssp
$
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi v. Shankar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add the following CET instructions to the opcode map:
INCSSP:
Increment Shadow Stack pointer (SSP).
RDSSP:
Read SSP into a GPR.
SAVEPREVSSP:
Use "previous ssp" token at top of current Shadow Stack (SHSTK) to
create a "restore token" on the previous (outgoing) SHSTK.
RSTORSSP:
Restore from a "restore token" to SSP.
WRSS:
Write to kernel-mode SHSTK (kernel-mode instruction).
WRUSS:
Write to user-mode SHSTK (kernel-mode instruction).
SETSSBSY:
Verify the "supervisor token" pointed by MSR_IA32_PL0_SSP, set the
token busy, and set then Shadow Stack pointer(SSP) to the value of
MSR_IA32_PL0_SSP.
CLRSSBSY:
Verify the "supervisor token" and clear its busy bit.
ENDBR64/ENDBR32:
Mark a valid 64/32 bit control transfer endpoint.
Detailed information of CET instructions can be found in Intel Software
Developer's Manual.
Signed-off-by: Yu-cheng Yu <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi v. Shankar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch adds a new MACsec offloading option, MACSEC_OFFLOAD_MAC,
allowing a user to select a MAC as a provider for MACsec offloading
operations.
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: Mark Starovoytov <[email protected]>
Signed-off-by: Igor Russkikh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Implement the .snapshot region operation for the dummy data region. This
enables a region snapshot to be taken upon request via the new
DEVLINK_CMD_REGION_SNAPSHOT command.
Signed-off-by: Jacob Keller <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently the test checks the observable effect of skbedit priority:
queueing of packets at the correct qdisc band. It therefore misses the fact
that the counters for offloaded rules are not updated. Add an extra check
for the counter.
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add local header dependency in lib.mk. This enforces the dependency
blindly even when a test doesn't include the file, with the benefit
of a simpler common logic without requiring individual tests to have
special rule for it.
Signed-off-by: Shuah Khan <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Fix memfd to support relocatable build (O=objdir). This calls out
source files necessary to build tests and simplfies the dependency
enforcement.
Tested the following:
Note that cross-build for fuse_mnt has dependency on -lfuse.
make all
make clean
make kselftest-install O=/arm64_build/ ARCH=arm64 HOSTCC=gcc \
CROSS_COMPILE=aarch64-linux-gnu- TARGETS=memfd
Signed-off-by: Shuah Khan <[email protected]>
|
|
Fix seccomp relocatable builds. This is a simple fix to use the
right lib.mk variable TEST_GEN_PROGS. Local header dependency
is addressed in a change to lib.mk as a framework change that
enforces the dependency without requiring changes to individual
tests.
The following use-cases work with this change:
In seccomp directory:
make all and make clean
From top level from main Makefile:
make kselftest-install O=objdir ARCH=arm64 HOSTCC=gcc \
CROSS_COMPILE=aarch64-linux-gnu- TARGETS=seccomp
Signed-off-by: Shuah Khan <[email protected]>
Acked-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
When a selftest would timeout before, the program would just fall over
and no accounting of failures would be reported (i.e. it would result in
an incomplete TAP report). Instead, add an explicit SIGALRM handler to
cleanly catch and report the timeout.
Before:
[==========] Running 2 tests from 2 test cases.
[ RUN ] timeout.finish
[ OK ] timeout.finish
[ RUN ] timeout.too_long
Alarm clock
After:
[==========] Running 2 tests from 2 test cases.
[ RUN ] timeout.finish
[ OK ] timeout.finish
[ RUN ] timeout.too_long
timeout.too_long: Test terminated by timeout
[ FAIL ] timeout.too_long
[==========] 1 / 2 tests passed.
[ FAILED ]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
In order to better handle timeout failures, rearrange the child waiting
logic into a separate function. This is mostly a copy/paste with an
indentation change. To handle pid tracking, a new field is added for
the child pid. Also move the alarm() pairing into the function.
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Add a missing raw dmesg test log to test the kunit_tool's dmesg parser.
test_prefix_poundsign and test_output_with_prefix_isolated_correctly
fail without this test log.
Signed-off-by: Brendan Higgins <[email protected]>
Tested-by: David Gow <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Introduce KUNIT_SUBTEST_INDENT macro which corresponds to 4-space
indentation and KUNIT_SUBSUBTEST_INDENT macro which corresponds to
8-space indentation in line with TAP spec (e.g. see "Subtests"
section of https://node-tap.org/tap-protocol/).
Use these macros in place of one or two tabs in strings to clarify
why we are indenting.
Suggested-by: Frank Rowand <[email protected]>
Signed-off-by: Alan Maguire <[email protected]>
Reviewed-by: Frank Rowand <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
A fixed y-axis scale was missed during a change to autoscale.
Correct it.
Fixes: 709bd70d070ee ("tools/power/x86/intel_pstate_tracer: change several graphs to autoscale y-axis")
Signed-off-by: Doug Smythies <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
When DSCP is updated through an offloaded pedit action, DSCP rewrite on
egress should be disabled. Add a test that check that it is so.
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a test that runs packets with dsfield set, and test that pedit adjusts
the DSCP or ECN parts or the whole field.
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The $(CC) passed to arch_errno_names.sh may include a series of parameters
along with gcc itself. To avoid overwriting the following parameters of
arch_errno_names.sh and break the build like below, we just pick up the
first word of the $(CC).
find: unknown predicate `-m64/arch'
x86_64-wrs-linux-gcc: warning: '-x c' after last input file has no effect
x86_64-wrs-linux-gcc: error: unrecognized command line option '-m64/include/uapi/asm-generic/errno.h'
x86_64-wrs-linux-gcc: fatal error: no input files
Signed-off-by: He Zhe <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Terms may have a NULL config in which case a strcmp will SEGV. This can
be reproduced with:
perf stat -e '*/event=?,nr/' sleep 1
Add a NULL check to avoid this. This was caught by LLVM's libfuzzer.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add to the "x86 instruction decoder - new instructions" test the following
instructions:
incsspd
incsspq
rdsspd
rdsspq
saveprevssp
rstorssp
wrssd
wrssq
wrussd
wrussq
setssbsy
clrssbsy
endbr32
endbr64
And the notrack prefix for indirect calls and jumps.
For information about the instructions, refer Intel Control-flow
Enforcement Technology Specification May 2019 (334525-003).
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Add the following CET instructions to the opcode map:
INCSSP:
Increment Shadow Stack pointer (SSP).
RDSSP:
Read SSP into a GPR.
SAVEPREVSSP:
Use "previous ssp" token at top of current Shadow Stack (SHSTK) to
create a "restore token" on the previous (outgoing) SHSTK.
RSTORSSP:
Restore from a "restore token" to SSP.
WRSS:
Write to kernel-mode SHSTK (kernel-mode instruction).
WRUSS:
Write to user-mode SHSTK (kernel-mode instruction).
SETSSBSY:
Verify the "supervisor token" pointed by MSR_IA32_PL0_SSP, set the
token busy, and set then Shadow Stack pointer(SSP) to the value of
MSR_IA32_PL0_SSP.
CLRSSBSY:
Verify the "supervisor token" and clear its busy bit.
ENDBR64/ENDBR32:
Mark a valid 64/32 bit control transfer endpoint.
Detailed information of CET instructions can be found in Intel Software
Developer's Manual.
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Adrian Hunter <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Fix a copy-paste typo in a comment and error message.
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
After changes to add update_reg_bounds after ALU ops and adding ALU32
bounds tracking the error message is changed in the 32-bit right shift
tests.
Test "#70/u bounds check after 32-bit right shift with 64-bit input FAIL"
now fails with,
Unexpected error message!
EXP: R0 invalid mem access
RES: func#0 @0
7: (b7) r1 = 2
8: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP2 R10=fp0 fp-8_w=mmmmmmmm
8: (67) r1 <<= 31
9: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP4294967296 R10=fp0 fp-8_w=mmmmmmmm
9: (74) w1 >>= 31
10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP0 R10=fp0 fp-8_w=mmmmmmmm
10: (14) w1 -= 2
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=invP4294967294 R10=fp0 fp-8_w=mmmmmmmm
11: (0f) r0 += r1
math between map_value pointer and 4294967294 is not allowed
And test "#70/p bounds check after 32-bit right shift with 64-bit input
FAIL" now fails with,
Unexpected error message!
EXP: R0 invalid mem access
RES: func#0 @0
7: (b7) r1 = 2
8: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv2 R10=fp0 fp-8_w=mmmmmmmm
8: (67) r1 <<= 31
9: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv4294967296 R10=fp0 fp-8_w=mmmmmmmm
9: (74) w1 >>= 31
10: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv0 R10=fp0 fp-8_w=mmmmmmmm
10: (14) w1 -= 2
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,imm=0) R1_w=inv4294967294 R10=fp0 fp-8_w=mmmmmmmm
11: (0f) r0 += r1
last_idx 11 first_idx 0
regs=2 stack=0 before 10: (14) w1 -= 2
regs=2 stack=0 before 9: (74) w1 >>= 31
regs=2 stack=0 before 8: (67) r1 <<= 31
regs=2 stack=0 before 7: (b7) r1 = 2
math between map_value pointer and 4294967294 is not allowed
Before this series we did not trip the "math between map_value pointer..."
error because check_reg_sane_offset is never called in
adjust_ptr_min_max_vals(). Instead we have a register state that looks
like this at line 11*,
11: R0_w=map_value(id=0,off=0,ks=8,vs=8,
smin_value=0,smax_value=0,
umin_value=0,umax_value=0,
var_off=(0x0; 0x0))
R1_w=invP(id=0,
smin_value=0,smax_value=4294967295,
umin_value=0,umax_value=4294967295,
var_off=(0xfffffffe; 0x0))
R10=fp(id=0,off=0,
smin_value=0,smax_value=0,
umin_value=0,umax_value=0,
var_off=(0x0; 0x0)) fp-8_w=mmmmmmmm
11: (0f) r0 += r1
In R1 'smin_val != smax_val' yet we have a tnum_const as seen
by 'var_off(0xfffffffe; 0x0))' with a 0x0 mask. So we hit this check
in adjust_ptr_min_max_vals()
if ((known && (smin_val != smax_val || umin_val != umax_val)) ||
smin_val > smax_val || umin_val > umax_val) {
/* Taint dst register if offset had invalid bounds derived from
* e.g. dead branches.
*/
__mark_reg_unknown(env, dst_reg);
return 0;
}
So we don't throw an error here and instead only throw an error
later in the verification when the memory access is made.
The root cause in verifier without alu32 bounds tracking is having
'umin_value = 0' and 'umax_value = U64_MAX' from BPF_SUB which we set
when 'umin_value < umax_val' here,
if (dst_reg->umin_value < umax_val) {
/* Overflow possible, we know nothing */
dst_reg->umin_value = 0;
dst_reg->umax_value = U64_MAX;
} else { ...}
Later in adjust_calar_min_max_vals we previously did a
coerce_reg_to_size() which will clamp the U64_MAX to U32_MAX by
truncating to 32bits. But either way without a call to update_reg_bounds
the less precise bounds tracking will fall out of the alu op
verification.
After latest changes we now exit adjust_scalar_min_max_vals with the
more precise umin value, due to zero extension propogating bounds from
alu32 bounds into alu64 bounds and then calling update_reg_bounds.
This then causes the verifier to trigger an earlier error and we get
the error in the output above.
This patch updates tests to reflect new error message.
* I have a local patch to print entire verifier state regardless if we
believe it is a constant so we can get a full picture of the state.
Usually if tnum_is_const() then bounds are also smin=smax, etc. but
this is not always true and is a bit subtle. Being able to see these
states helps understand dataflow imo. Let me know if we want something
similar upstream.
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/158507161475.15666.3061518385241144063.stgit@john-Precision-5820-Tower
|
|
Overlapping header include additions in macsec.c
A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c
Overlapping test additions in selftests Makefile
Overlapping PCI ID table adjustments in iwlwifi driver.
Signed-off-by: David S. Miller <[email protected]>
|
|
For each prog/btf load we allocate and free 16 megs of verifier buffer.
On production systems it doesn't really make sense because the
programs/btf have gone through extensive testing and (mostly) guaranteed
to successfully load.
Let's assume successful case by default and skip buffer allocation
on the first try. If there is an error, start with BPF_LOG_BUF_SIZE
and double it on each ENOSPC iteration.
v3:
* Return -ENOMEM when can't allocate log buffer (Andrii Nakryiko)
v2:
* Don't allocate the buffer at all on the first try (Andrii Nakryiko)
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Has been unused since commit ef99b02b23ef ("libbpf: capture value in BTF
type info for BTF-defined map defs").
Signed-off-by: Tobias Klauser <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Quentin Monnet <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Pull networking fixes from David Miller:
1) Fix deadlock in bpf_send_signal() from Yonghong Song.
2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan.
3) Add missing locking in iwlwifi mvm code, from Avraham Stern.
4) Fix MSG_WAITALL handling in rxrpc, from David Howells.
5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong
Wang.
6) Fix producer race condition in AF_PACKET, from Willem de Bruijn.
7) cls_route removes the wrong filter during change operations, from
Cong Wang.
8) Reject unrecognized request flags in ethtool netlink code, from
Michal Kubecek.
9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from
Doug Berger.
10) Don't leak ct zone template in act_ct during replace, from Paul
Blakey.
11) Fix flushing of offloaded netfilter flowtable flows, also from Paul
Blakey.
12) Fix throughput drop during tx backpressure in cxgb4, from Rahul
Lakkireddy.
13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric
Dumazet.
14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well,
also from Eric Dumazet.
15) Restrict macsec to ethernet devices, from Willem de Bruijn.
16) Fix reference leak in some ethtool *_SET handlers, from Michal
Kubecek.
17) Fix accidental disabling of MSI for some r8169 chips, from Heiner
Kallweit.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
net: ena: Add PCI shutdown handler to allow safe kexec
selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
selftests/net: add missing tests to Makefile
r8169: re-enable MSI on RTL8168c
net: phy: mdio-bcm-unimac: Fix clock handling
cxgb4/ptp: pass the sign of offset delta in FW CMD
net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
net: cbs: Fix software cbs to consider packet sending time
net/mlx5e: Do not recover from a non-fatal syndrome
net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
net/mlx5e: Enhance ICOSQ WQE info fields
net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
selftests: netfilter: add nfqueue test case
netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
netfilter: nft_fwd_netdev: validate family and chain type
netfilter: nft_set_rbtree: Detect partial overlaps on insertion
netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
...
|
|
The method of unwinding for kernel space is defined by the kernel
config, not by the value of --call-graph. Improve the documentation to
reflect this.
Signed-off-by: Tony Jones <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The lib files should not be defined as TEST_PROGS, or we will run them
in run_kselftest.sh.
Also remove ethtool_lib.sh exec permission.
Fixes: 81573b18f26d ("selftests/net/forwarding: add Makefile to install tests")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Find some tests are missed in Makefile by running:
for file in $(ls *.sh); do grep -q $file Makefile || echo $file; done
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Rework kunit_tool in order to allow .kunitconfig files to better enforce
that disabled items in .kunitconfig are disabled in the generated
.config.
Previously, kunit_tool simply enforced that any line present in
.kunitconfig was also present in .config, but this could cause problems
if a config option was disabled in .kunitconfig, but not listed in .config
due to (for example) having disabled dependencies.
To fix this, re-work the parser to track config names and values, and
require values to match unless they are explicitly disabled with the
"CONFIG_x is not set" comment (or by setting its value to 'n'). Those
"disabled" values will pass validation if omitted from the .config, but
not if they have a different value.
Signed-off-by: David Gow <[email protected]>
Reviewed-by: Brendan Higgins <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
In preparation to adding a vmlinux.o specific pass, rearrange some
code. No functional changes intended.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Perf shows there is significant time in find_rela_by_dest(); this is
because we have to iterate the address space per byte, looking for
relocation entries.
Optimize this by reducing the address space granularity.
This reduces objtool on vmlinux.o runtime from 4.8 to 4.4 seconds.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Perf shows we spend a measurable amount of time spend cleaning up
right before we exit anyway. Avoid the needsless work and just
terminate.
This reduces objtool on vmlinux.o runtime from 5.4s to 4.8s
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Perf showed that __hash_init() is a significant portion of
read_sections(), so instead of doing a per section rela_hash, use an
elf-wide rela_hash.
Statistics show us there are about 1.1 million relas, so size it
accordingly.
This reduces the objtool on vmlinux.o runtime to a third, from 15 to 5
seconds.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Perf showed that find_symbol_by_name() takes time; add a symbol name
hash.
This shaves another second off of objtool on vmlinux.o runtime, down
to 15 seconds.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Perf shows we're spending a lot of time in find_insn() and the
statistics show we have around 3.2 million instruction. Increase the
hash table size to reduce the bucket load from around 50 to 3.
This shaves about 2s off of objtool on vmlinux.o runtime, down to 16s.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|