Age | Commit message (Collapse) | Author | Files | Lines |
|
A mis-match between reported and actual mitigation is not restricted to the
Vulnerable case. The guest might also report the mitigation as "Software
count cache flush" and the host will still mitigate with branch cache
disabled.
So, instead of skipping depending on the detected mitigation, simply skip
whenever the detected miss_percent is the expected one for a fully
mitigated system, that is, above 95%.
Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The following warning is triggered when I used clang compiler
to build the selftest.
/.../prog_tests/btf_dedup_split.c:368:6: warning: variable 'btf2' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
if (!ASSERT_OK(err, "btf_dedup"))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
/.../prog_tests/btf_dedup_split.c:424:12: note: uninitialized use occurs here
btf__free(btf2);
^~~~
/.../prog_tests/btf_dedup_split.c:368:2: note: remove the 'if' if its condition is always false
if (!ASSERT_OK(err, "btf_dedup"))
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/.../prog_tests/btf_dedup_split.c:343:25: note: initialize the variable 'btf2' to silence this warning
struct btf *btf1, *btf2;
^
= NULL
Initialize local variable btf2 = NULL and the warning is gone.
Fixes: 9a49afe6f5a5 ("selftests/bpf: Add btf_dedup case with duplicated structs within CU")
Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal.
2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel.
3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit
groups, from Stefano Brivio.
4) Break rule evaluation on malformed TCP options.
5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh,
also from Florian
6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
netfilter: conntrack: annotate data-races around ct->timeout
selftests: netfilter: switch zone stress to socat
netfilter: nft_exthdr: break evaluation if setting TCP option fails
selftests: netfilter: Add correctness test for mac,net set type
nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups
vrf: don't run conntrack on vrf with !dflt qdisc
netfilter: nfnetlink_queue: silence bogus compiler warning
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Daniel Borkmann says:
====================
bpf 2021-12-08
We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).
The main changes are:
1) Fix an off-by-two error in packet range markings and also add a batch of
new tests for coverage of these corner cases, from Maxim Mikityanskiy.
2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.
3) Fix two functional regressions and a build warning related to BTF kfunc
for modules, from Kumar Kartikeya Dwivedi.
4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
PREEMPT_RT kernels, from Sebastian Andrzej Siewior.
5) Add missing includes in order to be able to detangle cgroup vs bpf header
dependencies, from Jakub Kicinski.
6) Fix regression in BPF sockmap tests caused by missing detachment of progs
from sockets when they are removed from the map, from John Fastabend.
7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
dispatcher, from Björn Töpel.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add selftests to cover packet access corner cases
bpf: Fix the off-by-two error in range markings
treewide: Add missing includes masked by cgroup -> bpf dependency
tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
bpf: Fix bpf_check_mod_kfunc_call for built-in modules
bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
mips, bpf: Fix reference to non-existing Kconfig symbol
bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
Documentation/locking/locktypes: Update migrate_disable() bits.
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, x86: Fix "no previous prototype" warning
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
bpf_create_map is deprecated. Replace it with bpf_map_create. Also add a
__weak bpf_map_create() so that when older version of libbpf is linked as
a shared library, it falls back to bpf_create_map().
Fixes: 992c4225419a ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()")
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Teach objtool to validate the straight-line-speculation constraints:
- speculation trap after indirect calls
- speculation trap after RET
Notable: when an instruction is annotated RETPOLINE_SAFE, indicating
speculation isn't a problem, also don't care about sls for that
instruction.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This commit adds BPF verifier selftests that cover all corner cases by
packet boundary checks. Specifically, 8-byte packet reads are tested at
the beginning of data and at the beginning of data_meta, using all kinds
of boundary checks (all comparison operators: <, >, <=, >=; both
permutations of operands: data + length compared to end, end compared to
data + length). For each case there are three tests:
1. Length is just enough for an 8-byte read. Length is either 7 or 8,
depending on the comparison.
2. Length is increased by 1 - should still pass the verifier. These
cases are useful, because they failed before commit 2fa7d94afc1a
("bpf: Fix the off-by-two error in range markings").
3. Length is decreased by 1 - should be rejected by the verifier.
Some existing tests are just renamed to avoid duplication.
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Compiling the ACPI tools when output directory parameter is specified,
but the output directory is not present, triggers the following error:
make O=/data/test/tmp/ -C tools/power/acpi/
make: Entering directory '/data/src/kernel/linux/tools/power/acpi'
DESCEND tools/acpidbg
make[1]: Entering directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg'
MKDIR include
CP include
CC tools/acpidbg/acpidbg.o
Assembler messages:
Fatal error: can't create /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o: No such file or directory
make[1]: *** [../../Makefile.rules:24: /data/test/tmp/tools/power/acpi/tools/acpidbg/acpidbg.o] Error 1
make[1]: Leaving directory '/data/src/kernel/linux/tools/power/acpi/tools/acpidbg'
make: *** [Makefile:18: acpidbg] Error 2
make: Leaving directory '/data/src/kernel/linux/tools/power/acpi'
which occurs because the output directory has not been created yet.
Fix this issue by creating the output directory before compiling.
Reported-by: Andy Shevchenko <[email protected]>
Signed-off-by: Chen Yu <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
[ rjw: New subject, changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Add tests for TLSv1.2 and TLSv1.3 with AES256-GCM cipher
Signed-off-by: Vadim Fedorenko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add tests for TLSv1.2 and TLSv1.3 with AES-CCM cipher.
Signed-off-by: Vadim Fedorenko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Correct indentation to 4 spaces, same as the other JSON files.
Reviewed-by: John Garry <[email protected]>
Signed-off-by: Andrew Kilroy <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In previous patch, we have supported the syntax which enables
the event on a specified pmu, such as:
cpu_core/<event>/
cpu_atom/<event>/
While this syntax is not very easy for applying on a set of
events or applying on a group. In following example, we have to
explicitly assign the pmu prefix.
# ./perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' -- sleep 1
Performance counter stats for 'sleep 1':
1,158,545 cpu_core/cycles/
1,003,113 cpu_core/instructions/
1.002428712 seconds time elapsed
A much easier way is:
# ./perf stat --cputype core -e '{cycles,instructions}' -- sleep 1
Performance counter stats for 'sleep 1':
1,101,071 cpu_core/cycles/
939,892 cpu_core/instructions/
1.002363142 seconds time elapsed
For this example, the '--cputype' enables the events from specified
pmu (cpu_core).
If '--cputype' conflicts with pmu prefix, '--cputype' is ignored.
# ./perf stat --cputype core -e cycles,cpu_atom/instructions/ -a -- sleep 1
Performance counter stats for 'system wide':
21,003,407 cpu_core/cycles/
367,886 cpu_atom/instructions/
1.002203520 seconds time elapsed
Signed-off-by: Jin Yao <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's possible to link against libopencsd_c_api without having
libstdc++.so available, only libstdc++.so.6.0.28 (or whatever version is
in use) needs to be available. The same holds true for libopencsd.so.
When -lstdc++ (or -lopencsd) is explicitly passed to the linker however
the .so file must be available.
So wrap adding the dependencies into a check for static linking that
actually requires adding them all. The same construct is already used
for some other tests in the same file to reduce dependencies in the
dynamic linking case.
Fixes: 573cf5c9a152 ("perf build: Add missing -lstdc++ when linking with libopencsd")
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Uwe Kleine-König <[email protected]>
Cc: Adrian Bunk <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Branislav Rankov <[email protected]>
Cc: Diederik de Haas <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently topdown events must appear after a slots event:
$ perf stat -e '{slots,topdown-fe-bound}' /bin/true
Performance counter stats for '/bin/true':
3,183,090 slots
986,133 topdown-fe-bound
Reversing the events yields:
$ perf stat -e '{topdown-fe-bound,slots}' /bin/true
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-fe-bound).
For metrics the order of events is determined by iterating over a
hashmap, and so slots isn't guaranteed to be first which can yield this
error.
Change the set_leader in parse-events, called when a group is closed, so
that rather than always making the first event the leader, if the slots
event exists then it is made the leader. It is then moved to the head of
the evlist otherwise it won't be opened in the correct order.
The result is:
$ perf stat -e '{topdown-fe-bound,slots}' /bin/true
Performance counter stats for '/bin/true':
3,274,795 slots
1,001,702 topdown-fe-bound
A problem with this approach is the slots event is identified by name,
names can be overwritten like 'cpu/slots,name=foo/' and this causes the
leader change to fail.
The change also modifies and fixes mixed groups like, with the change:
$ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2
Performance counter stats for 'system wide':
5574985410 slots
971981616 instructions
1348461887 topdown-fe-bound
2.001263120 seconds time elapsed
Without the change:
$ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2
Performance counter stats for 'system wide':
<not counted> instructions
<not counted> slots
<not supported> topdown-fe-bound
2.006247990 seconds time elapsed
Something that may be undesirable here is that the events are reordered
in the output.
Reviewed-by: Kajol Jain <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vineet Singh <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The leader of a group is the first, but allow it to be an arbitrary list
member so that for Intel topdown events slots may always be the group
leader.
Reviewed-by: Kajol Jain <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vineet Singh <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It is common to use the same counters with and without duration_time.
The ID sharing code treats duration_time as if it were a hardware event
placed in the same group. This causes unnecessary multiplexing such as
in the following example where l3_cache_access isn't shared:
$ perf stat -M l3 -a sleep 1
Performance counter stats for 'system wide':
3,117,007 l3_cache_miss # 199.5 MB/s l3_rd_bw
# 43.6 % l3_hits
# 56.4 % l3_miss (50.00%)
5,526,447 l3_cache_access (50.00%)
5,392,435 l3_cache_access # 5389191.2 access/s l3_access_rate (50.00%)
1,000,601,901 ns duration_time
1.000601901 seconds time elapsed
Fix this by placing duration_time in all groups unless metric
sharing has been disabled on the command line:
$ perf stat -M l3 -a sleep 1
Performance counter stats for 'system wide':
3,597,972 l3_cache_miss # 230.3 MB/s l3_rd_bw
# 48.0 % l3_hits
# 52.0 % l3_miss
6,914,459 l3_cache_access # 6909935.9 access/s l3_access_rate
1,000,654,579 ns duration_time
1.000654579 seconds time elapsed
$ perf stat --metric-no-merge -M l3 -a sleep 1
Performance counter stats for 'system wide':
3,501,834 l3_cache_miss # 53.5 % l3_miss (24.99%)
6,548,173 l3_cache_access (24.99%)
3,417,622 l3_cache_miss # 45.7 % l3_hits (25.04%)
6,294,062 l3_cache_access (25.04%)
5,923,238 l3_cache_access # 5919688.1 access/s l3_access_rate (24.99%)
1,000,599,683 ns duration_time
3,607,486 l3_cache_miss # 230.9 MB/s l3_rd_bw (49.97%)
1.000599683 seconds time elapsed
v2. Doesn't count duration_time in the metric_list_cmp function that
sorts larger metrics first. Without this a metric with duration_time
and an event is sorted the same as a metric with two events,
possibly not allowing the first metric to share with the second.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf already support ignore_missing_thread for -u/-p, but not yet
applied to `perf trace`. This patch enables ignore_missing_thread
for `perf trace`.
Signed-off-by: Gang Li <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This updates the link to documentation on AMD processors. The new link
points to a page where users can find the Processor Programming
Reference (PPR) documents for the family and model codes corresponding
to processors they are using.
Signed-off-by: Sandipan Das <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Ananth Narayan <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Santosh Shukla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
AMD processors have events with event select codes and unit masks larger
than a byte. The core PMU, for example, uses 12-bit event select codes
split between bits 0-7 and 32-35 of the PERF_CTL MSRs as can be seen
from /sys/bus/event_sources/devices/cpu/format/*.
The Processor Programming Reference (PPR) lists the event codes as
unified 12-bit hexadecimal values instead and the split between the bits
is not apparent to someone who is not aware of the layout of the
PERF_CTL MSRs.
8-bit event select codes continue to work as the layout matches that of
the PERF_CTL MSRs i.e. bits 0-7 for event select and 8-15 for unit mask.
This adds more details in the perf man pages about using
/sys/bus/event_sources/devices/*/format/* for determining the correct
raw event encoding scheme.
E.g. the "op_cache_hit_miss.op_cache_hit" event with code 0x28f and
umask 0x03 can be programmed using its symbolic name as:
$ sudo perf --debug perf-event-open stat -e op_cache_hit_miss.op_cache_hit sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x20000038f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
One might use a simple eventsel+umask combination based on what the
current man pages say and incorrectly program the event as:
$ sudo perf --debug perf-event-open stat -e r0328f sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x328f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
When it should have been based on the format from sysfs:
$ cat /sys/bus/event_source/devices/cpu/format/event
config:0-7,32-35
$ sudo perf --debug perf-event-open stat -e r20000038f sleep 1
------------------------------------------------------------
perf_event_attr:
type 4
size 128
config 0x20000038f
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
------------------------------------------------------------
[...]
Reviewed-by: Kajol Jain <[email protected]>
Signed-off-by: Sandipan Das <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Ananth Narayan <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Santosh Shukla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Adds a test for a counter obtained using read() system call during
multiplexing.
$ sudo make tests -C ./tools/lib/perf/ V=1
make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf
make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests
make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests
running static:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...
Event 0 -- Raw count = 298049842, run = 270269503, enable = 456262127
Scaled count = 503160191 (59.24%, 270269503/456262127)
Event 1 -- Raw count = 299134173, run = 271075173, enable = 456257234
Scaled count = 503484435 (59.41%, 271075173/456257234)
Event 2 -- Raw count = 300461996, run = 272069283, enable = 456253417
Scaled count = 503867290 (59.63%, 272069283/456253417)
Event 3 -- Raw count = 301308704, run = 273063387, enable = 456249352
Scaled count = 503443183 (59.85%, 273063387/456249352)
Event 4 -- Raw count = 302531164, run = 274102932, enable = 456244712
Scaled count = 503563543 (60.08%, 274102932/456244712)
Event 5 -- Raw count = 303710254, run = 275406214, enable = 456228165
Scaled count = 503115633 (60.37%, 275406214/456228165)
Event 6 -- Raw count = 304531302, run = 276396076, enable = 456221130
Scaled count = 502661313 (60.58%, 276396076/456221130)
Event 7 -- Raw count = 304486460, run = 276601890, enable = 456213754
Scaled count = 502205212 (60.63%, 276601890/456213754)
Event 8 -- Raw count = 304116681, run = 276631326, enable = 456205562
Scaled count = 501532936 (60.64%, 276631326/456205562)
Event 9 -- Raw count = 303567766, run = 276188567, enable = 456196839
Scaled count = 501420666 (60.54%, 276188567/456196839)
Event 10 -- Raw count = 302238014, run = 275144001, enable = 456185300
Scaled count = 501106833 (60.31%, 275144001/456185300)
Event 11 -- Raw count = 300805716, run = 273824589, enable = 456175608
Scaled count = 501124573 (60.03%, 273824589/456175608)
Event 12 -- Raw count = 299959051, run = 272834556, enable = 456166593
Scaled count = 501517477 (59.81%, 272834556/456166593)
Event 13 -- Raw count = 299037090, run = 271820805, enable = 456157086
Scaled count = 501830195 (59.59%, 271820805/456157086)
Event 14 -- Raw count = 298327042, run = 270784311, enable = 456147546
Scaled count = 502544433 (59.36%, 270784311/456147546)
Expected: 501614268
High: 503867290 Low: 298049842 Average: 502438527
Average Error = 0.16%
OK
- running tests/test-evsel.c...
loop = 65536, count = 328182
loop = 131072, count = 660214
loop = 262144, count = 1315534
loop = 524288, count = 2635364
loop = 1048576, count = 5271971
loop = 65536, count = 491952
loop = 131072, count = 850061
loop = 262144, count = 1648608
loop = 524288, count = 3162059
loop = 1048576, count = 6353393
OK
running dynamic:
- running tests/test-cpumap.c...OK
- running tests/test-threadmap.c...OK
- running tests/test-evlist.c...
Event 0 -- Raw count = 300218292, run = 297528154, enable = 496789343
Scaled count = 501281125 (59.89%, 297528154/496789343)
Event 1 -- Raw count = 301438606, run = 298515328, enable = 496784768
Scaled count = 501649643 (60.09%, 298515328/496784768)
Event 2 -- Raw count = 302342618, run = 298798983, enable = 496782015
Scaled count = 502673648 (60.15%, 298798983/496782015)
Event 3 -- Raw count = 303132319, run = 299230407, enable = 496778508
Scaled count = 503256412 (60.23%, 299230407/496778508)
Event 4 -- Raw count = 302758195, run = 299218047, enable = 496774243
Scaled count = 502651743 (60.23%, 299218047/496774243)
Event 5 -- Raw count = 303158458, run = 299204274, enable = 496769146
Scaled count = 503334281 (60.23%, 299204274/496769146)
Event 6 -- Raw count = 303471397, run = 299197479, enable = 496763124
Scaled count = 503859189 (60.23%, 299197479/496763124)
Event 7 -- Raw count = 303583387, run = 299196861, enable = 496756458
Scaled count = 504039405 (60.23%, 299196861/496756458)
Event 8 -- Raw count = 303096897, run = 299186924, enable = 496748667
Scaled count = 503240507 (60.23%, 299186924/496748667)
Event 9 -- Raw count = 301424173, run = 297845086, enable = 496739994
Scaled count = 502709122 (59.96%, 297845086/496739994)
Event 10 -- Raw count = 300876415, run = 296851339, enable = 496729034
Scaled count = 503464297 (59.76%, 296851339/496729034)
Event 11 -- Raw count = 300239338, run = 296547963, enable = 496719538
Scaled count = 502902612 (59.70%, 296547963/496719538)
Event 12 -- Raw count = 299751948, run = 296547195, enable = 496710036
Scaled count = 502077926 (59.70%, 296547195/496710036)
Event 13 -- Raw count = 299341883, run = 296549981, enable = 496700423
Scaled count = 501376663 (59.70%, 296549981/496700423)
Event 14 -- Raw count = 299145476, run = 296561684, enable = 496690949
Scaled count = 501018366 (59.71%, 296561684/496690949)
Expected: 501669431
High: 504039405 Low: 300218292 Average: 502635662
Average Error = 0.19%
OK
- running tests/test-evsel.c...
loop = 65536, count = 329275
loop = 131072, count = 664638
loop = 262144, count = 1315367
loop = 524288, count = 2629617
loop = 1048576, count = 5273657
loop = 65536, count = 459641
loop = 131072, count = 978402
loop = 262144, count = 1581219
loop = 524288, count = 3774908
loop = 1048576, count = 7694417
OK
make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf'
Signed-off-by: Shunsuke Nakamura <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove the scaling process from perf_mmap__read_self(), and unify the
counters that can be obtained from perf_evsel__read() to "no scaling".
Signed-off-by: Shunsuke Nakamura <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move perf_counts_values__scale() from tools/perf/util to tools/lib/perf
so that it can be used with libperf.
Committer notes:
As noted by Jiri, use __s8 instead of s8 on the exported function.
Signed-off-by: Shunsuke Nakamura <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The tools build system uses KBUILD_HOSTCFLAGS symbol for obvious purposes.
However this is not set for anything under tools/
As such, host tools apps built have no compiler warnings enabled.
Declare HOSTCFLAGS for perf tools build, and also use that symbol in
declaration of host_c_flags. HOSTCFLAGS comes from EXTRA_WARNINGS, which
is independent of target platform/arch warning flags.
Suggested-by: Jiri Olsa <[email protected]>
Signed-off-by: John Garry <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Helps a bit the user figuring out why it is failing:
Before:
$ perf test sigtrap
73: Sigtrap : FAILED!
$ perf test -v sigtrap
73: Sigtrap :
--- start ---
test child forked, pid 3816772
FAILED sys_perf_event_open()
test child finished with -1
---- end ----
Sigtrap: FAILED!
$
After:
$ perf test sigtrap
73: Sigtrap : FAILED!
$ perf test -v sigtrap
73: Sigtrap :
--- start ---
test child forked, pid 3816772
FAILED sys_perf_event_open(): Permission denied
test child finished with -1
---- end ----
Sigtrap: FAILED!
$
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Fabian Hemmer <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add basic stress test for sigtrap handling as a perf tool built-in test.
This allows sanity checking the basic sigtrap functionality from within
the perf tool.
Committer notes:
Reported that !root was getting -EPERM, applied a fixup from Marco to
set .exclude_{hv,kernel} that made it work.
Signed-off-by: Marco Elver <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Fabian Hemmer <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
With the addition of multiple callback-flood kthreads, the maximum number
of callbacks from any one of those kthreads is reported in the rcutorture
run summary. This commit changes this to report the sum of each kthread's
maximum number of callbacks in a given callback-flooding episode.
Cc: Neeraj Upadhyay <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit modifies the TASKS01 scenario to use four callback queues
and the TRACE01 scenario to use two queues, thus providing testing of
multiple queues by default.
Cc: Neeraj Upadhyay <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
centos9 has nmap-ncat which doesn't like the '-q' option, use socat.
While at it, mark test skipped if needed tools are missing.
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
The existing net,mac test didn't cover the issue recently reported
by Nikita Yushchenko, where MAC addresses wouldn't match if given
as first field of a concatenated set with AVX2 and 8-bit groups,
because there's a different code path covering the lookup of six
8-bit groups (MAC addresses) if that's the first field.
Add a similar mac,net test, with MAC address and IPv4 address
swapped in the set specification.
Signed-off-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
After the below patch, the conntrack attached to skb is set to "notrack" in
the context of vrf device, for locally generated packets.
But this is true only when the default qdisc is set to the vrf device. When
changing the qdisc, notrack is not set anymore.
In fact, there is a shortcut in the vrf driver, when the default qdisc is
set, see commit dcdd43c41e60 ("net: vrf: performance improvements for
IPv4") for more details.
This patch ensures that the behavior is always the same, whatever the qdisc
is.
To demonstrate the difference, a new test is added in conntrack_vrf.sh.
Fixes: 8c9c296adfae ("vrf: run conntrack only in context of lower/physdev for locally generated packets")
Signed-off-by: Nicolas Dichtel <[email protected]>
Acked-by: Florian Westphal <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Check that getsockopt(IP_TOS) returns what setsockopt(IP_TOS) did set
right before.
Also check that socklen_t == 0 and -1 input values match those
of normal tcp sockets.
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
client & server use a unix socket connection to communicate
outside of the mptcp connection.
This allows the consumer to know in advance how many bytes have been
(or will be) sent by the peer.
This allows stricter checks on the bytecounts reported by TCP_INQ cmsg.
Suggested-by: Mat Martineau <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Do checks on the returned inq counter.
Fail on:
1. Huge value (> 1 kbyte, test case files are 1 kb)
2. last hint larger than returned bytes when read was short
3. erronenous indication of EOF.
3) happens when a hint of X bytes reads X-1 on next call
but next recvmsg returns more data (instead of EOF).
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Mat Martineau <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When building bpf_skel with clang-10, typedef causes confusions like:
libbpf: map 'prev_readings': unexpected def kind var.
Fix this by removing the typedef.
Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Arnaldo reported that building all his containers with BUILD_BPF_SKEL=1
to then make this the default he found problems in some distros where
the system linux/bpf.h file was being used and lacked this:
util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS'
__uint(map_flags, BPF_F_PRESERVE_ELEMS);
So use instead the vmlinux.h file generated by bpftool from BTF info.
This fixed these as well, getting the build back working on debian:11,
debian:experimental and ubuntu:21.10:
In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33:
:
In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111:
:
/usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error:
: : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found:
/usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
#include <asm/types.h>#include <asm/types.h>
^~~~~~~~~~~~~ ^~~~~~~~~~~~~
#include <asm/types.h>
^~~~~~~~~~~~~
1 error generated.
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Tested-by: Athira Rajeev <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
These leaks were found with leak sanitizer running "perf pipe recording
and injection test".
In pipe mode feat_fd may hold onto an events struct that needs freeing.
When string features are processed they may overwrite an already created
string, so free this before the overwrite.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Otherwise load counting is an average. Without this change
duration_time in test_memory_bandwidth will alter its value if an
earlier test contains duration_time.
This patch fixes an issue that's introduced in the proposed patch:
https://lore.kernel.org/lkml/[email protected]/
in perf test "Parse and process metrics".
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
topology info
Some platforms do not have CPU die support, for example s390.
Commit
Cc: Ian Rogers <[email protected]>
Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.")
fails on s390:
# perf test -Fv 7
...
# FAILED tests/expr.c:173 #num_dies >= #num_packages
---- end ----
Simple expression parser: FAILED!
#
Investigating this issue leads to these functions:
build_cpu_topology()
+--> has_die_topology(void)
{
struct utsname uts;
if (uname(&uts) < 0)
return false;
if (strncmp(uts.machine, "x86_64", 6))
return false;
....
}
which always returns false on s390. The caller build_cpu_topology()
checks has_die_topology() return value. On false the the struct
cpu_topology::die_cpu_list is not contructed and has zero entries. This
leads to the failing comparison: #num_dies >= #num_packages. s390 of
course has a positive number of packages.
Fix this and check if the function build_cpu_topology() did build up
a die_cpus_list. The number of entries in this list should be larger
than 0. If the number of list element is zero, the die_cpus_list has
not been created and the check in function test__expr():
TEST_ASSERT_VAL("#num_dies >= #num_packages", \
num_dies >= num_packages)
always fails.
Output after:
# perf test -Fv 7
7: Simple expression parser :
--- start ---
division by zero
syntax error
---- end ----
Simple expression parser: Ok
#
Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.")
Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Sumanth Korikkar <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
[ Added comment in the added 'if (num_dies)' line about architectures not having die topology ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
test-all fast path
Since 66dfdff03d196e51 ("perf tools: Add Python 3 support") we don't use
the tools/build/feature/test-libpython-version.c version in any Makefile
feature check:
$ find tools/ -type f | xargs grep feature-libpython-version
$
The only place where this was used was removed in 66dfdff03d196e51:
- ifneq ($(feature-libpython-version), 1)
- $(warning Python 3 is not yet supported; please set)
- $(warning PYTHON and/or PYTHON_CONFIG appropriately.)
- $(warning If you also have Python 2 installed, then)
- $(warning try something like:)
- $(warning $(and ,))
- $(warning $(and ,) make PYTHON=python2)
- $(warning $(and ,))
- $(warning Otherwise, disable Python support entirely:)
- $(warning $(and ,))
- $(warning $(and ,) make NO_LIBPYTHON=1)
- $(warning $(and ,))
- $(error $(and ,))
- else
- LDFLAGS += $(PYTHON_EMBED_LDFLAGS)
- EXTLIBS += $(PYTHON_EMBED_LIBADD)
- LANG_BINDINGS += $(obj-perf)python/perf.so
- $(call detected,CONFIG_LIBPYTHON)
- endif
And nowadays we either build with PYTHON=python3 or just install the
python3 devel packages and perf will build against it.
But the leftover feature-libpython-version check made the fast path
feature detection to break in all cases except when python2 devel files
were installed:
$ rpm -qa | grep python.*devel
python3-devel-3.9.7-1.fc34.x86_64
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
$ make -C tools/perf O=/tmp/build/perf install-bin
make: Entering directory '/var/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j32' parallel build
HOSTCC /tmp/build/perf/fixdep.o
<SNIP>
$ cat /tmp/build/perf/feature/test-all.make.output
In file included from test-all.c:18:
test-libpython-version.c:5:10: error: #error
5 | #error
| ^~~~~
$ ldd ~/bin/perf | grep python
libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007fda6dbcf000)
$
As python3 is the norm these days, fix this by just removing the unused
feature-libpython-version feature check, making the test-all fast path
to work with the common case.
With this:
$ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
$ make -C tools/perf O=/tmp/build/perf install-bin |& head
make: Entering directory '/var/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j32' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
$ ldd ~/bin/perf | grep python
libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007f58800b0000)
$ cat /tmp/build/perf/feature/test-all.make.output
$
Reviewed-by: James Clark <[email protected]>
Fixes: 66dfdff03d196e51 ("perf tools: Add Python 3 support")
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jaroslav Škarvada <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
sysfs__read_int() returns 0 on success, and so the fast read path was
always failing.
Fixes: bb629484d924118e ("perf tools: Simplify checking if SMT is active.")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
futex_waitv syscall
To pick the changes in this cset:
a0eb2da92b715d0c ("futex: Wireup futex_waitv syscall")
That add support for this new syscall in tools such as 'perf trace'.
For instance, this is now possible (adapted from the x86_64 test output):
# perf trace -e futex_waitv
^C#
# perf trace -v -e futex_waitv
event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
^C#
# perf trace -v -e futex* --max-events 10
event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 221 || id == 449)
mmap size 528384B
? ( ): Timer/219310 ... [continued]: futex()) = -1 ETIMEDOUT (Connection timed out)
0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0
0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0
0.088 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1) = 1
0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1) = 1
0.088 ( 0.089 ms): Timer/219310 ... [continued]: futex()) = 0
0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0
0.181 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep futex tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
221 32 futex sys_futex_time32
221 64 futex sys_futex
221 spu futex sys_futex
422 32 futex_time64 sys_futex sys_futex
449 common futex_waitv sys_futex_waitv
$
This addresses this perf build warnings:
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Reviewed-by: Athira Rajeev <[email protected]>
Tested-by: Athira Rajeev <[email protected]>
Cc: Adrian Hunter <[email protected]>,
Cc: André Almeida <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/YZ%[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The space allowed for new attributes can be too small if existing header
information is large. That can happen, for example, if there are very
many CPUs, due to having an event ID per CPU per event being stored in the
header information.
Fix by adding the existing header.data_offset. Also increase the extra
space allowed to 8KiB and align to a 4KiB boundary for neatness.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
syscall
To pick the changes in these csets:
6c122360cf2f4c5a ("s390: wire up sys_futex_waitv system call")
That add support for this new syscall in tools such as 'perf trace'.
For instance, this is now possible (adapted from the x86_64 test output):
# perf trace -e futex_waitv
^C#
# perf trace -v -e futex_waitv
event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
^C#
# perf trace -v -e futex* --max-events 10
event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 238 || id == 449)
? ( ): Timer/219310 ... [continued]: futex()) = -1 ETIMEDOUT (Connection timed out)
0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0
0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0
0.088 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1) = 1
0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1) = 1
0.088 ( 0.089 ms): Timer/219310 ... [continued]: futex()) = 0
0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1) = 0
0.181 ( ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep futex tools/perf/arch/s390/entry/syscalls/syscall.tbl
238 common futex sys_futex sys_futex_time32
422 32 futex_time64 - sys_futex
449 common futex_waitv sys_futex_waitv sys_futex_waitv
$
This addresses this perf build warnings:
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Acked-by: Heiko Carstens <[email protected]>
Cc: Adrian Hunter <[email protected]>,
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: https://lore.kernel.org/lkml/YZ%2F2qRW%[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This: This reverts commit 92723ea0f11d92496687db8c9725248e9d1e5e1d.
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRRRRRR FAILED!
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRRRRRR FAILED!
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRRRRR FAILED!
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRRRRRRRRRRR Ok
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRR FAILED!
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRRRR Ok
# perf test 91
91: perf stat --bpf-counters test :RRRRRRRRRRRRRRR Ok
yep, it seems the perf bench is broken so the counts won't correlated if
I revert this one:
92723ea0f11d perf bench: Fix two memory leaks detected with ASan
it works for me again.. it seems to break -t option
[root@dell-r440-01 perf]# ./perf bench sched messaging -g 1 -l 100 -t
# Running 'sched/messaging' benchmark:
RRRperf: CLIENT: ready write: Bad file descriptor
Rperf: SENDER: write: Bad file descriptor
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Sohaib Mohamed <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lore.kernel.org/lkml/YZev7KClb%2Fud43Lc@krava/
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This adds comments above functions in libbpf.h which document
their uses. These comments are of a format that doxygen and sphinx
can pick up and render. These are rendered by libbpf.readthedocs.org
These doc comments are for:
- bpf_object__open_file()
- bpf_object__open_mem()
- bpf_program__attach_uprobe()
- bpf_program__attach_uprobe_opts()
Signed-off-by: Grant Seltzer <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Fix typo in comment from 'bpf_skeleton_map' to 'bpf_map_skeleton'
and from 'bpf_skeleton_prog' to 'bpf_prog_skeleton'.
Signed-off-by: huangxuesen <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Add new '__rel_loc' dynamic data location attribute support.
This type attribute is similar to the '__data_loc' but records the
offset from the field itself.
The libtraceevent adds TEP_FIELD_IS_RELATIVE to the
'tep_format_field::flags' with TEP_FIELD_IS_DYNAMIC for'__rel_loc'.
Link: https://lkml.kernel.org/r/163757344810.510314.12449413842136229871.stgit@devnote2
Cc: Beau Belgrave <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Tom Zanussi <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
Add '__rel_loc' new dynamic data location attribute which encodes
the data location from the next to the field itself. This is similar
to the '__data_loc' but the location offset is not from the event
entry but from the next of the field.
This patch adds '__rel_loc' decoding support in the libtraceevent.
Link: https://lkml.kernel.org/r/163757343994.510314.13241077597729303802.stgit@devnote2
Cc: Beau Belgrave <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Tom Zanussi <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
Make -d flag functional for gen_loader style program loading.
For example:
$ bpftool prog load -L -d test_d_path.o
... // will print:
libbpf: loading ./test_d_path.o
libbpf: elf: section(3) fentry/security_inode_getattr, size 280, link 0, flags 6, type=1
...
libbpf: prog 'prog_close': found data map 0 (test_d_p.bss, sec 7, off 0) for insn 30
libbpf: gen: load_btf: size 5376
libbpf: gen: map_create: test_d_p.bss idx 0 type 2 value_type_id 118
libbpf: map 'test_d_p.bss': created successfully, fd=0
libbpf: gen: map_update_elem: idx 0
libbpf: sec 'fentry/filp_close': found 1 CO-RE relocations
libbpf: record_relo_core: prog 1 insn[15] struct file 0:1 final insn_idx 15
libbpf: gen: prog_load: type 26 insns_cnt 35 progi_idx 0
libbpf: gen: find_attach_tgt security_inode_getattr 12
libbpf: gen: prog_load: type 26 insns_cnt 37 progi_idx 1
libbpf: gen: find_attach_tgt filp_close 12
libbpf: gen: finish 0
... // at this point libbpf finished generating loader program
0: (bf) r6 = r1
1: (bf) r1 = r10
2: (07) r1 += -136
3: (b7) r2 = 136
4: (b7) r3 = 0
5: (85) call bpf_probe_read_kernel#113
6: (05) goto pc+104
... // this is the assembly dump of the loader program
390: (63) *(u32 *)(r6 +44) = r0
391: (18) r1 = map[idx:0]+5584
393: (61) r0 = *(u32 *)(r1 +0)
394: (63) *(u32 *)(r6 +24) = r0
395: (b7) r0 = 0
396: (95) exit
err 0 // the loader program was loaded and executed successfully
(null)
func#0 @0
... // CO-RE in the kernel logs:
CO-RE relocating STRUCT file: found target candidate [500]
prog '': relo #0: kind <byte_off> (0), spec is [8] STRUCT file.f_path (0:1 @ offset 16)
prog '': relo #0: matching candidate #0 [500] STRUCT file.f_path (0:1 @ offset 16)
prog '': relo #0: patched insn #15 (ALU/ALU64) imm 16 -> 16
vmlinux_cand_cache:[11]file(500),
module_cand_cache:
... // verifier logs when it was checking test_d_path.o program:
R1 type=ctx expected=fp
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
; int BPF_PROG(prog_close, struct file *file, void *id)
0: (79) r6 = *(u64 *)(r1 +0)
func 'filp_close' arg0 has btf_id 500 type STRUCT 'file'
1: R1=ctx(id=0,off=0,imm=0) R6_w=ptr_file(id=0,off=0,imm=0) R10=fp0
; pid_t pid = bpf_get_current_pid_tgid() >> 32;
1: (85) call bpf_get_current_pid_tgid#14
... // if there are multiple programs being loaded by the loader program
... // only the last program in the elf file will be printed, since
... // the same verifier log_buf is used for all PROG_LOAD commands.
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Fix a couple of SWAPGS fencing issues in the x86 entry code
- Use the proper operand types in __{get,put}_user() to prevent
truncation in SEV-ES string io
- Make sure the kernel mappings are present in trampoline_pgd in order
to prevent any potential accesses to unmapped memory after switching
to it
- Fix a trivial list corruption in objtool's pv_ops validation
- Disable the clocksource watchdog for TSC on platforms which claim
that the TSC is constant, doesn't stop in sleep states, CPU has TSC
adjust and the number of sockets of the platform are max 2, to
prevent erroneous markings of the TSC as unstable.
- Make sure TSC adjust is always checked not only when going idle
- Prevent a stack leak by initializing struct _fpx_sw_bytes properly in
the FPU code
- Fix INTEL_FAM6_RAPTORLAKE define naming to adhere to the convention
* tag 'x86_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/xen: Add xenpv_restore_regs_and_return_to_usermode()
x86/entry: Use the correct fence macro after swapgs in kernel CR3
x86/entry: Add a fence for kernel entry SWAPGS in paranoid_entry()
x86/sev: Fix SEV-ES INS/OUTS instructions for word, dword, and qword
x86/64/mm: Map all kernel memory into trampoline_pgd
objtool: Fix pv_ops noinstr validation
x86/tsc: Disable clocksource watchdog for TSC on qualified platorms
x86/tsc: Add a timer to make sure TSC_adjust is always checked
x86/fpu/signal: Initialize sw_bytes in save_xstate_epilog()
x86/cpu: Drop spurious underscore from RAPTOR_LAKE #define
|