Age | Commit message (Collapse) | Author | Files | Lines |
|
If we're running on a CPU without VMX/VSX then don't touch them. This
is fragile, the compiler could spill a VMX/VSX register and break the
test anyway. But in practice it seems to work, ie. the test runs to
completion on a system without VSX with this change.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This is a test of specific piece of logic in isa207-common.c, which is
only used on Power8 or later. So skip it on older CPUs.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Both these tests use PMU events that only work on newer CPUs, so skip
them on older CPUs.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The DSCR tests fail on systems that don't have DSCR, so check for the
DSCR in hwcap and skip if it's not present.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
utils.h provides have_hwcap() and have_hwcap2() which check for a
feature bit. Those bits are defined in asm/cputable.h, so include it
in utils.h so users of utils.h don't have to do it manually.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This version of set_dscr() was added for the RFI flush test, and is
fairly specific to it. It also clashes with the version of set_dscr()
in dscr/dscr.h. So move it into the RFI flush test where it's used.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
On older systems this test takes longer to run (duh), give it five
minutes which is long enough on a G5 970FX @ 1.6GHz.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
These platforms don't show the MMU in /proc/cpuinfo, but they always
use hash, so teach using_hash_mmu() that.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This test creates some threads, which write to TM SPRs, and then makes
sure the registers maintain the correct values across context switches
and contention with other threads.
But currently the test finishes almost instantaneously, which reduces
the chance of it hitting an interesting condition.
So increase the number of loops, so it runs a bit longer, though still
less than 2s on a Power8.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This test tries to set affinity to CPUs that don't exist, especially
if the set of online CPUs doesn't start at 0.
But there's no real reason for it to use setaffinity in the first
place, it's just trying to create lots of threads to cause contention.
So drop the setaffinity entirely.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Several of the TM tests fail spuriously if CPU 0 is offline, because
they blindly try to affinitise to CPU 0.
Fix them by picking any online CPU and using that instead.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Replace old parameters with global NFT_NAT from commit db8ab38880e0
("netfilter: nf_tables: merge ipv4 and ipv6 nat chain types")
Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
The inat-tables.c file has some arrays in it that contain pointers to
other arrays. These pointers need to be relocated when the kernel
image is moved to a different location.
The pre-decompression boot-code has no support for applying ELF
relocations, so initialize these arrays at runtime in the
pre-decompression code to make sure all pointers are correctly
initialized.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Synchronise the bpf.h header under tools, to report the fixes recently
brought to the documentation for the BPF helpers.
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Fix a formatting error in the documentation for bpftool-link, so that
the man page can build correctly.
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
We got slightly different patches removing a double word
in a comment in net/ipv4/raw.c - picked the version from net.
Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
values instead of VNIC login response buffer (following what
commit 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login
response buffer") did).
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This is bitrotting, nobody is stepping up to work on it, and since we
treat warnings as errors, feature detection is failing in its main,
faster test (tools/build/feature/test-all.c) because of the GTK+2
infobar check.
So make this opt-in, at some point ditch this if nobody volunteers to
take care of this.
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This enables zen3 users by reusing mostly-compatible zen2 events
until the official public list of zen3 events is published in a
future PPR.
Signed-off-by: Kim Phillips <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add support for events listed in Section 2.1.15.2 "Performance
Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803
Rev 0.54 - Sep 12, 2019".
perf now supports these new events (-e):
all_dc_accesses
all_tlbs_flushed
l1_dtlb_misses
l2_cache_accesses_from_dc_misses
l2_cache_accesses_from_ic_misses
l2_cache_hits_from_dc_misses
l2_cache_hits_from_ic_misses
l2_cache_misses_from_dc_misses
l2_cache_misses_from_ic_miss
l2_dtlb_misses
l2_itlb_misses
sse_avx_stalls
uops_dispatched
uops_retired
l3_accesses
l3_misses
and these metrics (-M):
branch_misprediction_ratio
all_l2_cache_accesses
all_l2_cache_hits
all_l2_cache_misses
ic_fetch_miss_ratio
l2_cache_accesses_from_l2_hwpf
l2_cache_hits_from_l2_hwpf
l2_cache_misses_from_l2_hwpf
l3_read_miss_latency
l1_itlb_misses
all_remote_links_outbound
nps1_die_to_dram
The nps1_die_to_dram event may need perf stat's --metric-no-group
switch if the number of available data fabric counters is less
than the number it uses (8).
Committer testing:
On a AMD Ryzen 3900x system:
Before:
# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$"
#
After:
# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$"
all_dc_accesses
[All L1 Data Cache Accesses]
all_tlbs_flushed
[All TLBs Flushed]
l1_dtlb_misses
[L1 DTLB Misses]
l2_cache_accesses_from_dc_misses
[L2 Cache Accesses from L1 Data Cache Misses (including prefetch)]
l2_cache_accesses_from_ic_misses
[L2 Cache Accesses from L1 Instruction Cache Misses (including
prefetch)]
l2_cache_hits_from_dc_misses
[L2 Cache Hits from L1 Data Cache Misses]
l2_cache_hits_from_ic_misses
[L2 Cache Hits from L1 Instruction Cache Misses]
l2_cache_misses_from_dc_misses
[L2 Cache Misses from L1 Data Cache Misses]
l2_cache_misses_from_ic_miss
[L2 Cache Misses from L1 Instruction Cache Misses]
l2_dtlb_misses
[L2 DTLB Misses & Data page walks]
l2_itlb_misses
[L2 ITLB Misses & Instruction page walks]
sse_avx_stalls
[Mixed SSE/AVX Stalls]
uops_dispatched
[Micro-ops Dispatched]
uops_retired
[Micro-ops Retired]
l3_accesses
[L3 Accesses. Unit: amd_l3]
l3_misses
[L3 Misses (includes Chg2X). Unit: amd_l3]
#
# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2
Performance counter stats for 'system wide':
433,439,949 all_dc_accesses (35.66%)
443 all_tlbs_flushed (35.66%)
2,985,885 l1_dtlb_misses (35.66%)
18,318,019 l2_cache_accesses_from_dc_misses (35.68%)
50,114,810 l2_cache_accesses_from_ic_misses (35.72%)
12,423,978 l2_cache_hits_from_dc_misses (35.74%)
40,703,103 l2_cache_hits_from_ic_misses (35.74%)
6,698,673 l2_cache_misses_from_dc_misses (35.74%)
12,090,892 l2_cache_misses_from_ic_miss (35.74%)
614,267 l2_dtlb_misses (35.74%)
216,036 l2_itlb_misses (35.74%)
11,977 sse_avx_stalls (35.74%)
999,276,223 uops_dispatched (35.73%)
1,075,311,620 uops_retired (35.69%)
1,420,763 l3_accesses
540,164 l3_misses
2.002344121 seconds time elapsed
# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2
Performance counter stats for 'system wide':
175,943,104 all_dc_accesses
310 all_tlbs_flushed
2,280,359 l1_dtlb_misses
11,700,151 l2_cache_accesses_from_dc_misses
25,414,963 l2_cache_accesses_from_ic_misses
2.001957818 seconds time elapsed
#
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The ITLB Instruction Fetch Hits event isn't documented even in later
zen1 PPRs, but it seems to count correctly on zen1 hardware.
Add it to zen1 group so zen1 users can use the upcoming IC Fetch Miss
Ratio Metric.
The IF1G, 1IF2M, IF4K (Instruction fetches to a 1 GB, 2 MB, and 4K page)
unit masks are not added because unlike zen2 hardware, zen1 hardware
counts all its unit masks with a 0 unit mask according to the old
convention:
zen1$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1
Performance counter stats for 'sleep 1':
211,318 cpu/event=0x94/u
211,318 cpu/event=0x94,umask=0xff/u
Rome/zen2:
zen2$ perf stat -e cpu/event=0x94/,cpu/event=0x94,umask=0xff/ sleep 1
Performance counter stats for 'sleep 1':
0 cpu/event=0x94/u
190,744 cpu/event=0x94,umask=0xff/u
Signed-off-by: Kim Phillips <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]> # on Zen2 only (3900x)
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Later revisions of PPRs that post-date the original Family 17h events
submission patch add these events.
Specifically, they were not in this 2017 revision of the F17h PPR:
Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017
But e.g., are included in this 2019 version of the PPR:
Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019
Fixes: 98c07a8f74f8 ("perf vendor events amd: perf PMU events for AMD Family 17h")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: Stephane Eranian <[email protected]>
Cc: Vijay Thakkar <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Yunfeng Ye <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Same as 'perf probe -F', this patch adds filter support for the ftrace
subcommand option '-F, --funcs <[FILTER]>'.
Here is an example that only lists functions which start with 'vfs_':
$ sudo perf ftrace -F vfs_*
vfs_fadvise
vfs_fallocate
vfs_truncate
vfs_open
vfs_setpos
vfs_llseek
vfs_readf
vfs_writef
...
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Changbin Du <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Consolidate control option fifo closing into one function.
Signed-off-by: Adrian Hunter <[email protected]>
Suggested-by: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The documentation describes snapshot mode. Update it to include the new
snapshot control command.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Alexey Budankov <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When we use the 'intel' disassembler style we get 'ret' instead of
'retq', so add that as an alias.
# perf annotate --disassembler-style=intel --stdio2 acpi_processor_ffh_cstate_enter > before
Apply this patch and then:
# perf annotate --disassembler-style=intel --stdio2 acpi_processor_ffh_cstate_enter > after
# diff -u before after
--- before 2020-09-04 14:10:47.768414634 -0300
+++ after 2020-09-04 14:10:59.116681039 -0300
@@ -33,7 +33,7 @@
test al,0x8
↓ je 97
and DWORD PTR gs:[rip+0x7e548509],0x7fffffff
- 97: ret
+ 97: ← ret
mov rax,QWORD PTR gs:0x17bc0
lock or BYTE PTR [rax+0x2],0x20
mov rax,QWORD PTR [rax]
#
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Matt P. Dziubinski <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
# perf annotate --stdio2 acpi_processor_ffh_cstate_enter > default
# perf config annotate.disassembler_style=intel
# perf config annotate.disassembler_style
annotate.disassembler_style=intel
# perf annotate --stdio2 acpi_processor_ffh_cstate_enter > intel
# diff -u default intel
--- default 2020-09-04 13:09:26.019205732 -0300
+++ intel 2020-09-04 13:09:52.823795081 -0300
@@ -1,42 +1,42 @@
Samples: 1K of event 'cycles', 4000 Hz, Event count (approx.): 990065316, [percent: local period]
acpi_processor_ffh_cstate_enter() /lib/modules/5.9.0-rc3/build/vmlinux
-Percent → callq __fentry__
- mov cpu_number,%edx
- mov %edx,%edx
- mov cpu_cstate_entry,%rax
- add -0x7dbe9700(,%rdx,8),%rax
- movzbl 0x9(%rdi),%edx
- mov 0x4(%rax,%rdx,8),%edi
- mov (%rax,%rdx,8),%esi
- → jmpq 137ccc6
- 2d: → jmpq 137ccd8
+Percent → call __fentry__
+ mov edx,DWORD PTR gs:[rip+0x7e541d74]
+ mov edx,edx
+ mov rax,QWORD PTR [rip+0x152b8fb]
+ add rax,QWORD PTR [rdx*8-0x7dbe9700]
+ movzx edx,BYTE PTR [rdi+0x9]
+ mov edi,DWORD PTR [rax+rdx*8+0x4]
+ mov esi,DWORD PTR [rax+rdx*8]
+ → jmp 137ccc6
+ 2d: → jmp 137ccd8
mfence
- mov %gs:0x17bc0,%rax
- clflush (%rax)
+ mov rax,QWORD PTR gs:0x17bc0
+ clflush BYTE PTR [rax]
mfence
- xor %edx,%edx
- mov %rdx,%rcx
- mov %gs:0x17bc0,%rax
- 0.00 monitor %rax,%ecx,%edx
- mov (%rax),%rax
- test $0x8,%al
+ xor edx,edx
+ mov rcx,rdx
+ mov rax,QWORD PTR gs:0x17bc0
+ 0.00 monitor
+ mov rax,QWORD PTR [rax]
+ test al,0x8
↓ jne 71
- ↓ jmpq 68
- verw 0x538b08(%rip) # ffffffff82008150 <ds.0>
- 68: mov %rsi,%rax
- mov %rdi,%rcx
-100.00 mwait %eax,%ecx
- 71: mov %gs:0x17bc0,%rax
- lock andb $0xdf,0x2(%rax)
- lock addl $0x0,-0x4(%rsp)
- mov (%rax),%rax
- test $0x8,%al
+ ↓ jmp 68
+ verw WORD PTR [rip+0x538b08] # ffffffff82008150 <ds.0>
+ 68: mov rax,rsi
+ mov rcx,rdi
+100.00 mwait
+ 71: mov rax,QWORD PTR gs:0x17bc0
+ lock and BYTE PTR [rax+0x2],0xdf
+ lock add DWORD PTR [rsp-0x4],0x0
+ mov rax,QWORD PTR [rax]
+ test al,0x8
↓ je 97
- andl $0x7fffffff,__preempt_count
- 97: ← retq
- mov %gs:0x17bc0,%rax
- lock orb $0x20,0x2(%rax)
- mov (%rax),%rax
- test $0x8,%al
+ and DWORD PTR gs:[rip+0x7e548509],0x7fffffff
+ 97: ret
+ mov rax,QWORD PTR gs:0x17bc0
+ lock or BYTE PTR [rax+0x2],0x20
+ mov rax,QWORD PTR [rax]
+ test al,0x8
↑ jne 71
- ↑ jmpq 2d
+ ↑ jmp 2d
#
Requested-by: Matt P. Dziubinski <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This commit adds a key entry enumerating the various types of relaxed
operations. While in the area, it also renames the relaxed rows.
[ paulmck: Apply Boqun Feng feedback. ]
Acked-by: Boqun Feng <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
Add 'snapshot' control command to create an AUX area tracing snapshot
the same as if sending SIGUSR2. The advantage of the FIFO is that access
is governed by access to the FIFO.
Example:
$ mkfifo perf.control
$ mkfifo perf.ack
$ cat perf.ack &
[1] 15235
$ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 &
[2] 15243
$ ps -e | grep perf
15244 pts/1 00:00:00 perf
$ kill -USR2 15244
bash: kill: (15244) - Operation not permitted
$ echo snapshot > perf.control
ack
$
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Alexey Budankov <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Enable the --control option to accept file names as an alternative to
file descriptors.
Example:
$ mkfifo perf.control
$ mkfifo perf.ack
$ cat perf.ack &
[1] 6808
$ perf record --control fifo:perf.control,perf.ack -- sleep 300 &
[2] 6810
$ echo disable > perf.control
$ Events disabled
ack
$ echo enable > perf.control
$ Events enabled
ack
$ echo disable > perf.control
$ Events disabled
ack
$ kill %2
[ perf record: Woken up 4 times to write data ]
$ [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ]
[1]- Done cat perf.ack
[2]+ Terminated perf record --control fifo:perf.control,perf.ack -- sleep 300
$
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Alexey Budankov <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The --control option does not display well in man pages unless AsciiDoc
formatting is used.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Alexey Budankov <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Handle read errors from ctl_fd such as EINTR, EAGAIN and EWOULDBLOCK.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Alexey Budankov <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Consolidate --control option parsing into one function, in preparation
for adding FIFO file name options.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Alexey Budankov <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This adds a precompiled file in PE binary format, with split debug file,
and tries to read its build_id and .gnu_debuglink sections, as well as
looking up the main symbol from the debug file. This should succeed if
libbfd is supported.
Committer testing:
$ perf test "PE file support"
68: PE file support : Ok
$
Signed-off-by: Remi Bernon <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jacek Caban <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Wine generates PE binaries for its code modules and also generates debug
files in PE or PDB formats, which perf cannot parse either.
Trying to read symbols on non-ELF binaries with libbfd, when supported,
makes it possible for perf to report symbols and annotations for Windows
applications running under Wine.
Because libbfd doesn't provide symbol size (probably because of some
backends not supporting it), we compute it by first sorting the symbols
by addresses and then considering that they are sequential in a given
section.
v3: Also include local and weak bfd symbols and mark them as such, only
global symbols were previously reported, and that caused a very
imprecise address to symbol resolution.
Signed-off-by: Remi Bernon <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jacek Caban <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Wine generates PE binaries for most of its modules and perf is unable to
parse these files to get build_id or .gnu_debuglink section.
Using libbfd when available, instead of libelf, makes it possible to
resolve debug file location regardless of the dso binary format.
Committer notes:
Made the filename__read_build_id() variant that uses abfd->build_id
depend on the feature test that defines HAVE_LIBBFD_BUILDID_SUPPORT, to
get this to continue building with older libbfd/binutils.
Signed-off-by: Remi Bernon <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jacek Caban <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Which is needed by the PE executable support, for instance.
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jacek Caban <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Remi Bernon <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Detected by LGTM static analyze in Github repo, fix potential multiplication
overflow before result is casted to size_t.
Fixes: 8505e8709b5e ("libbpf: Implement generalized .BTF.ext func/line info adjustment")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Another issue of __u64 needing either %lu or %llu, depending on the
architecture. Fix with cast to `unsigned long long`.
Fixes: 7e06aad52929 ("libbpf: Add multi-prog section support for struct_ops")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Verify that the PIDFD_NONBLOCK flag works with pidfd_open() and that
waitid() with a non-blocking pidfd returns EAGAIN:
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN global.wait_simple ...
# OK global.wait_simple
ok 1 global.wait_simple
# RUN global.wait_states ...
# OK global.wait_states
ok 2 global.wait_states
# RUN global.wait_nonblock ...
# OK global.wait_nonblock
ok 3 global.wait_nonblock
# PASSED: 3 / 3 tests passed.
# Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0
Signed-off-by: Christian Brauner <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
For arm64 MTE support it is necessary to be able to mark pages that
contain user space visible tags that will need to be saved/restored e.g.
when swapped out.
To support this add a new arch specific flag (PG_arch_2). This flag is
only available on 64-bit architectures due to the limited number of
spare page flags on the 32-bit ones.
Signed-off-by: Steven Price <[email protected]>
[[email protected]: use CONFIG_64BIT for guarding this new flag]
Signed-off-by: Catalin Marinas <[email protected]>
Cc: Andrew Morton <[email protected]>
|
|
All of the new pidfd selftests already use the new kselftest harness
infrastructure. It makes for clearer output, makes the code easier to
understand, and makes adding new tests way simpler.
Signed-off-by: Christian Brauner <[email protected]>
Reviewed-by: Josh Triplett <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tools fixes from Arnaldo Carvalho de Melo:
- Use uintptr_t when casting numbers to pointers
- Keep output expected by 3rd parties: Turn off summary for interval
mode by default.
- BPF is in kernel space, make sure do_validate_kcore_modules() knows
about that.
- Explicitly call out event modifiers in the documentation.
- Fix jevents() allocation of space for regular expressions.
- Address libtraceevent build warnings on 32-bit arches.
- Fix checking of functions returns using ERR_PTR() in 'perf bench'.
* tag 'perf-tools-fixes-for-v5.9-2020-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Add bpf image check to __map__is_kmodule
perf record/stat: Explicitly call out event modifiers in the documentation
perf bench: The do_run_multi_threaded() function must use IS_ERR(perf_session__new())
perf stat: Turn off summary for interval mode by default
libtraceevent: Fix build warning on 32-bit arches
perf jevents: Fix suspicious code in fixregex()
perf parse-events: Use uintptr_t when casting numbers to pointers
|
|
Pull networking fixes from David Miller:
1) Use netif_rx_ni() when necessary in batman-adv stack, from Jussi
Kivilinna.
2) Fix loss of RTT samples in rxrpc, from David Howells.
3) Memory leak in hns_nic_dev_probe(), from Dignhao Liu.
4) ravb module cannot be unloaded, fix from Yuusuke Ashizuka.
5) We disable BH for too lokng in sctp_get_port_local(), add a
cond_resched() here as well, from Xin Long.
6) Fix memory leak in st95hf_in_send_cmd, from Dinghao Liu.
7) Out of bound access in bpf_raw_tp_link_fill_link_info(), from
Yonghong Song.
8) Missing of_node_put() in mt7530 DSA driver, from Sumera
Priyadarsini.
9) Fix crash in bnxt_fw_reset_task(), from Michael Chan.
10) Fix geneve tunnel checksumming bug in hns3, from Yi Li.
11) Memory leak in rxkad_verify_response, from Dinghao Liu.
12) In tipc, don't use smp_processor_id() in preemptible context. From
Tuong Lien.
13) Fix signedness issue in mlx4 memory allocation, from Shung-Hsi Yu.
14) Missing clk_disable_prepare() in gemini driver, from Dan Carpenter.
15) Fix ABI mismatch between driver and firmware in nfp, from Louis
Peens.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (110 commits)
net/smc: fix sock refcounting in case of termination
net/smc: reset sndbuf_desc if freed
net/smc: set rx_off for SMCR explicitly
net/smc: fix toleration of fake add_link messages
tg3: Fix soft lockup when tg3_reset_task() fails.
doc: net: dsa: Fix typo in config code sample
net: dp83867: Fix WoL SecureOn password
nfp: flower: fix ABI mismatch between driver and firmware
tipc: fix shutdown() of connectionless socket
ipv6: Fix sysctl max for fib_multipath_hash_policy
drivers/net/wan/hdlc: Change the default of hard_header_len to 0
net: gemini: Fix another missing clk_disable_unprepare() in probe
net: bcmgenet: fix mask check in bcmgenet_validate_flow()
amd-xgbe: Add support for new port mode
net: usb: dm9601: Add USB ID of Keenetic Plus DSL
vhost: fix typo in error message
net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()
pktgen: fix error message with wrong function name
net: ethernet: ti: am65-cpsw: fix rmii 100Mbit link mode
cxgb4: fix thermal zone device registration
...
|
|
Merge gate page refcount fix from Dave Hansen:
"During the conversion over to pin_user_pages(), gate pages were missed.
The fix is pretty simple, and is accompanied by a new test from Andy
which probably would have caught this earlier"
* emailed patches from Dave Hansen <[email protected]>:
selftests/x86/test_vsyscall: Improve the process_vm_readv() test
mm: fix pin vs. gup mismatch with gate pages
|
|
The existing code accepted process_vm_readv() success or failure as long
as it didn't return garbage. This is too weak: if the vsyscall page is
readable, then process_vm_readv() should succeed and, if the page is not
readable, then it should fail.
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Cc: [email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Added bpf_{updata,delete}_map_elem to the very map element the
iter program is visiting. Due to rcu protection, the visited map
elements, although stale, should still contain correct values.
$ ./test_progs -n 4/18
#4/18 bpf_hash_map:OK
#4 bpf_iter:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The returned value of bpf_object__open_file() should be checked with
libbpf_get_error() rather than NULL. This fix prevents test_progs from
crash when test_global_data.o is not present.
Signed-off-by: Hao Luo <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
As one of the most complicated and close-to-real-world programs, cls_redirect
is a good candidate to exercise libbpf's logic of handling bpf2bpf calls. So
add variant with using explicit __noinline for majority of functions except
few most basic ones. If those few functions are inlined, verifier starts to
complain about program instruction limit of 1mln instructions being exceeded,
most probably due to instruction overhead of doing a sub-program call.
Convert user-space part of selftest to have to sub-tests: with and without
inlining.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Cc: Lorenz Bauer <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Update xdp_noinline to use BPF skeleton and force __noinline on helper
sub-programs. Also, split existing logic into v4- and v6-only to complicate
sub-program calling patterns (partially overlapped sets of functions for
entry-level BPF programs).
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Add use of non-inlined subprogs to few bigger selftests to excercise libbpf's
bpf2bpf handling logic. Also split l4lb_all selftest into two sub-tests.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|