path: root/tools
Age | Commit message | Author | Files | Lines
2020-11-15 | selftests: kvm: keep .gitignore up to date | Paolo Bonzini | 1 | -3/+3
Add tsc_msrs_test, remove clear_dirty_log_test and alphabetize everything. Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-15 | KVM: selftests: Add "-c" parameter to dirty log test | Peter Xu | 1 | -3/+10
The "-c" parameter only overrides the existing dirty ring size/count. With a bigger ring count we test the async path of the dirty ring; with a smaller ring count we test the ring-full code path. Async is the default. The parameter has no use for non-dirty-ring tests. Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-15 | KVM: selftests: Run dirty ring test asynchronously | Peter Xu | 1 | -4/+60
Previously the dirty ring test worked synchronously, because only a vmexit (here, the ring-full event) guarantees that the hardware dirty bits have been flushed to the dirty ring. With this patch we first introduce a vcpu kick mechanism using SIGUSR1, which guarantees a vmexit and therefore the flushing of hardware dirty bits. Once this is in place, we can keep the vcpu dirty work asynchronous to the whole collection procedure. Still, we need to be very careful that when reaching the ring buffer soft limit (KVM_EXIT_DIRTY_RING_FULL) we must collect the dirty bits before continuing the vcpu. Further increase the dirty ring size to the current maximum to make sure we torture the no-ring-full case more, which should be the major scenario when hypervisors like QEMU use this feature. Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> [Use KVM_SET_SIGNAL_MASK+sigwait instead of a signal handler. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
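Sketched below is how such a kick mechanism could look in userspace C — a minimal sketch against the KVM uAPI, not the actual selftest code; function names are illustrative:

  #include <errno.h>
  #include <pthread.h>
  #include <signal.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Block SIGUSR1 for the vcpu thread itself, but leave it unblocked
   * while KVM_RUN executes: a pending SIGUSR1 then kicks the vcpu out
   * of guest mode with -EINTR instead of running a signal handler. */
  static void vcpu_set_sigmask(int vcpu_fd)
  {
          struct kvm_signal_mask *kmask;
          sigset_t blocked, *run_set;

          sigemptyset(&blocked);
          sigaddset(&blocked, SIGUSR1);
          pthread_sigmask(SIG_BLOCK, &blocked, NULL);

          kmask = malloc(sizeof(*kmask) + sizeof(sigset_t));
          kmask->len = 8;                /* kernel sigset_t size on x86_64 */
          run_set = (sigset_t *)kmask->sigset;
          sigemptyset(run_set);          /* nothing blocked during KVM_RUN */
          ioctl(vcpu_fd, KVM_SET_SIGNAL_MASK, kmask);
          free(kmask);
  }

  /* vcpu loop body: consume the kick with sigwait(), then dirty bits can
   * be collected before re-entering the guest. The main thread kicks
   * with pthread_kill(vcpu_thread, SIGUSR1). */
  static void vcpu_run_once(int vcpu_fd)
  {
          sigset_t set;
          int sig;

          sigemptyset(&set);
          sigaddset(&set, SIGUSR1);
          if (ioctl(vcpu_fd, KVM_RUN, NULL) < 0 && errno == EINTR)
                  sigwait(&set, &sig);
  }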
2020-11-15 | KVM: selftests: Add dirty ring buffer test | Peter Xu | 4 | -13/+320
Add the initial dirty ring buffer test. The current test implements userspace dirty ring collection by only reaping the dirty ring when the ring is full, so it still runs synchronously like this:

  vcpu                                    main thread
  1. vcpu dirties pages
  2. vcpu gets dirty ring full
     (userspace exit)
                                          3. main thread waits until full
                                             (so hardware buffers flushed)
                                          4. main thread collects
                                          5. main thread continues vcpu
  6. vcpu continues, goes back to 1

We can't directly collect dirty bits during vcpu execution because otherwise we can't guarantee that the hardware dirty bits were flushed when we collect, and we are very strict about the dirty bits, so otherwise the later verify procedure can fail. A follow-up patch will make this test support async operation just like the existing dirty log test, by adding a vcpu kick mechanism. Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
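A minimal sketch of the collection step (steps 3-4 above), assuming the dirty-ring uAPI merged alongside this test; handle_dirty_page() and the cursor handling are illustrative, and the real test additionally uses READ_ONCE/WRITE_ONCE for ordering on the flags accesses:

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* 'ring' is the vcpu's mmap()ed kvm_dirty_gfn array, 'size' its entry
   * count, 'fetch' our collection cursor; handle_dirty_page() stands in
   * for the test's bookkeeping. */
  extern void handle_dirty_page(unsigned int slot, unsigned long long offset);

  static int dirty_ring_collect(int vm_fd, struct kvm_dirty_gfn *ring,
                                unsigned int size, unsigned int *fetch)
  {
          int count = 0;

          while (ring[*fetch % size].flags & KVM_DIRTY_GFN_F_DIRTY) {
                  struct kvm_dirty_gfn *e = &ring[*fetch % size];

                  handle_dirty_page(e->slot, e->offset);
                  e->flags = KVM_DIRTY_GFN_F_RESET;   /* hand entry back */
                  (*fetch)++;
                  count++;
          }
          /* Let KVM recycle the harvested entries and re-protect pages. */
          ioctl(vm_fd, KVM_RESET_DIRTY_RINGS, 0);
          return count;
  }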
2020-11-15 | KVM: selftests: Introduce after_vcpu_run hook for dirty log test | Peter Xu | 1 | -12/+24
Provide a hook for the checks after vcpu_run() completes. This is preparation for the dirty ring test, where we'll need to take care of another exit reason. Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
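The hook's shape, paraphrased (struct and function names approximate the selftest, not verbatim):

  /* Each logging mode may provide a callback run after vcpu_run(), so
   * the upcoming dirty-ring mode can handle KVM_EXIT_DIRTY_RING_FULL
   * there while the existing modes keep their current checks. */
  struct kvm_vm;          /* opaque, from the selftest framework */

  struct log_mode {
          const char *name;
          void (*after_vcpu_run)(struct kvm_vm *vm);
  };

  static void log_mode_after_vcpu_run(const struct log_mode *mode,
                                      struct kvm_vm *vm)
  {
          if (mode->after_vcpu_run)
                  mode->after_vcpu_run(vm);
  }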
2020-11-15 | KVM: selftests: test KVM_GET_SUPPORTED_HV_CPUID as a system ioctl | Vitaly Kuznetsov | 3 | -38/+77
KVM_GET_SUPPORTED_HV_CPUID is now supported as both a vCPU and a VM ioctl; test both paths. Signed-off-by: Vitaly Kuznetsov <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
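For illustration, a hedged sketch of exercising the ioctl on either fd (the system /dev/kvm fd or a vcpu fd); the entry count is an arbitrary assumption and error handling is trimmed:

  #include <fcntl.h>
  #include <stdlib.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static struct kvm_cpuid2 *get_supported_hv_cpuid(int fd)
  {
          const int nent = 64;           /* illustrative upper bound */
          struct kvm_cpuid2 *cpuid;

          cpuid = calloc(1, sizeof(*cpuid) +
                            nent * sizeof(struct kvm_cpuid_entry2));
          cpuid->nent = nent;
          if (ioctl(fd, KVM_GET_SUPPORTED_HV_CPUID, cpuid) < 0) {
                  free(cpuid);
                  return NULL;
          }
          return cpuid;
  }

  /* Usage: pass open("/dev/kvm", O_RDWR) for the system ioctl, or a
   * vcpu fd for the per-vCPU variant. */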
2020-11-15 | KVM: selftests: Verify supported CR4 bits can be set before KVM_SET_CPUID2 | Sean Christopherson | 3 | -5/+108
Extend the KVM_SET_SREGS test to verify that all supported CR4 bits, as enumerated by KVM, can be set before KVM_SET_CPUID2, i.e. without first defining the vCPU model. KVM is supposed to skip guest CPUID checks when host userspace is stuffing guest state. Check the inverse as well, i.e. that KVM rejects KVM_SET_SREGS if CR4 has one or more unsupported bits set. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
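A rough sketch of the check described above (error handling via assert; deriving 'supported_cr4' from KVM_GET_SUPPORTED_CPUID is left to the caller, and the "bad bit" chosen for the inverse test is illustrative):

  #include <assert.h>
  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static void cr4_before_cpuid_check(int vcpu_fd, uint64_t supported_cr4)
  {
          struct kvm_sregs sregs;

          assert(!ioctl(vcpu_fd, KVM_GET_SREGS, &sregs));
          sregs.cr4 |= supported_cr4;
          /* Must succeed: guest CPUID checks are skipped at this point. */
          assert(!ioctl(vcpu_fd, KVM_SET_SREGS, &sregs));

          /* Inverse check: an unsupported bit must be rejected.  Bit 63
           * is just an illustrative always-unsupported choice. */
          sregs.cr4 |= ~supported_cr4 & (1ULL << 63);
          assert(ioctl(vcpu_fd, KVM_SET_SREGS, &sregs) == -1);
  }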
2020-11-14 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | Jakub Kicinski | 50 | -892/+2402
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-11-14

1) Add BTF generation for kernel modules and extend the BTF infra in the kernel, e.g. support for split BTF loading and validation, from Andrii Nakryiko.

2) Support for pointers beyond pkt_end to recognize LLVM-generated patterns on inlined branch conditions, from Alexei Starovoitov.

3) Implement bpf_local_storage for task_struct for BPF LSM, from KP Singh.

4) Enable FENTRY/FEXIT/RAW_TP tracing programs to use the bpf_sk_storage infra, from Martin KaFai Lau.

5) Add XDP bulk APIs that introduce a defer/flush mechanism to optimize the XDP_REDIRECT path, from Lorenzo Bianconi.

6) Fix a potential (although rather theoretical) deadlock of hashtab in NMI context, from Song Liu.

7) Fixes for cross and out-of-tree builds of bpftool and runqslower, allowing builds for different target archs on the same source tree, from Jean-Philippe Brucker.

8) Fix error path in htab_map_alloc() triggered by syzbot, from Eric Dumazet.

9) Move functionality from test_tcpbpf_user into the test_progs framework so it can run in BPF CI, from Alexander Duyck.

10) Lift the hashtab key_size limit to be larger than MAX_BPF_STACK, from Florian Lehner.

Note that for the fix from Song we have seen a sparse report on context imbalance which requires changes in sparse itself for proper annotation detection; this is currently being discussed on linux-sparse among developers [0]. Once we have more clarification/guidance after their fix, Song will follow up.

  [0] https://lore.kernel.org/linux-sparse/CAHk-=wh4bx8A8dHnX612MsDO13st6uzAz1mJ1PaHHVevJx_ZCw@mail.gmail.com/T/
      https://lore.kernel.org/linux-sparse/[email protected]/T/

* git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (66 commits)
  net: mlx5: Add xdp tx return bulking support
  net: mvpp2: Add xdp tx return bulking support
  net: mvneta: Add xdp tx return bulking support
  net: page_pool: Add bulk support for ptr_ring
  net: xdp: Introduce bulking for xdp tx return path
  bpf: Expose bpf_d_path helper to sleepable LSM hooks
  bpf: Augment the set of sleepable LSM hooks
  bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Allow using bpf_sk_storage in FENTRY/FEXIT/RAW_TP
  bpf: Rename some functions in bpf_sk_storage
  bpf: Folding omem_charge() into sk_storage_charge()
  selftests/bpf: Add asm tests for pkt vs pkt_end comparison.
  selftests/bpf: Add skb_pkt_end test
  bpf: Support for pointers beyond pkt_end.
  tools/bpf: Always run the *-clean recipes
  tools/bpf: Add bootstrap/ to .gitignore
  bpf: Fix NULL dereference in bpf_task_storage
  tools/bpftool: Fix build slowdown
  tools/runqslower: Build bpftool using HOSTCC
  tools/runqslower: Enable out-of-tree build
  ...
====================

Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-14 | bpf: Relax return code check for subprograms | Dmitrii Banshchikov | 2 | -0/+20
Currently the verifier enforces return code checks for subprograms in the same manner as it does for program entry points. This prevents returning arbitrary scalar values from subprograms. The scalar type of returned values is checked by btf_prepare_func_args(), and hence it should be safe to allow only scalars for now. Relax the return code checks for subprograms and allow any correct scalar values. Fixes: 51c39bb1d5d10 (bpf: Introduce function-by-function verification) Signed-off-by: Dmitrii Banshchikov <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
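For illustration (a hypothetical program, not from the patch): a global noinline subprogram may now return an arbitrary scalar, while the entry point still must return a valid verdict:

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  /* Global (non-static) subprogram, verified separately via BTF func
   * info; returning a negative scalar is now accepted. */
  __noinline int lookup_cost(int key)
  {
          return key ? -42 : 7;
  }

  SEC("classifier")
  int entry_prog(struct __sk_buff *skb)
  {
          /* Only the entry point's return value must be a valid verdict. */
          return lookup_cost(skb->len > 100) < 0 ? 0 : 1;
  }

  char _license[] SEC("license") = "GPL";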
2020-11-13 | tools, bpftool: Add missing close before bpftool net attach exit | Wang Hai | 1 | -9/+9
progfd is created by prog_parse_fd() in do_attach(), and the file descriptor should be closed before the latter returns, including in the success case. Fixes: 04949ccc273e ("tools: bpftool: add net attach command to attach XDP on interface") Signed-off-by: Wang Hai <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
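The shape of the fix, paraphrased (prog_parse_fd() is the bpftool helper named in the message; attach_to_dev() is a hypothetical stand-in for the attach logic):

  #include <unistd.h>

  extern int prog_parse_fd(int *argc, char ***argv);  /* bpftool helper */
  extern int attach_to_dev(int progfd);               /* hypothetical */

  static int do_attach_sketch(int argc, char **argv)
  {
          int err, progfd;

          progfd = prog_parse_fd(&argc, &argv);
          if (progfd < 0)
                  return -1;
          err = attach_to_dev(progfd);
          close(progfd);   /* previously missing before returning */
          return err;
  }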
2020-11-12 | bpf: selftest: Use bpf_sk_storage in FENTRY/FEXIT/RAW_TP | Martin KaFai Lau | 3 | -0/+259
This patch tests storing task-related info into the bpf_sk_storage by fentry/fexit tracing at listen, accept, and connect. It also tests the raw_tp at inet_sock_set_state. A negative test is done by tracing bpf_sk_storage_free() and using bpf_sk_storage_get() at the same time, ensuring this bpf program cannot load. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
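A hedged sketch of the positive case (map and hook shapes paraphrased from the test, not verbatim):

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  struct {
          __uint(type, BPF_MAP_TYPE_SK_STORAGE);
          __uint(map_flags, BPF_F_NO_PREALLOC);
          __type(key, int);
          __type(value, __u64);
  } sk_stg SEC(".maps");

  SEC("fentry/inet_listen")
  int BPF_PROG(trace_inet_listen, struct socket *sock, int backlog)
  {
          /* Store the listener's tgid in per-socket storage from a
           * tracing program — the ability this selftest exercises. */
          __u64 *val = bpf_sk_storage_get(&sk_stg, sock->sk, 0,
                                          BPF_SK_STORAGE_GET_F_CREATE);
          if (val)
                  *val = bpf_get_current_pid_tgid() >> 32;
          return 0;
  }

  char LICENSE[] SEC("license") = "GPL";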
2020-11-12 | Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 70 | -489/+3105
Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-13 | selftests/bpf: Add asm tests for pkt vs pkt_end comparison. | Alexei Starovoitov | 1 | -0/+42
Add a few assembly tests for packet comparison. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Jiri Olsa <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-13 | selftests/bpf: Add skb_pkt_end test | Alexei Starovoitov | 2 | -0/+95
Add a test that currently makes LLVM generate assembly code:

  $ llvm-objdump -S skb_pkt_end.o
  0000000000000000 <main_prog>:
  ; if (skb_shorter(skb, ETH_IPV4_TCP_SIZE))
       0: 61 12 50 00 00 00 00 00  r2 = *(u32 *)(r1 + 80)
       1: 61 14 4c 00 00 00 00 00  r4 = *(u32 *)(r1 + 76)
       2: bf 43 00 00 00 00 00 00  r3 = r4
       3: 07 03 00 00 36 00 00 00  r3 += 54
       4: b7 01 00 00 00 00 00 00  r1 = 0
       5: 2d 23 02 00 00 00 00 00  if r3 > r2 goto +2 <LBB0_2>
       6: 07 04 00 00 0e 00 00 00  r4 += 14
  ; if (skb_shorter(skb, ETH_IPV4_TCP_SIZE))
       7: bf 41 00 00 00 00 00 00  r1 = r4
  0000000000000040 <LBB0_2>:
       8: b4 00 00 00 ff ff ff ff  w0 = -1
  ; if (!(ip = get_iphdr(skb)))
       9: 2d 23 05 00 00 00 00 00  if r3 > r2 goto +5 <LBB0_6>
  ; proto = ip->protocol;
      10: 71 12 09 00 00 00 00 00  r2 = *(u8 *)(r1 + 9)
  ; if (proto != IPPROTO_TCP)
      11: 56 02 03 00 06 00 00 00  if w2 != 6 goto +3 <LBB0_6>
  ; if (tcp->dest != 0)
      12: 69 12 16 00 00 00 00 00  r2 = *(u16 *)(r1 + 22)
      13: 56 02 01 00 00 00 00 00  if w2 != 0 goto +1 <LBB0_6>
  ; return tcp->urg_ptr;
      14: 69 10 26 00 00 00 00 00  r0 = *(u16 *)(r1 + 38)
  0000000000000078 <LBB0_6>:
  ; }
      15: 95 00 00 00 00 00 00 00  exit

Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
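The C pattern behind the listing is the classic direct-packet-access bounds check; a minimal sketch (the skb_shorter()/get_iphdr() helpers from the listing are not reproduced here, and the constant matches the "r3 += 54" above):

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  #define ETH_IPV4_TCP_SIZE 54   /* eth(14) + ip(20) + tcp(20) */

  SEC("classifier")
  int main_prog(struct __sk_buff *skb)
  {
          void *data     = (void *)(long)skb->data;
          void *data_end = (void *)(long)skb->data_end;

          /* LLVM may duplicate or hoist this pkt vs pkt_end comparison
           * on inlined branches; the verifier change lets it track a
           * pointer beyond pkt_end and still accept the program. */
          if (data + ETH_IPV4_TCP_SIZE > data_end)
                  return -1;
          return 0;
  }

  char _license[] SEC("license") = "GPL";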
2020-11-12 | selftests: set conf.all.rp_filter=0 in bareudp.sh | Guillaume Nault | 1 | -0/+2
When working on the rp_filter problem, I didn't realise that disabling it on the network devices didn't cover all cases: rp_filter could also be enabled globally in the namespace, in which case it would drop packets, even if the net device has rp_filter=0. Fixes: 1ccd58331f6f ("selftests: disable rp_filter when testing bareudp") Fixes: bbbc7aa45eef ("selftests: add test script for bareudp tunnels") Signed-off-by: Guillaume Nault <[email protected]> Link: https://lore.kernel.org/r/f2d459346471f163b239aa9d63ce3e2ba9c62895.1605107012.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-12 | Merge tag 'net-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Linus Torvalds | 9 | -18/+281
Pull networking fixes from Jakub Kicinski:

 "Current release - regressions:
   - arm64: dts: fsl-ls1028a-kontron-sl28: specify in-band mode for ENETC

  Current release - bugs in new features:
   - mptcp: provide rmem[0] limit offset to fix oops

  Previous release - regressions:
   - IPv6: Set SIT tunnel hard_header_len to zero to fix path MTU calculations
   - lan743x: correctly handle chips with internal PHY
   - bpf: Don't rely on GCC __attribute__((optimize)) to disable GCSE
   - mlx5e: Fix VXLAN port table synchronization after function reload

  Previous release - always broken:
   - bpf: Zero-fill re-used per-cpu map element
   - fix out-of-order UDP packets when forwarding with UDP GSO fraglists turned on:
       - fix UDP header access on Fast/frag0 UDP GRO
       - fix IP header access and skb lookup on Fast/frag0 UDP GRO
   - ethtool: netlink: add missing netdev_features_change() call
   - net: Update window_clamp if SOCK_RCVBUF is set
   - igc: Fix returning wrong statistics
   - ch_ktls: fix multiple leaks and corner cases in Chelsio TLS offload
   - tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies
   - r8169: disable hw csum for short packets on all chip versions
   - vrf: Fix fast path output packet handling with async Netfilter rules"

* tag 'net-5.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
  lan743x: fix use of uninitialized variable
  net: udp: fix IP header access and skb lookup on Fast/frag0 UDP GRO
  net: udp: fix UDP header access on Fast/frag0 UDP GRO
  devlink: Avoid overwriting port attributes of registered port
  vrf: Fix fast path output packet handling with async Netfilter rules
  cosa: Add missing kfree in error path of cosa_write
  net: switch to the kernel.org patchwork instance
  ch_ktls: stop the txq if reaches threshold
  ch_ktls: tcb update fails sometimes
  ch_ktls/cxgb4: handle partial tag alone SKBs
  ch_ktls: don't free skb before sending FIN
  ch_ktls: packet handling prior to start marker
  ch_ktls: Correction in middle record handling
  ch_ktls: missing handling of header alone
  ch_ktls: Correction in trimmed_len calculation
  cxgb4/ch_ktls: creating skbs causes panic
  ch_ktls: Update cheksum information
  ch_ktls: Correction in finding correct length
  cxgb4/ch_ktls: decrypted bit is not enough
  net/x25: Fix null-ptr-deref in x25_connect
  ...
2020-11-12 | perf test: Update branch sample pattern for cs-etm | Leo Yan | 1 | -1/+1
Since commit 943b69ac1884 ("perf parse-events: Set exclude_guest=1 for user-space counting"), 'exclude_guest=1' is set for user-space counting and the branch sample's modifier has been altered: the sample event name has changed from "branches:u:" to "branches:uH:", which indicates "user-space and host counting". But the cs-etm test's regular expression cannot match the updated branch sample event, so the test fails. This patch updates the branch sample pattern, using the more flexible expression '.*' to match the branch sample's modifiers, which allows the test to work as expected. Fixes: 943b69ac1884 ("perf parse-events: Set exclude_guest=1 for user-space counting") Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Mathieu Poirier <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: coresight ml <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-12 | perf test: Fix a typo in cs-etm testing | Leo Yan | 1 | -1/+1
Fix a typo: s/devce_name/device_name. Fixes: fe0aed19b266 ("perf test: Introduce script for Arm CoreSight testing") Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Mathieu Poirier <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: coresight ml <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-12 | tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy' | Arnaldo Carvalho de Melo | 5 | -10/+22
To bring in the changes made in these csets:

  4d6ffa27b8e5116c ("x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S")
  6dcc5627f6aec4cb ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*")

I needed to define SYM_FUNC_START_LOCAL() as SYM_L_GLOBAL, as mem{cpy,set}_{orig,erms} are used by 'perf bench'. This silences these perf tools build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
  diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
  Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
  diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S

Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-12 | perf lock: Don't free "lock_seq_stat" if read_count isn't zero | Leo Yan | 1 | -1/+1
When executing the command "perf lock report", it hits a failure and outputs the following log:

  perf: builtin-lock.c:623: report_lock_release_event: Assertion `!(seq->read_count < 0)' failed.
  Aborted

This is an imbalance issue. The locking sequence structure "lock_seq_stat" contains the reader counter, which is used to check whether the locking sequence is balanced between acquiring and releasing. If the tool wrongly frees "lock_seq_stat" while "read_count" isn't zero, the "read_count" will be reset to zero when a new structure is allocated the next time; this causes wrong counting for readers and finally results in the imbalance issue. To fix this issue, if "read_count" is not zero (meaning there are still read users in the locking sequence), go to the "end" label to skip freeing the structure "lock_seq_stat". Fixes: e4cef1f65061 ("perf lock: Fix state machine to recognize lock sequence") Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
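Paraphrased shape of the fix (simplified, not verbatim perf code):

  #include <stdlib.h>

  struct lock_seq_stat {
          int read_count;
          /* ... state machine fields ... */
  };

  static void release_event(struct lock_seq_stat *seq)
  {
          if (seq->read_count)  /* still has readers: keep the sequence */
                  goto end;
          free(seq);            /* only safe once all readers released */
  end:
          return;
  }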
2020-11-12 | perf lock: Correct field name "flags" | Leo Yan | 1 | -1/+1
The tracepoint "lock:lock_acquire" contains field "flags" but not "flag". Current code wrongly retrieves value from field "flag" and it always gets zero for the value, thus "perf lock" doesn't report the correct result. This patch replaces the field name "flag" with "flags", so can read out the correct flags for locking. Fixes: e4cef1f65061 ("perf lock: Fix state machine to recognize lock sequence") Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-12 | tools/bpf: Always run the *-clean recipes | Jean-Philippe Brucker | 1 | -2/+2
Make $(LIBBPF)-clean and $(LIBBPF_BOOTSTRAP)-clean .PHONY targets, in case those files exist. And keep consistency within the Makefile by making the directory dependencies order-only. Suggested-by: Andrii Nakryiko <[email protected]> Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-12 | tools/bpf: Add bootstrap/ to .gitignore | Jean-Philippe Brucker | 1 | -1/+1
Commit 8859b0da5aac ("tools/bpftool: Fix cross-build") added a build-time bootstrap/ directory for bpftool, and removed bpftool-bootstrap. Update .gitignore accordingly. Reported-by: Andrii Nakryiko <[email protected]> Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-12 | selftests/bpf: Fix unused attribute usage in subprogs_unused test | Andrii Nakryiko | 1 | -2/+2
The correct attribute name is "unused"; maybe_unused is a C++17 addition. This patch fixes a compilation warning during selftests compilation. Fixes: 197afc631413 ("libbpf: Don't attempt to load unused subprog as an entry-point BPF program") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
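A minimal illustration of the attribute difference (self-contained; compile with -Wall):

  #include <stdio.h>

  /* "unused" is the GNU C attribute; "maybe_unused" only exists as the
   * C++17 (and C23) [[maybe_unused]] standard attribute, hence the
   * warning this patch fixes. */
  static __attribute__((unused)) int helper_never_called(void)
  {
          return 42;
  }

  int main(void)
  {
          puts("no -Wunused-function warning for helper_never_called()");
          return 0;
  }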
2020-11-12 | selftests: pmtu.sh: improve the test result processing | Po-Hsu Lin | 1 | -1/+14
This test treats all non-zero return codes as failures, so the pmtu.sh test script is marked FAILED when a sub-test is skipped. Improve the result processing:
  * Only mark the whole test script as SKIP when all of the sub-tests were skipped
  * If the sub-tests either passed or were skipped, the overall result is PASS
  * If any of them failed with return code 1, or anything bad happened (e.g. return code 127 for command not found), the overall result is FAIL
Signed-off-by: Po-Hsu Lin <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-12 | selftests: pmtu.sh: use $ksft_skip for skipped return code | Po-Hsu Lin | 1 | -32/+32
This test uses return code 2 as a hard-coded skipped state, let's use the kselftest framework skip code variable $ksft_skip instead to make it more readable and easier to maintain. Signed-off-by: Po-Hsu Lin <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-11 | tools/bpftool: Fix build slowdown | Jean-Philippe Brucker | 1 | -1/+3
Commit ba2fd563b740 ("tools/bpftool: Support passing BPFTOOL_VERSION to make") changed BPFTOOL_VERSION to a recursively expanded variable, forcing it to be recomputed on every expansion of CFLAGS and dramatically slowing down the bpftool build. Restore BPFTOOL_VERSION as a simply expanded variable, guarded by an ifeq(). Fixes: ba2fd563b740 ("tools/bpftool: Support passing BPFTOOL_VERSION to make") Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | tools/runqslower: Build bpftool using HOSTCC | Jean-Philippe Brucker | 1 | -1/+2
When cross-building runqslower for another architecture, the intermediate bpftool used to generate a skeleton must be built using the host toolchain. Pass HOSTCC and HOSTLD, defined in Makefile.include, to the bpftool Makefile. Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | tools/runqslower: Enable out-of-tree build | Jean-Philippe Brucker | 1 | -14/+18
Enable out-of-tree build for runqslower. Only set OUTPUT=.output if it wasn't already set by the user. Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | tools/runqslower: Use Makefile.include | Jean-Philippe Brucker | 1 | -15/+9
Makefile.include defines variables such as OUTPUT and CC for out-of-tree build and cross-build. Include it into the runqslower Makefile and use its $(QUIET*) helpers. Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | tools/bpftool: Fix cross-build | Jean-Philippe Brucker | 1 | -8/+26
The bpftool build first creates an intermediate binary, executed on the host, to generate skeletons required by the final build. When cross-building bpftool for an architecture different from the host, the intermediate binary should be built using the host compiler (gcc) and the final bpftool using the cross compiler (e.g. aarch64-linux-gnu-gcc). Generate the intermediate objects into the bootstrap/ directory using the host toolchain. Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | tools/bpftool: Force clean of out-of-tree build | Jean-Philippe Brucker | 1 | -3/+5
Cleaning a partial build can fail if the output directory for libbpf wasn't created:

  $ make -C tools/bpf/bpftool O=/tmp/bpf clean
  /bin/sh: line 0: cd: /tmp/bpf/libbpf/: No such file or directory
  tools/scripts/Makefile.include:17: *** output directory "/tmp/bpf/libbpf/" does not exist. Stop.
  make: *** [Makefile:36: /tmp/bpf/libbpf/libbpf.a-clean] Error 2

As a result make never gets around to clearing the leftover objects. Add the libbpf output directory as a clean dependency to ensure clean always succeeds (similarly to the "descend" macro). The directory is later removed by the clean recipe. Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | tools: Factor HOSTCC, HOSTLD, HOSTAR definitions | Jean-Philippe Brucker | 6 | -27/+10
Several Makefiles in tools/ need to define the host toolchain variables. Move their definition to tools/scripts/Makefile.include Signed-off-by: Jean-Philippe Brucker <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Jiri Olsa <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-11 | perf arm-spe: Fix packet length handling | Leo Yan | 1 | -22/+12
When processing address packets and counter packets, if the packet contains an extended header, the code fails to account for the extra header byte and thus returns the wrong packet length. To correct the packet length calculation, one possible fix is simply to add 1 for the extended header, but that would spread duplicate code across the flows for processing address packets and counter packets. Alternatively, we can refine the function arm_spe_get_payload() to support not only the short header but also the extended header, and rely on it for the packet length calculation. So this patch refactors arm_spe_get_payload() with a new argument 'ext_hdr' to support the extended header; the packet processing flows can invoke this function to unify the packet length calculation. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Andre Przywara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf arm-spe: Refactor arm_spe_get_events() | Leo Yan | 1 | -4/+2
In the function arm_spe_get_events(), the event packet's 'index' is assigned the payload length, but the flow is not direct: it first gets the packet length from the return value of arm_spe_get_payload(), a value that includes the header length (1) plus the payload length:

  int ret = arm_spe_get_payload(buf, len, packet);

and then subtracts the header length from the packet length to finally get the payload length:

  packet->index = ret - 1;

To simplify the code, this patch directly assigns the payload length to the event packet's index, and at the end calls arm_spe_get_payload() to return the payload value. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Andre Przywara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf arm-spe: Refactor payload size calculation | Leo Yan | 1 | -9/+9
This patch defines a macro to extract the "sz" field from the header, and renames the function payloadlen() to arm_spe_payload_len(). Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Andre Przywara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
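An illustrative reconstruction (not verbatim perf code; it assumes the short-header layout where "sz" occupies bits [5:4] and encodes the payload size as log2 of the byte count):

  #include <stdio.h>

  #define SPE_HEADER_SZ(h)  (((h) & 0x30U) >> 4)   /* assumed layout */

  static unsigned int arm_spe_payload_len(unsigned char hdr)
  {
          return 1U << SPE_HEADER_SZ(hdr);          /* 1, 2, 4 or 8 bytes */
  }

  int main(void)
  {
          printf("payload bytes for sz=2: %u\n", arm_spe_payload_len(0x20));
          return 0;
  }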
2020-11-11 | perf arm-spe: Fix a typo in comment | Leo Yan | 1 | -1/+1
Fix a typo: s/iff/if. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Andre Przywara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf arm-spe: Include bitops.h for BIT() macro | Leo Yan | 2 | -8/+4
Include the header linux/bitops.h, directly use its BIT() macro and remove the self-defined macros. Committer notes: Use BIT_ULL() instead of BIT() to build on 32-bit arches, as mentioned in review by Andre Przywara <[email protected]>. I noticed the build failure when cross-building to arm32 from x86_64. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Andre Przywara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
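Why BIT_ULL() matters on 32-bit arches, as a self-contained illustration (the macro bodies below mirror the standard kernel definitions):

  #include <stdio.h>

  /* On an arch where 'long' is 32-bit, BIT(63) would shift past the
   * width of unsigned long (undefined behaviour); BIT_ULL() always
   * uses unsigned long long. */
  #define BIT(nr)      (1UL << (nr))
  #define BIT_ULL(nr)  (1ULL << (nr))

  int main(void)
  {
          printf("BIT_ULL(63) = 0x%llx\n", BIT_ULL(63));
          /* BIT(63) only works when unsigned long is 64-bit. */
          return 0;
  }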
2020-11-11 | perf mem: Support ARM SPE events | Leo Yan | 2 | -1/+38
This patch adds ARM SPE events for perf memory profiling: 'spe-load': event for only recording memory load ops; 'spe-store': event for only recording memory store ops; 'spe-ldst': event for recording memory load and store ops. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf c2c: Support AUX trace | Leo Yan | 1 | -0/+12
This patch adds the AUX callbacks to the session structure, so the "perf c2c" tool supports AUX traces; it makes the itrace memory event the default for "perf c2c", which tells the AUX trace decoder to synthesize samples that can be used for statistics. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf mem: Support AUX trace | Leo Yan | 1 | -0/+13
The 'perf mem' tool doesn't support AUX trace data, so it cannot receive hardware tracing data. On arm64, although PMU events for memory load and store are not supported, ARM SPE is a good candidate for memory profiling: the hardware tracer can record memory access operations with associated information (e.g. physical and virtual address of the access, cache level, TLB walking, latency, etc). To allow the "perf mem" tool to support AUX traces, this patch adds the AUX callbacks to the session structure and makes the itrace memory event the default for "perf mem", which tells the AUX trace decoder to synthesize memory samples. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf auxtrace: Add itrace option '-M' for memory events | Leo Yan | 3 | -0/+7
Add the itrace option '-M' to synthesize memory events. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf mem: Only initialize memory event for recording | Leo Yan | 1 | -5/+5
Initializing memory events is needless for reporting, so this patch moves memory event initialization to recording only. Furthermore, the change allows perf data to be parsed across platforms, e.g. the perf tool can report results properly even if the machine doesn't support the memory events. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf c2c: Support memory event PERF_MEM_EVENTS__LOAD_STORE | Leo Yan | 1 | -4/+13
When the user doesn't specify an event name, the perf c2c tool enables both the load and store events, and this leads to a failure opening a duplicate PMU device for AUX tracing. After the memory event PERF_MEM_EVENTS__LOAD_STORE is introduced, when the user doesn't specify an event name, this patch converts the required operation to PERF_MEM_EVENTS__LOAD_STORE if the arch supports it. Otherwise, the tool still falls back to enabling the events PERF_MEM_EVENTS__LOAD and PERF_MEM_EVENTS__STORE. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE | Leo Yan | 3 | -7/+31
On architectures with perf memory profiling, two types of hardware events have been supported: load and store. To profile memory for both load and store operations, the tool uses these two events at the same time:

  # perf mem record -t load,store -- uname

But this cannot be applied to an AUX tracing event: the same PMU event can be used to trace only memory loads, only memory stores, or both memory loads and stores. This patch introduces a new event, PERF_MEM_EVENTS__LOAD_STORE, which supports events that record both memory load and store operations. When the user specifies the memory operation type as 'load,store', or doesn't set a type so 'load,store' is used as the default, the tool converts the required operations to this single event if the arch supports PERF_MEM_EVENTS__LOAD_STORE; otherwise, the tool falls back to enabling both PERF_MEM_EVENTS__LOAD and PERF_MEM_EVENTS__STORE, which keeps the same behaviour as before. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11 | perf mem: Introduce weak function perf_mem_events__ptr() | Leo Yan | 4 | -21/+46
Different architectures might use different events or different event parameters for memory profiling; this patch introduces a weak perf_mem_events__ptr() function which returns an architecture-specific memory event. Since the variable 'perf_mem_events' can then only be accessed via the perf_mem_events__ptr() function, mark the variable as 'static'; this allows architectures to define their own memory event arrays. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
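Illustrative shape of the weak-accessor pattern (names paraphrased from perf's util/mem-events.c; the event strings and struct fields are placeholders):

  #include <stdbool.h>
  #include <stddef.h>

  enum {
          PERF_MEM_EVENTS__LOAD,
          PERF_MEM_EVENTS__STORE,
          PERF_MEM_EVENTS__MAX,
  };

  struct perf_mem_event {
          bool        supported;
          const char  *name;
  };

  /* 'static': only reachable through the accessor or an arch override. */
  static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
          [PERF_MEM_EVENTS__LOAD]  = { .name = "mem-loads"  },
          [PERF_MEM_EVENTS__STORE] = { .name = "mem-stores" },
  };

  /* Weak default; an arch (e.g. arm64 with ARM SPE) links in its own
   * non-weak definition returning an arch-specific array. */
  struct perf_mem_event * __attribute__((weak)) perf_mem_events__ptr(int i)
  {
          if (i >= PERF_MEM_EVENTS__MAX)
                  return NULL;
          return &perf_mem_events[i];
  }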
2020-11-11 | perf mem: Search event name with more flexible path | Leo Yan | 1 | -3/+3
The perf tool searches a memory event name under the folder '/sys/devices/cpu/events/', which limits memory profiling to events under this folder. Thus it's impossible to use any other event for memory profiling that is not under this specific folder; e.g. the Arm SPE hardware event is not located in '/sys/devices/cpu/events/', so it cannot be enabled for memory profiling. This patch changes the search folder from '/sys/devices/cpu/events/' to '/sys/devices', giving the flexibility to find events which can be used for memory profiling. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-10 | selftest/bpf: Add missed ip6ip6 test back | Hangbin Liu | 2 | -39/+46
In commit 173ca26e9b51 ("samples/bpf: add comprehensive ipip, ipip6, ip6ip6 test") we added an ip6ip6 test for bpf tunnel testing. But in commit 933a741e3b82 ("selftests/bpf: bpf tunnel test."), when we moved it to the current folder, we didn't add it. This patch adds the ip6ip6 test back to the bpf tunnel test and updates ipip6's topology for both IPv4 and IPv6 testing. Since the iperf test was removed because the current framework simplified it on purpose, I also removed the unused tcp checks in test_tunnel_kern.c. Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-10 | tools/bpftool: Add support for in-kernel and named BTF in `btf show` | Andrii Nakryiko | 1 | -1/+27
Display the vmlinux BTF name and kernel module names when listing available BTFs on the system. In human-readable output mode, module BTFs are reported as "name [module-name]", while vmlinux BTF is reported as "name [vmlinux]". Square brackets are added by bpftool and follow the kernel convention for displaying modules in human-readable text output.

  [vmuser@archvm bpf]$ sudo ../../../bpf/bpftool/bpftool btf s
  1: name [vmlinux]  size 4082281B
  6: size 2365B  prog_ids 8,6  map_ids 3
  7: name [button]  size 46895B
  8: name [pcspkr]  size 42328B
  9: name [serio_raw]  size 39375B
  10: name [floppy]  size 57185B
  11: name [i2c_core]  size 76186B
  12: name [crc32c_intel]  size 16036B
  13: name [i2c_piix4]  size 50497B
  14: name [irqbypass]  size 14124B
  15: name [kvm]  size 197985B
  16: name [kvm_intel]  size 123564B
  17: name [cryptd]  size 42466B
  18: name [crypto_simd]  size 17187B
  19: name [glue_helper]  size 39205B
  20: name [aesni_intel]  size 41034B
  25: size 36150B  pids bpftool(2519)

In JSON mode, two fields (boolean "kernel" and string "name") are reported for each BTF object. vmlinux BTF is reported with name "vmlinux" (the kernel itself returns an empty name for vmlinux BTF).

  [vmuser@archvm bpf]$ sudo ../../../bpf/bpftool/bpftool btf s -jp
  [{
          "id": 1,
          "size": 4082281,
          "prog_ids": [],
          "map_ids": [],
          "kernel": true,
          "name": "vmlinux"
      },{
          "id": 6,
          "size": 2365,
          "prog_ids": [8,6],
          "map_ids": [3],
          "kernel": false
      },{
          "id": 7,
          "size": 46895,
          "prog_ids": [],
          "map_ids": [],
          "kernel": true,
          "name": "button"
      },{
  ...

Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Tested-by: Alan Maguire <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-10 | bpf: Assign ID to vmlinux BTF and return extra info for BTF in GET_OBJ_INFO | Andrii Nakryiko | 1 | -0/+3
Allocate an ID for vmlinux BTF. This makes it visible when iterating over all BTF objects in the system. To allow distinguishing vmlinux BTF (and later kernel module BTF) from user-provided BTFs, expose an extra kernel_btf flag, as well as a BTF name ("vmlinux" for vmlinux BTF; it will equal the module's name for module BTF). We might later want to allow specifying a BTF name for user-provided BTFs as well, if that makes sense, but currently this is reserved only for in-kernel BTFs. Exposing IDs for in-kernel BTFs will allow extending BPF APIs that require an in-kernel BTF type with the ability to specify BTF types from kernel modules, not just vmlinux BTF. This will be implemented in a follow-up patch set for fentry/fexit/fmod_ret/lsm/etc. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
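A hedged sketch of consuming the new fields from userspace via libbpf (fields per the uAPI described here; error handling trimmed):

  #include <stdio.h>
  #include <bpf/bpf.h>
  #include <linux/bpf.h>

  static void show_btf(__u32 id)
  {
          char name[64] = {};
          struct bpf_btf_info info = {};
          __u32 len = sizeof(info);
          int fd = bpf_btf_get_fd_by_id(id);

          if (fd < 0)
                  return;
          info.name = (__u64)(unsigned long)name;  /* buffer for the name */
          info.name_len = sizeof(name);
          if (!bpf_obj_get_info_by_fd(fd, &info, &len))
                  printf("%u: %s%s\n", id,
                         info.kernel_btf ? "[kernel] " : "", name);
  }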