aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-05-27KVM: selftests: compute correct demand paging sizeAxel Rasmussen1-4/+7
This is a preparatory commit needed before we can use different kinds of backing pages for guest memory. Previously, we used perf_test_args.host_page_size, which is the host's native page size (commonly 4K). For VM_MEM_SRC_ANONYMOUS this turns out to be okay, but in a follow-up commit we want to allow using different kinds of backing memory. Take VM_MEM_SRC_ANONYMOUS_HUGETLB for example. Without this change, if we used that backing page type, when we issued a UFFDIO_COPY ioctl we'd only do so with 4K, rather than the full 2M of a backing hugepage. In this case, UFFDIO_COPY returns -EINVAL (__mcopy_atomic_hugetlb checks the size). Signed-off-by: Axel Rasmussen <[email protected]> Message-Id: <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: simplify setup_demand_paging error handlingAxel Rasmussen1-32/+18
A small cleanup. Our caller writes: r = setup_demand_paging(...); if (r < 0) exit(-r); Since we're just going to exit anyway, instead of returning an error we can just re-use TEST_ASSERT. This makes the caller simpler, as well as the function itself - no need to write our branches, etc. Signed-off-by: Axel Rasmussen <[email protected]> Message-Id: <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: Print a message if /dev/kvm is missingDavid Matlack4-32/+39
If a KVM selftest is run on a machine without /dev/kvm, it will exit silently. Make it easy to tell what's happening by printing an error message. Opportunistically consolidate all codepaths that open /dev/kvm into a single function so they all print the same message. This slightly changes the semantics of vm_is_unrestricted_guest() by changing a TEST_ASSERT() to exit(KSFT_SKIP). However vm_is_unrestricted_guest() is only called in one place (x86_64/mmio_warning_test.c) and that is to determine if the test should be skipped or not. Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: trivial comment/logging fixesAxel Rasmussen2-3/+3
Some trivial fixes I found while touching related code in this series, factored out into a separate commit for easier reviewing: - s/gor/got/ and add a newline in demand_paging_test.c - s/backing_src/src_type/ in a comment to be consistent with the real function signature in kvm_util.c Signed-off-by: Axel Rasmussen <[email protected]> Message-Id: <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: Fix hang in hardware_disable_testDavid Matlack1-1/+31
If /dev/kvm is not available then hardware_disable_test will hang indefinitely because the child process exits before posting to the semaphore for which the parent is waiting. Fix this by making the parent periodically check if the child has exited. We have to be careful to forward the child's exit status to preserve a KSFT_SKIP status. I considered just checking for /dev/kvm before creating the child process, but there are so many other reasons why the child could exit early that it seemed better to handle that as general case. Tested: $ ./hardware_disable_test /dev/kvm not available, skipping test $ echo $? 4 $ modprobe kvm_intel $ ./hardware_disable_test $ echo $? 0 Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: Ignore CPUID.0DH.1H in get_cpuid_testDavid Matlack1-0/+5
Similar to CPUID.0DH.0H this entry depends on the vCPU's XCR0 register and IA32_XSS MSR. Since this test does not control for either before assigning the vCPU's CPUID, these entries will not necessarily match the supported CPUID exposed by KVM. This fixes get_cpuid_test on Cascade Lake CPUs. Suggested-by: Jim Mattson <[email protected]> Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()David Matlack4-10/+16
vm_get_max_gfn() casts vm->max_gfn from a uint64_t to an unsigned int, which causes the upper 32-bits of the max_gfn to get truncated. Nobody noticed until now likely because vm_get_max_gfn() is only used as a mechanism to create a memslot in an unused region of the guest physical address space (the top), and the top of the 32-bit physical address space was always good enough. This fix reveals a bug in memslot_modification_stress_test which was trying to create a dummy memslot past the end of guest physical memory. Fix that by moving the dummy memslot lower. Fixes: 52200d0d944e ("KVM: selftests: Remove duplicate guest mode handling") Reviewed-by: Venkatesh Srinivas <[email protected]> Signed-off-by: David Matlack <[email protected]> Message-Id: <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Peter Xu <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: add a memslot-related performance benchmarkMaciej S. Szmigiero3-0/+1039
This benchmark contains the following tests: * Map test, where the host unmaps guest memory while the guest writes to it (maps it). The test is designed in a way to make the unmap operation on the host take a negligible amount of time in comparison with the mapping operation in the guest. The test area is actually split in two: the first half is being mapped by the guest while the second half in being unmapped by the host. Then a guest <-> host sync happens and the areas are reversed. * Unmap test which is broadly similar to the above map test, but it is designed in an opposite way: to make the mapping operation in the guest take a negligible amount of time in comparison with the unmap operation on the host. This test is available in two variants: with per-page unmap operation or a chunked one (using 2 MiB chunk size). * Move active area test which involves moving the last (highest gfn) memslot a bit back and forth on the host while the guest is concurrently writing around the area being moved (including over the moved memslot). * Move inactive area test which is similar to the previous move active area test, but now guest writes all happen outside of the area being moved. * Read / write test in which the guest writes to the beginning of each page of the test area while the host writes to the middle of each such page. Then each side checks the values the other side has written. This particular test is not expected to give different results depending on particular memslots implementation, it is meant as a rough sanity check and to provide insight on the spread of test results expected. Each test performs its operation in a loop until a test period ends (this is 5 seconds by default, but it is configurable). Then the total count of loops done is divided by the actual elapsed time to give the test result. The tests have a configurable memslot cap with the "-s" test option, by default the system maximum is used. Each test is repeated a particular number of times (by default 20 times), the best result achieved is printed. The test memory area is divided equally between memslots, the reminder is added to the last memslot. The test area size does not depend on the number of memslots in use. The tests also measure the time that it took to add all these memslots. The best result from the tests that use the whole test area is printed after all the requested tests are done. In general, these tests are designed to use as much memory as possible (within reason) while still doing 100+ loops even on high memslot counts with the default test length. Increasing the test runtime makes it increasingly more likely that some event will happen on the system during the test run, which might lower the test result. Signed-off-by: Maciej S. Szmigiero <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Message-Id: <8d31bb3d92bc8fa33a9756fa802ee14266ab994e.1618253574.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27KVM: selftests: Keep track of memslots more efficientlyMaciej S. Szmigiero4-35/+124
The KVM selftest framework was using a simple list for keeping track of the memslots currently in use. This resulted in lookups and adding a single memslot being O(n), the later due to linear scanning of the existing memslot set to check for the presence of any conflicting entries. Before this change, benchmarking high count of memslots was more or less impossible as pretty much all the benchmark time was spent in the selftest framework code. We can simply use a rbtree for keeping track of both of gfn and hva. We don't need an interval tree for hva here as we can't have overlapping memslots because we allocate a completely new memory chunk for each new memslot. Signed-off-by: Maciej S. Szmigiero <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Message-Id: <b12749d47ee860468240cf027412c91b76dbe3db.1618253574.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27selftests: kvm: fix potential issue with ELF loadingPaolo Bonzini1-5/+4
vm_vaddr_alloc() sets up GVA to GPA mapping page by page; therefore, GPAs may not be continuous if same memslot is used for data and page table allocation. kvm_vm_elf_load() however expects a continuous range of HVAs (and thus GPAs) because it does not try to read file data page by page. Fix this mismatch by allocating memory in one step. Reported-by: Zhenzhong Duan <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-27selftests: kvm: make allocation of extra memory take effectZhenzhong Duan1-1/+1
The extra memory pages is missed to be allocated during VM creating. perf_test_util and kvm_page_table_test use it to alloc extra memory currently. Fix it by adding extra_mem_pages to the total memory calculation before allocate. Signed-off-by: Zhenzhong Duan <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-05-26Merge tag 'net-5.13-rc4' of ↵Linus Torvalds19-284/+911
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.13-rc4, including fixes from bpf, netfilter, can and wireless trees. Notably including fixes for the recently announced "FragAttacks" WiFi vulnerabilities. Rather large batch, touching some core parts of the stack, too, but nothing hair-raising. Current release - regressions: - tipc: make node link identity publish thread safe - dsa: felix: re-enable TAS guard band mode - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid() - stmmac: fix system hang if change mac address after interface ifdown Current release - new code bugs: - mptcp: avoid OOB access in setsockopt() - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers - ethtool: stats: fix a copy-paste error - init correct array size Previous releases - regressions: - sched: fix packet stuck problem for lockless qdisc - net: really orphan skbs tied to closing sk - mlx4: fix EEPROM dump support - bpf: fix alu32 const subreg bound tracking on bitwise operations - bpf: fix mask direction swap upon off reg sign change - bpf, offload: reorder offload callback 'prepare' in verifier - stmmac: Fix MAC WoL not working if PHY does not support WoL - packetmmap: fix only tx timestamp on request - tipc: skb_linearize the head skb when reassembling msgs Previous releases - always broken: - mac80211: address recent "FragAttacks" vulnerabilities - mac80211: do not accept/forward invalid EAPOL frames - mptcp: avoid potential error message floods - bpf, ringbuf: deny reserve of buffers larger than ringbuf to prevent out of buffer writes - bpf: forbid trampoline attach for functions with variable arguments - bpf: add deny list of functions to prevent inf recursion of tracing programs - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT - can: isotp: prevent race between isotp_bind() and isotp_setsockopt() - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version Misc: - bpf: add kconfig knob for disabling unpriv bpf by default" * tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits) net: phy: Document phydev::dev_flags bits allocation mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer mptcp: avoid error message on infinite mapping mptcp: drop unconditional pr_warn on bad opt mptcp: avoid OOB access in setsockopt() nfp: update maintainer and mailing list addresses net: mvpp2: add buffer header handling in RX bnx2x: Fix missing error code in bnx2x_iov_init_one() net: zero-initialize tc skb extension on allocation net: hns: Fix kernel-doc sctp: fix the proc_handler for sysctl encap_port sctp: add the missing setting for asoc encap_port bpf, selftests: Adjust few selftest result_unpriv outcomes bpf: No need to simulate speculative domain for immediates bpf: Fix mask direction swap upon off reg sign change bpf: Wrap aux data inside bpf_sanitize_info container bpf: Fix BPF_LSM kconfig symbol dependency selftests/bpf: Add test for l3 use of bpf_redirect_peer bpftool: Add sock_release help info for cgroup attach/prog load command net: dsa: microchip: enable phy errata workaround on 9567 ...
2021-05-26libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.hFlorent Revest16-68/+74
These macros are convenient wrappers around the bpf_seq_printf and bpf_snprintf helpers. They are currently provided by bpf_tracing.h which targets low level tracing primitives. bpf_helpers.h is a better fit. The __bpf_narg and __bpf_apply are needed in both files and provided twice. __bpf_empty isn't used anywhere and is removed from bpf_tracing.h Reported-by: Andrii Nakryiko <[email protected]> Signed-off-by: Florent Revest <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-26perf jevents: Fix getting maximum number of fdsFelix Fietkau1-1/+1
On some hosts, rlim.rlim_max can be returned as RLIM_INFINITY. By casting it to int, it is interpreted as -1, which will cause get_maxfds to return 0, causing "Invalid argument" errors in nftw() calls. Fix this by casting the second argument of min() to rlim_t instead. Fixes: 80eeb67fe577 ("perf jevents: Program to convert JSON file") Signed-off-by: Felix Fietkau <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-26perf inject: Do not inject BUILD_ID record if MMAP2 has itNamhyung Kim1-0/+24
When PERF_RECORD_MISC_MMAP_BUILD_ID is set, the event has a build-id of the DSO already so no need to add it again. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-26perf inject: Call dso__put() even if dso->hit is setNamhyung Kim1-2/+2
Otherwise it'll leak the refcount for the DSO. As dso__put() can handle a NULL dso pointer, we can just call it unconditionally. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-26selftests/bpf: Add xdp_redirect_multi testHangbin Liu4-1/+526
Add a bpf selftest for new helper xdp_redirect_map_multi(). In this test there are 3 forward groups and 1 exclude group. The test will redirect each interface's packets to all the interfaces in the forward group, and exclude the interface in exclude map. Two maps (DEVMAP, DEVMAP_HASH) and two xdp modes (generic, drive) will be tested. XDP egress program will also be tested by setting pkt src MAC to egress interface's MAC address. For more test details, you can find it in the test script. Here is the test result. ]# time ./test_xdp_redirect_multi.sh Pass: xdpgeneric arp(F_BROADCAST) ns1-1 Pass: xdpgeneric arp(F_BROADCAST) ns1-2 Pass: xdpgeneric arp(F_BROADCAST) ns1-3 Pass: xdpgeneric IPv4 (F_BROADCAST|F_EXCLUDE_INGRESS) ns1-1 Pass: xdpgeneric IPv4 (F_BROADCAST|F_EXCLUDE_INGRESS) ns1-2 Pass: xdpgeneric IPv4 (F_BROADCAST|F_EXCLUDE_INGRESS) ns1-3 Pass: xdpgeneric IPv6 (no flags) ns1-1 Pass: xdpgeneric IPv6 (no flags) ns1-2 Pass: xdpdrv arp(F_BROADCAST) ns1-1 Pass: xdpdrv arp(F_BROADCAST) ns1-2 Pass: xdpdrv arp(F_BROADCAST) ns1-3 Pass: xdpdrv IPv4 (F_BROADCAST|F_EXCLUDE_INGRESS) ns1-1 Pass: xdpdrv IPv4 (F_BROADCAST|F_EXCLUDE_INGRESS) ns1-2 Pass: xdpdrv IPv4 (F_BROADCAST|F_EXCLUDE_INGRESS) ns1-3 Pass: xdpdrv IPv6 (no flags) ns1-1 Pass: xdpdrv IPv6 (no flags) ns1-2 Pass: xdpegress mac ns1-2 Pass: xdpegress mac ns1-3 Summary: PASS 18, FAIL 0 real 1m18.321s user 0m0.123s sys 0m0.350s Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-26xdp: Extend xdp_redirect_map with broadcast supportHangbin Liu1-2/+12
This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to extend xdp_redirect_map for broadcast support. With BPF_F_BROADCAST the packet will be broadcasted to all the interfaces in the map. with BPF_F_EXCLUDE_INGRESS the ingress interface will be excluded when do broadcasting. When getting the devices in dev hash map via dev_map_hash_get_next_key(), there is a possibility that we fall back to the first key when a device was removed. This will duplicate packets on some interfaces. So just walk the whole buckets to avoid this issue. For dev array map, we also walk the whole map to find valid interfaces. Function bpf_clear_redirect_map() was removed in commit ee75aef23afe ("bpf, xdp: Restructure redirect actions"). Add it back as we need to use ri->map again. With test topology: +-------------------+ +-------------------+ | Host A (i40e 10G) | ---------- | eno1(i40e 10G) | +-------------------+ | | | Host B | +-------------------+ | | | Host C (i40e 10G) | ---------- | eno2(i40e 10G) | +-------------------+ | | | +------+ | | veth0 -- | Peer | | | veth1 -- | | | | veth2 -- | NS | | | +------+ | +-------------------+ On Host A: # pktgen/pktgen_sample03_burst_single_flow.sh -i eno1 -d $dst_ip -m $dst_mac -s 64 On Host B(Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz, 128G Memory): Use xdp_redirect_map and xdp_redirect_map_multi in samples/bpf for testing. All the veth peers in the NS have a XDP_DROP program loaded. The forward_map max_entries in xdp_redirect_map_multi is modify to 4. Testing the performance impact on the regular xdp_redirect path with and without patch (to check impact of additional check for broadcast mode): 5.12 rc4 | redirect_map i40e->i40e | 2.0M | 9.7M 5.12 rc4 | redirect_map i40e->veth | 1.7M | 11.8M 5.12 rc4 + patch | redirect_map i40e->i40e | 2.0M | 9.6M 5.12 rc4 + patch | redirect_map i40e->veth | 1.7M | 11.7M Testing the performance when cloning packets with the redirect_map_multi test, using a redirect map size of 4, filled with 1-3 devices: 5.12 rc4 + patch | redirect_map multi i40e->veth (x1) | 1.7M | 11.4M 5.12 rc4 + patch | redirect_map multi i40e->veth (x2) | 1.1M | 4.3M 5.12 rc4 + patch | redirect_map multi i40e->veth (x3) | 0.8M | 2.6M Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25bpftool: Set errno on skeleton failures and propagate errorsAndrii Nakryiko1-9/+18
Follow libbpf's error handling conventions and pass through errors and errno properly. Skeleton code always returned NULL on errors (not ERR_PTR(err)), so there are no backwards compatibility concerns. But now we also set errno properly, so it's possible to distinguish different reasons for failure, if necessary. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25libbpf: Streamline error reporting for high-level APIsAndrii Nakryiko9-468/+531
Implement changes to error reporting for high-level libbpf APIs to make them less surprising and less error-prone to users: - in all the cases when error happens, errno is set to an appropriate error value; - in libbpf 1.0 mode, all pointer-returning APIs return NULL on error and error code is communicated through errno; this applies both to APIs that already returned NULL before (so now they communicate more detailed error codes), as well as for many APIs that used ERR_PTR() macro and encoded error numbers as fake pointers. - in legacy (default) mode, those APIs that were returning ERR_PTR(err), continue doing so, but still set errno. With these changes, errno can be always used to extract actual error, regardless of legacy or libbpf 1.0 modes. This is utilized internally in libbpf in places where libbpf uses it's own high-level APIs. libbpf_get_error() is adapted to handle both cases completely transparently to end-users (and is used by libbpf consistently as well). More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]). [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25libbpf: Streamline error reporting for low-level APIsAndrii Nakryiko3-50/+156
Ensure that low-level APIs behave uniformly across the libbpf as follows: - in case of an error, errno is always set to the correct error code; - when libbpf 1.0 mode is enabled with LIBBPF_STRICT_DIRECT_ERRS option to libbpf_set_strict_mode(), return -Exxx error value directly, instead of -1; - by default, until libbpf 1.0 is released, keep returning -1 directly. More context, justification, and discussion can be found in "Libbpf: the road to v1.0" document ([0]). [0] https://docs.google.com/document/d/1UyjTZuPFWiPFyKk1tV5an11_iaRuec6U-ZESZ54nNTY Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25selftests/bpf: Turn on libbpf 1.0 mode and fix all IS_ERR checksAndrii Nakryiko56-425/+347
Turn ony libbpf 1.0 mode. Fix all the explicit IS_ERR checks that now will be broken because libbpf returns NULL on error (and sets errno). Fix ASSERT_OK_PTR and ASSERT_ERR_PTR to work for both old mode and new modes and use them throughout selftests. This is trivial to do by using libbpf_get_error() API that all libbpf users are supposed to use, instead of IS_ERR checks. A bunch of checks also did explicit -1 comparison for various fd-returning APIs. Such checks are replaced with >= 0 or < 0 cases. There were also few misuses of bpf_object__find_map_by_name() in test_maps. Those are fixed in this patch as well. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25libbpf: Add libbpf_set_strict_mode() API to turn on libbpf 1.0 behaviorsAndrii Nakryiko5-0/+71
Add libbpf_set_strict_mode() API that allows application to simulate libbpf 1.0 breaking changes before libbpf 1.0 is released. This will help users migrate gradually and with confidence. For now only ALL or NONE options are available, subsequent patches will add more flags. This patch is preliminary for selftests/bpf changes. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller11-200/+467
Daniel Borkmann says: ==================== pull-request: bpf 2021-05-26 The following pull-request contains BPF updates for your *net* tree. We've added 14 non-merge commits during the last 14 day(s) which contain a total of 17 files changed, 513 insertions(+), 231 deletions(-). The main changes are: 1) Fix bpf_skb_change_head() helper to reset mac_len, from Jussi Maki. 2) Fix masking direction swap upon off-reg sign change, from Daniel Borkmann. 3) Fix BPF offloads in verifier by reordering driver callback, from Yinjun Zhang. 4) BPF selftest for ringbuf mmap ro/rw restrictions, from Andrii Nakryiko. 5) Follow-up fixes to nested bprintf per-cpu buffers, from Florent Revest. 6) Fix bpftool sock_release attach point help info, from Liu Jian. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-05-25bpf, selftests: Adjust few selftest result_unpriv outcomesDaniel Borkmann2-10/+0
Given we don't need to simulate the speculative domain for registers with immediates anymore since the verifier uses direct imm-based rewrites instead of having to mask, we can also lift a few cases that were previously rejected. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]>
2021-05-25kselftest/arm64: Add missing newline to SVE test skipping outputMark Brown1-1/+1
The newline is expected to come from the caller but got missed for this test. Signed-off-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2021-05-25selftests/bpf: Add test for l3 use of bpf_redirect_peerJussi Maki2-178/+405
Add a test case for using bpf_skb_change_head() in combination with bpf_redirect_peer() to redirect a packet from a L3 device to veth and back. The test uses a BPF program that adds L2 headers to the packet coming from a L3 device and then calls bpf_redirect_peer() to redirect the packet to a veth device. The test fails as skb->mac_len is not set properly and thus the ethernet headers are not properly skb_pull'd in cls_bpf_classify(), causing tcp_v4_rcv() to point the TCP header into middle of the IP header. Signed-off-by: Jussi Maki <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25bpftool: Add sock_release help info for cgroup attach/prog load commandLiu Jian5-7/+10
The help information was not added at the time when the function got added. Fix this and add the missing information to its cli, documentation and bash completion. Fixes: db94cc0b4805 ("bpftool: Add support for BPF_CGROUP_INET_SOCK_RELEASE") Signed-off-by: Liu Jian <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Quentin Monnet <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-05-25perf scripts python: intel-pt-events.py: Add branches to scriptAdrian Hunter3-35/+116
As an example, add branch information to intel-pt-events.py script. This shows how a simple python script can be used to customize perf script output for Intel PT branch traces or power event traces. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Add auxtrace errorAdrian Hunter3-0/+57
Add auxtrace_error to general python scripting. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Add context switchAdrian Hunter1-0/+45
Add context_switch to general python scripting. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Add cpumodeAdrian Hunter1-0/+3
Add cpumode to python scripting. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Add IPCAdrian Hunter1-0/+8
Add IPC to python scripting. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Add sample flagsAdrian Hunter1-0/+26
Add sample flags to python scripting. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf script: Factor out perf_sample__sprintf_flags()Adrian Hunter2-10/+21
Factor out perf_sample__sprintf_flags() so it can be reused. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Add 'addr_location' for 'addr'Adrian Hunter7-23/+41
If sample addr correlates to a symbol, add "addr_dso", "addr_symbol", and "addr_symoff" to python scripting. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Factor out set_sym_in_dict()Adrian Hunter1-8/+17
Factor out set_sym_in_dict() so it can be reused. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf scripting python: Fix tuple_set_u64()Adrian Hunter1-65/+81
tuple_set_u64() produces a signed value instead of an unsigned value. That works for database export but not other cases. Rename to tuple_set_d64() for database export and fix tuple_set_u64(). Fixes: df919b400ad3f ("perf scripting python: Extend interface to export data in a database-friendly way") Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-25perf auxtrace: Make perf_event__process_auxtrace*() callableArnaldo Carvalho de Melo1-3/+20
As we'll use it in the upcoming python interfaces and when built with: make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 +NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1 make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 +NO_SYSCALL_TABLE=1 BUILD: Doing 'make -j24' parallel build <SNIP> CC /tmp/tmp.rGrdpQlTCr/builtin-daemon.o In file included from util/events_stats.h:8, from util/evlist.h:12, from builtin-script.c:18: builtin-script.c: In function ‘process_auxtrace_error’: util/auxtrace.h:708:57: error: called object is not a function or function pointer 708 | #define perf_event__process_auxtrace_error 0 | ^ builtin-script.c:2443:16: note: in expansion of macro ‘perf_event__process_auxtrace_error’ 2443 | return perf_event__process_auxtrace_error(session, event); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ MKDIR /tmp/tmp.rGrdpQlTCr/tests/ MKDIR /tmp/tmp.rGrdpQlTCr/bench/ CC /tmp/tmp.rGrdpQlTCr/tests/builtin-test.o CC /tmp/tmp.rGrdpQlTCr/bench/sched-messaging.o builtin-script.c:2444:1: error: control reaches end of non-void function [-Werror=return-type] 2444 | } | ^ To: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf script: Find script file relative to exec pathAdrian Hunter5-2/+46
Allow perf script to find a script in the exec path. Example: Before: $ perf record -a -e intel_pt/branch=0/ sleep 0.1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.954 MB perf.data ] $ perf script intel-pt-events.py 2>&1 | head -3 Error: Couldn't find script `intel-pt-events.py' See perf script -l for available scripts. $ perf script -s intel-pt-events.py 2>&1 | head -3 Can't open python script "intel-pt-events.py": No such file or directory $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3 Error: Couldn't find script `/home/ahunter/libexec/perf-core/scripts/python/intel-pt-events.py' See perf script -l for available scripts. $ After: $ perf script intel-pt-events.py 2>&1 | head -3 Intel PT Power Events and PTWRITE perf 8123/8123 [000] 551.230753986 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) perf 8123/8123 [001] 551.230808216 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) $ perf script -s intel-pt-events.py 2>&1 | head -3 Intel PT Power Events and PTWRITE perf 8123/8123 [000] 551.230753986 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) perf 8123/8123 [001] 551.230808216 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) $ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3 Intel PT Power Events and PTWRITE perf 8123/8123 [000] 551.230753986 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) perf 8123/8123 [001] 551.230808216 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown]) $ Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf arm-spe: Remove redundant checking for "full_auxtrace"Leo Yan1-1/+1
The option "opts->full_auxtrace" is checked at the earlier place, if it is false the function will directly bail out. So remove the redundant checking for "opts->full_auxtrace". Suggested-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Al Grant <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf arm-spe: Enable timestamp for per-cpu modeLeo Yan1-2/+31
For per-cpu mmap, it should enable timestamp tracing for Arm SPE; this is helpful for samples correlation. To automatically enable the timestamp, a helper arm_spe_set_timestamp() is introduced for setting "ts_enable" format bit. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Tested-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Al Grant <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf arm-spe: Correct sample flags for dummy eventLeo Yan1-3/+4
The dummy event is mainly used for mmap, the TIME sample is only needed for per-cpu case so that the perf tool can rely on the correct timing for parsing symbols. And the CPU sample is useless for mmap. The BRANCH_STACK sample bit will be always reset for the dummy event in the function evsel__config(), so don't need to repeatedly reset it for Arm SPE specific. So this patch only enables TIME sample for per-cpu mmap. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Tested-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Al Grant <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf arm-spe: Correct sample flags for SPE eventLeo Yan1-3/+4
Now it's hard code to set sample flags for CPU, TIME and TID for SPE event, which is pointless. The CPU is useful for sampling only for per-mmap case, it is used to indicate the AUX trace is associated to which CPU. The TIME sample is not needed for AUX event, since the time for AUX event is not really used and this time is a different thing from the timestamp in Arm SPE trace, the timestamp tracing which is controlled by Arm SPE's config bit. The TID sample is not useful for AUX event. This patch corrects the sample flags for SPE event, it only set CPU sample bit for per-cpu mmap case. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Tested-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Al Grant <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf vendor events intel: Update event list for Icelake ClientJin Yao8-1575/+3296
- Update core and uncore events for Icelake client to perf. - Add ICL metrics. Based on ICL event list v1.10: https://download.01.org/perfmon/ICL/ Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf vendor events intel: Add metrics for Icelake ServerJin Yao1-0/+327
Add JSON metrics for Icelake Server to perf. Based on TMA metrics 4.21 at 01.org.: https://download.01.org/perfmon/ Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf vendor events intel: Add uncore event list for Icelake ServerJin Yao3-0/+2819
Add JSON uncore events for Icelake Server to perf. Based on JSON list v1.04: https://download.01.org/perfmon/ICX/ Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25perf vendor events intel: Add core event list for Icelake ServerJin Yao8-0/+2961
Add JSON core events for Icelake Server to perf. Based on JSON list v1.04: https://download.01.org/perfmon/ICX/ Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-25Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo33-98/+167
To pick up fixes from perf/urgent. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-05-24libbpf: Add support for new llvm bpf relocationsYonghong Song2-1/+8
LLVM patch https://reviews.llvm.org/D102712 narrowed the scope of existing R_BPF_64_64 and R_BPF_64_32 relocations, and added three new relocations, R_BPF_64_ABS64, R_BPF_64_ABS32 and R_BPF_64_NODYLD32. The main motivation is to make relocations linker friendly. This change, unfortunately, breaks libbpf build, and we will see errors like below: libbpf: ELF relo #0 in section #6 has unexpected type 2 in /home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o Error: failed to link '/home/yhs/work/bpf-next/tools/testing/selftests/bpf/bpf_tcp_nogpl.o': Unknown error -22 (-22) The new relocation R_BPF_64_ABS64 is generated and libbpf linker sanity check doesn't understand it. Relocation section '.rel.struct_ops' at offset 0x1410 contains 1 entries: Offset Info Type Symbol's Value Symbol's Name 0000000000000018 0000000700000002 R_BPF_64_ABS64 0000000000000000 nogpltcp_init Look at the selftests/bpf/bpf_tcp_nogpl.c, void BPF_STRUCT_OPS(nogpltcp_init, struct sock *sk) { } SEC(".struct_ops") struct tcp_congestion_ops bpf_nogpltcp = { .init = (void *)nogpltcp_init, .name = "bpf_nogpltcp", }; The new llvm relocation scheme categorizes 'nogpltcp_init' reference as R_BPF_64_ABS64 instead of R_BPF_64_64 which is used to specify ld_imm64 relocation in the new scheme. Let us fix the linker sanity checking by including R_BPF_64_ABS64 and R_BPF_64_ABS32. There is no need to check R_BPF_64_NODYLD32 which is used for .BTF and .BTF.ext. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]