aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2022-03-25Merge tag 's390-5.18-1' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Raise minimum supported machine generation to z10, which comes with various cleanups and code simplifications (usercopy/spectre mitigation/etc). - Rework extables and get rid of anonymous out-of-line fixups. - Page table helpers cleanup. Add set_pXd()/set_pte() helper functions. Covert pte_val()/pXd_val() macros to functions. - Optimize kretprobe handling by avoiding extra kprobe on __kretprobe_trampoline. - Add support for CEX8 crypto cards. - Allow to trigger AP bus rescan via writing to /sys/bus/ap/scans. - Add CONFIG_EXPOLINE_EXTERN option to build the kernel without COMDAT group sections which simplifies kpatch support. - Always use the packed stack layout and extend kernel unwinder tests. - Add sanity checks for ftrace code patching. - Add s390dbf debug log for the vfio_ap device driver. - Various virtual vs physical address confusion fixes. - Various small fixes and improvements all over the code. * tag 's390-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (69 commits) s390/test_unwind: add kretprobe tests s390/kprobes: Avoid additional kprobe in kretprobe handling s390: convert ".insn" encoding to instruction names s390: assume stckf is always present s390/nospec: move to single register thunks s390: raise minimum supported machine generation to z10 s390/uaccess: Add copy_from/to_user_key functions s390/nospec: align and size extern thunks s390/nospec: add an option to use thunk-extern s390/nospec: generate single register thunks if possible s390/pci: make zpci_set_irq()/zpci_clear_irq() static s390: remove unused expoline to BC instructions s390/irq: use assignment instead of cast s390/traps: get rid of magic cast for per code s390/traps: get rid of magic cast for program interruption code s390/signal: fix typo in comments s390/asm-offsets: remove unused defines s390/test_unwind: avoid build warning with W=1 s390: remove .fixup section s390/bpf: encode register within extable entry ...
2022-03-24Merge tag 'net-next-5.18' of ↵Linus Torvalds3-56/+48
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "The sprinkling of SPI drivers is because we added a new one and Mark sent us a SPI driver interface conversion pull request. Core ---- - Introduce XDP multi-buffer support, allowing the use of XDP with jumbo frame MTUs and combination with Rx coalescing offloads (LRO). - Speed up netns dismantling (5x) and lower the memory cost a little. Remove unnecessary per-netns sockets. Scope some lists to a netns. Cut down RCU syncing. Use batch methods. Allow netdev registration to complete out of order. - Support distinguishing timestamp types (ingress vs egress) and maintaining them across packet scrubbing points (e.g. redirect). - Continue the work of annotating packet drop reasons throughout the stack. - Switch netdev error counters from an atomic to dynamically allocated per-CPU counters. - Rework a few preempt_disable(), local_irq_save() and busy waiting sections problematic on PREEMPT_RT. - Extend the ref_tracker to allow catching use-after-free bugs. BPF --- - Introduce "packing allocator" for BPF JIT images. JITed code is marked read only, and used to be allocated at page granularity. Custom allocator allows for more efficient memory use, lower iTLB pressure and prevents identity mapping huge pages from getting split. - Make use of BTF type annotations (e.g. __user, __percpu) to enforce the correct probe read access method, add appropriate helpers. - Convert the BPF preload to use light skeleton and drop the user-mode-driver dependency. - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling its use as a packet generator. - Allow local storage memory to be allocated with GFP_KERNEL if called from a hook allowed to sleep. - Introduce fprobe (multi kprobe) to speed up mass attachment (arch bits to come later). - Add unstable conntrack lookup helpers for BPF by using the BPF kfunc infra. - Allow cgroup BPF progs to return custom errors to user space. - Add support for AF_UNIX iterator batching. - Allow iterator programs to use sleepable helpers. - Support JIT of add, and, or, xor and xchg atomic ops on arm64. - Add BTFGen support to bpftool which allows to use CO-RE in kernels without BTF info. - Large number of libbpf API improvements, cleanups and deprecations. Protocols --------- - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev. - Adjust TSO packet sizes based on min_rtt, allowing very low latency links (data centers) to always send full-sized TSO super-frames. - Make IPv6 flow label changes (AKA hash rethink) more configurable, via sysctl and setsockopt. Distinguish between server and client behavior. - VxLAN support to "collect metadata" devices to terminate only configured VNIs. This is similar to VLAN filtering in the bridge. - Support inserting IPv6 IOAM information to a fraction of frames. - Add protocol attribute to IP addresses to allow identifying where given address comes from (kernel-generated, DHCP etc.) - Support setting socket and IPv6 options via cmsg on ping6 sockets. - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS. Define dscp_t and stop taking ECN bits into account in fib-rules. - Add support for locked bridge ports (for 802.1X). - tun: support NAPI for packets received from batched XDP buffs, doubling the performance in some scenarios. - IPv6 extension header handling in Open vSwitch. - Support IPv6 control message load balancing in bonding, prevent neighbor solicitation and advertisement from using the wrong port. Support NS/NA monitor selection similar to existing ARP monitor. - SMC - improve performance with TCP_CORK and sendfile() - support auto-corking - support TCP_NODELAY - MCTP (Management Component Transport Protocol) - add user space tag control interface - I2C binding driver (as specified by DMTF DSP0237) - Multi-BSSID beacon handling in AP mode for WiFi. - Bluetooth: - handle MSFT Monitor Device Event - add MGMT Adv Monitor Device Found/Lost events - Multi-Path TCP: - add support for the SO_SNDTIMEO socket option - lots of selftest cleanups and improvements - Increase the max PDU size in CAN ISOTP to 64 kB. Driver API ---------- - Add HW counters for SW netdevs, a mechanism for devices which offload packet forwarding to report packet statistics back to software interfaces such as tunnels. - Select the default NIC queue count as a fraction of number of physical CPU cores, instead of hard-coding to 8. - Expose devlink instance locks to drivers. Allow device layer of drivers to use that lock directly instead of creating their own which always runs into ordering issues in devlink callbacks. - Add header/data split indication to guide user space enabling of TCP zero-copy Rx. - Allow configuring completion queue event size. - Refactor page_pool to enable fragmenting after allocation. - Add allocation and page reuse statistics to page_pool. - Improve Multiple Spanning Trees support in the bridge to allow reuse of topologies across VLANs, saving HW resources in switches. - DSA (Distributed Switch Architecture): - replay and offload of host VLAN entries - offload of static and local FDB entries on LAG interfaces - FDB isolation and unicast filtering New hardware / drivers ---------------------- - Ethernet: - LAN937x T1 PHYs - Davicom DM9051 SPI NIC driver - Realtek RTL8367S, RTL8367RB-VB switch and MDIO - Microchip ksz8563 switches - Netronome NFP3800 SmartNICs - Fungible SmartNICs - MediaTek MT8195 switches - WiFi: - mt76: MediaTek mt7916 - mt76: MediaTek mt7921u USB adapters - brcmfmac: Broadcom BCM43454/6 - Mobile: - iosm: Intel M.2 7360 WWAN card Drivers ------- - Convert many drivers to the new phylink API built for split PCS designs but also simplifying other cases. - Intel Ethernet NICs: - add TTY for GNSS module for E810T device - improve AF_XDP performance - GTP-C and GTP-U filter offload - QinQ VLAN support - Mellanox Ethernet NICs (mlx5): - support xdp->data_meta - multi-buffer XDP - offload tc push_eth and pop_eth actions - Netronome Ethernet NICs (nfp): - flow-independent tc action hardware offload (police / meter) - AF_XDP - Other Ethernet NICs: - at803x: fiber and SFP support - xgmac: mdio: preamble suppression and custom MDC frequencies - r8169: enable ASPM L1.2 if system vendor flags it as safe - macb/gem: ZynqMP SGMII - hns3: add TX push mode - dpaa2-eth: software TSO - lan743x: multi-queue, mdio, SGMII, PTP - axienet: NAPI and GRO support - Mellanox Ethernet switches (mlxsw): - source and dest IP address rewrites - RJ45 ports - Marvell Ethernet switches (prestera): - basic routing offload - multi-chain TC ACL offload - NXP embedded Ethernet switches (ocelot & felix): - PTP over UDP with the ocelot-8021q DSA tagging protocol - basic QoS classification on Felix DSA switch using dcbnl - port mirroring for ocelot switches - Microchip high-speed industrial Ethernet (sparx5): - offloading of bridge port flooding flags - PTP Hardware Clock - Other embedded switches: - lan966x: PTP Hardward Clock - qca8k: mdio read/write operations via crafted Ethernet packets - Qualcomm 802.11ax WiFi (ath11k): - add LDPC FEC type and 802.11ax High Efficiency data in radiotap - enable RX PPDU stats in monitor co-exist mode - Intel WiFi (iwlwifi): - UHB TAS enablement via BIOS - band disablement via BIOS - channel switch offload - 32 Rx AMPDU sessions in newer devices - MediaTek WiFi (mt76): - background radar detection - thermal management improvements on mt7915 - SAR support for more mt76 platforms - MBSSID and 6 GHz band on mt7915 - RealTek WiFi: - rtw89: AP mode - rtw89: 160 MHz channels and 6 GHz band - rtw89: hardware scan - Bluetooth: - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS) - Microchip CAN (mcp251xfd): - multiple RX-FIFOs and runtime configurable RX/TX rings - internal PLL, runtime PM handling simplification - improve chip detection and error handling after wakeup" * tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits) llc: fix netdevice reference leaks in llc_ui_bind() drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool ice: don't allow to run ice_send_event_to_aux() in atomic ctx ice: fix 'scheduling while atomic' on aux critical err interrupt net/sched: fix incorrect vlan_push_eth dest field net: bridge: mst: Restrict info size queries to bridge ports net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init() drivers: net: xgene: Fix regression in CRC stripping net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT net: dsa: fix missing host-filtered multicast addresses net/mlx5e: Fix build warning, detected write beyond size of field iwlwifi: mvm: Don't fail if PPAG isn't supported selftests/bpf: Fix kprobe_multi test. Revert "rethook: x86: Add rethook x86 implementation" Revert "arm64: rethook: Add arm64 rethook implementation" Revert "powerpc: Add rethook support" Revert "ARM: rethook: Add rethook arm implementation" netdevice: add missing dm_private kdoc net: bridge: mst: prevent NULL deref in br_mst_info_size() selftests: forwarding: Use same VRF for port and VLAN upper ...
2022-03-23Merge tag 'asm-generic-5.18' of ↵Linus Torvalds3-31/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...
2022-03-22Merge tag 'perf-core-2022-03-21' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf event updates from Ingo Molnar: - Fix address filtering for Intel/PT,ARM/CoreSight - Enable Intel/PEBS format 5 - Allow more fixed-function counters for x86 - Intel/PT: Enable not recording Taken-Not-Taken packets - Add a few branch-types * tag 'perf-core-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Fix the build on !CONFIG_PHYS_ADDR_T_64BIT perf: Add irq and exception return branch types perf/x86/intel/uncore: Make uncore_discovery clean for 64 bit addresses perf/x86/intel/pt: Add a capability and config bit for disabling TNTs perf/x86/intel/pt: Add a capability and config bit for event tracing perf/x86/intel: Increase max number of the fixed counters KVM: x86: use the KVM side max supported fixed counter perf/x86/intel: Enable PEBS format 5 perf/core: Allow kernel address filter when not filtering the kernel perf/x86/intel/pt: Fix address filter config for 32-bit kernel perf/core: Fix address filter parser for multiple filters x86: Share definition of __is_canonical_address() perf/x86/intel/pt: Relax address filter validation
2022-03-21Merge tag 'x86_misc_for_v5.18_rc1' of ↵Linus Torvalds3-0/+3533
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 updates from Borislav Petkov: - Add support for a couple new insn sets to the insn decoder: AVX512-FP16, AMX, other misc insns. - Update VMware-specific MAINTAINERS entries * tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: MAINTAINERS: Mark VMware mailing list entries as email aliases MAINTAINERS: Add Zack as maintainer of vmmouse driver MAINTAINERS: Update maintainers for paravirt ops and VMware hypervisor interface x86/insn: Add AVX512-FP16 instructions to the x86 instruction decoder perf/tests: Add AVX512-FP16 instructions to x86 instruction decoder test x86/insn: Add misc instructions to x86 instruction decoder perf/tests: Add misc instructions to the x86 instruction decoder test x86/insn: Add AMX instructions to the x86 instruction decoder perf/tests: Add AMX instructions to x86 instruction decoder test
2022-03-21Merge tag 'arm64-upstream' of ↵Linus Torvalds1-19/+33
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: - Support for including MTE tags in ELF coredumps - Instruction encoder updates, including fixes to 64-bit immediate generation and support for the LSE atomic instructions - Improvements to kselftests for MTE and fpsimd - Symbol aliasing and linker script cleanups - Reduce instruction cache maintenance performed for user mappings created using contiguous PTEs - Support for the new "asymmetric" MTE mode, where stores are checked asynchronously but loads are checked synchronously - Support for the latest pointer authentication algorithm ("QARMA3") - Support for the DDR PMU present in the Marvell CN10K platform - Support for the CPU PMU present in the Apple M1 platform - Use the RNDR instruction for arch_get_random_{int,long}() - Update our copy of the Arm optimised string routines for str{n}cmp() - Fix signal frame generation for CPUs which have foolishly elected to avoid building in support for the fpsimd instructions - Workaround for Marvell GICv3 erratum #38545 - Clarification to our Documentation (booting reqs. and MTE prctl()) - Miscellanous cleanups and minor fixes * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (90 commits) docs: sysfs-devices-system-cpu: document "asymm" value for mte_tcf_preferred arm64/mte: Remove asymmetric mode from the prctl() interface arm64: Add cavium_erratum_23154_cpus missing sentinel perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driver arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition Documentation: vmcoreinfo: Fix htmldocs warning kasan: fix a missing header include of static_keys.h drivers/perf: Add Apple icestorm/firestorm CPU PMU driver drivers/perf: arm_pmu: Handle 47 bit counters arm64: perf: Consistently make all event numbers as 16-bits arm64: perf: Expose some Armv9 common events under sysfs perf/marvell: cn10k DDR perf event core ownership perf/marvell: cn10k DDR perfmon event overflow handling perf/marvell: CN10k DDR performance monitor support dt-bindings: perf: marvell: cn10k ddr performance monitor arm64: clean up tools Makefile perf/arm-cmn: Update watchpoint format perf/arm-cmn: Hide XP PUB events for CMN-600 arm64: drop unused includes of <linux/personality.h> arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones ...
2022-03-18perf parse-events: Ignore case in topdown.slots checkIan Rogers1-1/+1
An issue with icelakex metrics: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json?h=perf/core&id=65eab2bc7dab326ee892ec5a4c749470b368b51a#n48 That causes the slots not to be first. Fixes: 94dbfd6781a0e87b ("perf parse-events: Architecture specific leader override") Reported-by: Caleb Biggers <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Zhengjun Xing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-18perf evlist: Avoid iteration for empty evlist.Ian Rogers1-11/+17
As seen with 'perf stat --null ..' and reported in: https://lore.kernel.org/lkml/[email protected]/ v2. Avoids setting evsel in the empty list case as suggested by Jiri Olsa. Committer testing: Before: $ perf stat --null sleep 1 Segmentation fault (core dumped) $ After: $ perf stat --null sleep 1 Performance counter stats for 'sleep 1': 1.010340646 seconds time elapsed 0.001420000 seconds user 0.000000000 seconds sys $ Fixes: 472832d2c000b961 ("perf evlist: Refactor evlist__for_each_cpu()") Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-18perf symbols: Fix symbol size calculation conditionMichael Petlan1-1/+1
Before this patch, the symbol end address fixup to be called, needed two conditions being met: if (prev->end == prev->start && prev->end != curr->start) Where "prev->end == prev->start" means that prev is zero-long (and thus needs a fixup) and "prev->end != curr->start" means that fixup hasn't been applied yet However, this logic is incorrect in the following situation: *curr = {rb_node = {__rb_parent_color = 278218928, rb_right = 0x0, rb_left = 0x0}, start = 0xc000000000062354, end = 0xc000000000062354, namelen = 40, type = 2 '\002', binding = 0 '\000', idle = 0 '\000', ignore = 0 '\000', inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false, name = 0x1159739e "kprobe_optinsn_page\t[__builtin__kprobes]"} *prev = {rb_node = {__rb_parent_color = 278219041, rb_right = 0x109548b0, rb_left = 0x109547c0}, start = 0xc000000000062354, end = 0xc000000000062354, namelen = 12, type = 2 '\002', binding = 1 '\001', idle = 0 '\000', ignore = 0 '\000', inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false, name = 0x1095486e "optinsn_slot"} In this case, prev->start == prev->end == curr->start == curr->end, thus the condition above thinks that "we need a fixup due to zero length of prev symbol, but it has been probably done, since the prev->end == curr->start", which is wrong. After the patch, the execution path proceeds to arch__symbols__fixup_end function which fixes up the size of prev symbol by adding page_size to its end offset. Fixes: 3b01a413c196c910 ("perf symbols: Improve kallsyms symbol end addr calculation") Signed-off-by: Michael Petlan <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-4/+6
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-12perf parse: Fix event parser error for hybrid systemsZhengjun Xing1-2/+4
This bug happened on hybrid systems when both cpu_core and cpu_atom have the same event name such as "UOPS_RETIRED.MS" while their event terms are different, then during perf stat, the event for cpu_atom will parse fail and then no output for cpu_atom. UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/ UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/ It is because event terms in the "head" of parse_events_multi_pmu_add will be changed to event terms for cpu_core after parsing UOPS_RETIRED.MS for cpu_core, then when parsing the same event for cpu_atom, it still uses the event terms for cpu_core, but event terms for cpu_atom are different with cpu_core, the event parses for cpu_atom will fail. This patch fixes it, the event terms should be parsed from the original event. This patch can work for the hybrid systems that have the same event in more than 2 PMUs. It also can work in non-hybrid systems. Before: # perf stat -v -e UOPS_RETIRED.MS -a sleep 1 Using CPUID GenuineIntel-6-97-1 UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/ Control descriptor is not initialized UOPS_RETIRED.MS: 2737845 16068518485 16068518485 Performance counter stats for 'system wide': 2,737,845 cpu_core/UOPS_RETIRED.MS/ 1.002553850 seconds time elapsed After: # perf stat -v -e UOPS_RETIRED.MS -a sleep 1 Using CPUID GenuineIntel-6-97-1 UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/ UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/ Control descriptor is not initialized UOPS_RETIRED.MS: 1977555 16076950711 16076950711 UOPS_RETIRED.MS: 568684 8038694234 8038694234 Performance counter stats for 'system wide': 1,977,555 cpu_core/UOPS_RETIRED.MS/ 568,684 cpu_atom/UOPS_RETIRED.MS/ 1.004758259 seconds time elapsed Fixes: fb0811535e92c6c1 ("perf parse-events: Allow config on kernel PMU events") Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Zhengjun Xing <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-12perf bench: Fix NULL check against wrong variableWeiguo Li1-1/+1
We did a NULL check after "epollfdp = calloc(...)", but we checked "epollfd" instead of "epollfdp". Signed-off-by: Weiguo Li <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-12perf parse-events: Fix NULL check against wrong variableWeiguo Li1-1/+1
We did a null check after "tmp->symbol = strdup(...)", but we checked "list->symbol" other than "tmp->symbol". Reviewed-by: John Garry <[email protected]> Signed-off-by: Weiguo Li <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-07nds32: Remove the architectureAlan Kao3-31/+0
The nds32 architecture, also known as AndeStar V3, is a custom 32-bit RISC target designed by Andes Technologies. Support was added to the kernel in 2016 as the replacement RISC-V based V5 processors were already announced, and maintained by (current or former) Andes employees. As explained by Alan Kao, new customers are now all using RISC-V, and all known nds32 users are already on longterm stable kernels provided by Andes, with no development work going into mainline support any more. While the port is still in a reasonably good shape, it only gets worse over time without active maintainers, so it seems best to remove it before it becomes unusable. As always, if it turns out that there are mainline users after all, and they volunteer to maintain the port in the future, the removal can be reverted. Link: https://lore.kernel.org/linux-mm/[email protected]/ Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://www.andestech.com/en/products-solutions/andestar-architecture/ Signed-off-by: Alan Kao <[email protected]> [arnd: rewrite changelog to provide more background] Signed-off-by: Arnd Bergmann <[email protected]>
2022-03-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-21/+7
net/batman-adv/hard-interface.c commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check") commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message") https://lore.kernel.org/all/[email protected]/ net/smc/af_smc.c commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails") commit 462791bbfa35 ("net/smc: add sysctl interface for SMC") https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2022-03-01perf: Add irq and exception return branch typesAnshuman Khandual1-1/+3
This expands generic branch type classification by adding two more entries there in i.e irq and exception return. Also updates the x86 implementation to process X86_BR_IRET and X86_BR_IRQ records as appropriate. This changes branch types reported to user space on x86 platform but it should not be a problem. The possible scenarios and impacts are enumerated here. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2022-02-22perf script: Fix error when printing 'weight' fieldGerman Gomez1-1/+1
In SPE traces the 'weight' field can't be printed in 'perf script' because the 'dummy:u' event doesn't have the WEIGHT attribute set. Use evsel__do_check_stype(..) to check this field, as it's done with other fields such as "phys_addr". Before: $ perf record -e arm_spe_0// -- sleep 1 $ perf script -F event,ip,weight Samples for 'dummy:u' event do not have WEIGHT attribute set. Cannot print 'weight' field. After: $ perf script -F event,ip,weight l1d-access: 12 ffffaf629d4cb320 tlb-access: 12 ffffaf629d4cb320 memory: 12 ffffaf629d4cb320 Fixes: b0fde9c6e291e528 ("perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT") Signed-off-by: German Gomez <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-22perf data: Fix double free in perf_session__delete()Alexey Bayduraev1-4/+3
When perf_data__create_dir() fails, it calls close_dir(), but perf_session__delete() also calls close_dir() and since dir.version and dir.nr were initialized by perf_data__create_dir(), a double free occurs. This patch moves the initialization of dir.version and dir.nr after successful initialization of dir.files, that prevents double freeing. This behavior is already implemented in perf_data__open_dir(). Fixes: 145520631130bd64 ("perf data: Add perf_data__(create_dir|close_dir) functions") Signed-off-by: Alexey Bayduraev <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Antonov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-22linkage: remove SYM_FUNC_{START,END}_ALIAS()Mark Rutland1-21/+0
Now that all aliases are defined using SYM_FUNC_ALIAS(), remove the old SYM_FUNC_{START,END}_ALIAS() macros. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Acked-by: Mark Brown <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Peter Zijlstra <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-02-22linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}()Mark Rutland1-0/+35
Currently aliasing an asm function requires adding START and END annotations for each name, as per Documentation/asm-annotations.rst: SYM_FUNC_START_ALIAS(__memset) SYM_FUNC_START(memset) ... asm insns ... SYM_FUNC_END(memset) SYM_FUNC_END_ALIAS(__memset) This is more painful than necessary to maintain, especially where a function has many aliases, some of which we may wish to define conditionally. For example, arm64's memcpy/memmove implementation (which uses some arch-specific SYM_*() helpers) has: SYM_FUNC_START_ALIAS(__memmove) SYM_FUNC_START_ALIAS_WEAK_PI(memmove) SYM_FUNC_START_ALIAS(__memcpy) SYM_FUNC_START_WEAK_PI(memcpy) ... asm insns ... SYM_FUNC_END_PI(memcpy) EXPORT_SYMBOL(memcpy) SYM_FUNC_END_ALIAS(__memcpy) EXPORT_SYMBOL(__memcpy) SYM_FUNC_END_ALIAS_PI(memmove) EXPORT_SYMBOL(memmove) SYM_FUNC_END_ALIAS(__memmove) EXPORT_SYMBOL(__memmove) SYM_FUNC_START(name) It would be much nicer if we could define the aliases *after* the standard function definition. This would avoid the need to specify each symbol name twice, and would make it easier to spot the canonical function definition. This patch adds new macros to allow us to do so, which allows the above example to be rewritten more succinctly as: SYM_FUNC_START(__pi_memcpy) ... asm insns ... SYM_FUNC_END(__pi_memcpy) SYM_FUNC_ALIAS(__memcpy, __pi_memcpy) EXPORT_SYMBOL(__memcpy) SYM_FUNC_ALIAS_WEAK(memcpy, __memcpy) EXPORT_SYMBOL(memcpy) SYM_FUNC_ALIAS(__pi_memmove, __pi_memcpy) SYM_FUNC_ALIAS(__memmove, __pi_memmove) EXPORT_SYMBOL(__memmove) SYM_FUNC_ALIAS_WEAK(memmove, __memmove) EXPORT_SYMBOL(memmove) The reduction in duplication will also make it possible to replace some uses of WEAK with more accurate Kconfig guards, e.g. #ifndef CONFIG_KASAN SYM_FUNC_ALIAS(memmove, __memmove) EXPORT_SYMBOL(memmove) #endif ... which should make it easier to ensure that symbols are neither used nor overidden unexpectedly. The existing SYM_FUNC_START_ALIAS() and SYM_FUNC_START_LOCAL_ALIAS() are marked as deprecated, and will be removed once existing users are moved over to the new scheme. The tools/perf/ copy of linkage.h is updated to match. A subsequent patch will depend upon this when updating the x86 asm annotations. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Acked-by: Mark Brown <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Peter Zijlstra <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-02-18perf evlist: Fix failed to use cpu list for uncore eventsZhengjun Xing1-2/+2
The 'perf record' and 'perf stat' commands have supported the option '-C/--cpus' to count or collect only on the list of CPUs provided. Commit 1d3351e631fc34d7 ("perf tools: Enable on a list of CPUs for hybrid") add it to be supported for hybrid. For hybrid support, it checks the cpu list are available on hybrid PMU. But when we test only uncore events(or events not in cpu_core and cpu_atom), there is a bug: Before: # perf stat -C0 -e uncore_clock/clockticks/ sleep 1 failed to use cpu list 0 In this case, for uncore event, its pmu_name is not cpu_core or cpu_atom, so in evlist__fix_hybrid_cpus, perf_pmu__find_hybrid_pmu should return NULL,both events_nr and unmatched_count should be 0 ,then the cpu list check function evlist__fix_hybrid_cpus return -1 and the error "failed to use cpu list 0" will happen. Bypass "events_nr=0" case then the issue is fixed. After: # perf stat -C0 -e uncore_clock/clockticks/ sleep 1 Performance counter stats for 'CPU(s) 0': 195,476,873 uncore_clock/clockticks/ 1.004518677 seconds time elapsed When testing with at least one core event and uncore events, it has no issue. # perf stat -C0 -e cpu_core/cpu-cycles/,uncore_clock/clockticks/ sleep 1 Performance counter stats for 'CPU(s) 0': 5,993,774 cpu_core/cpu-cycles/ 301,025,912 uncore_clock/clockticks/ 1.003964934 seconds time elapsed Fixes: 1d3351e631fc34d7 ("perf tools: Enable on a list of CPUs for hybrid") Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Zhengjun Xing <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: [email protected] Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-18perf test: Skip failing sigtrap test for arm+aarch64John Garry1-14/+1
Skip the Sigtrap test for arm + arm64, same as was done for s390 in commit a840974e96fd ("perf test: Test 73 Sig_trap fails on s390"). For this, reuse BP_SIGNAL_IS_SUPPORTED - meaning that the arch can use BP to generate signals - instead of BP_ACCOUNT_IS_SUPPORTED, which is appropriate. As described by Will at [0], in the test we get stuck in a loop of handling the HW breakpoint exception and never making progress. GDB handles this by stepping over the faulting instruction, but with perf the kernel is expected to handle the step (which it doesn't for arm). Dmitry made an attempt to get this work, also mentioned in the same thread as [0], which was appreciated. But the best thing to do is skip the test for now. [0] https://lore.kernel.org/linux-perf-users/20220118124343.GC98966@leoy-ThinkPad-X240s/T/#m13b06c39d2a5100d340f009435df6f4d8ee57b5a Fixes: 5504f67944484495 ("perf test sigtrap: Add basic stress test for sigtrap handling") Signed-off-by: John Garry <[email protected]> Tested-by: Leo Yan <[email protected]> Acked-by: Marco Elver <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Marco Elver <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski8-15/+51
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-17perf bpf: Defer freeing string after possible strlen() on itArnaldo Carvalho de Melo1-1/+2
This was detected by the gcc in Fedora Rawhide's gcc: 50 11.01 fedora:rawhide : FAIL gcc version 12.0.1 20220205 (Red Hat 12.0.1-0) (GCC) inlined from 'bpf__config_obj' at util/bpf-loader.c:1242:9: util/bpf-loader.c:1225:34: error: pointer 'map_opt' may be used after 'free' [-Werror=use-after-free] 1225 | *key_scan_pos += strlen(map_opt); | ^~~~~~~~~~~~~~~ util/bpf-loader.c:1223:9: note: call to 'free' here 1223 | free(map_name); | ^~~~~~~~~~~~~~ cc1: all warnings being treated as errors So do the calculations on the pointer before freeing it. Fixes: 04f9bf2bac72480c ("perf bpf-loader: Add missing '*' for key_scan_pos") Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang ShaoBo <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-16perf test: Fix arm64 perf_event_attr tests wrt --call-graph initializationGerman Gomez5-0/+24
The struct perf_event_attr is initialised differently in Arm64 when recording in call-graph fp mode, so update the relevant tests, and add two extra arm64-only tests. Before: $ perf test 17 -v 17: Setup struct perf_event_attr [...] running './tests/attr/test-record-graph-default' expected sample_type=295, got 4391 expected sample_regs_user=0, got 1073741824 FAILED './tests/attr/test-record-graph-default' - match failure test child finished with -1 ---- end ---- After: [...] running './tests/attr/test-record-graph-default-aarch64' test limitation 'aarch64' running './tests/attr/test-record-graph-fp-aarch64' test limitation 'aarch64' running './tests/attr/test-record-graph-default' test limitation '!aarch64' excluded architecture list ['aarch64'] skipped [aarch64] './tests/attr/test-record-graph-default' running './tests/attr/test-record-graph-fp' test limitation '!aarch64' excluded architecture list ['aarch64'] skipped [aarch64] './tests/attr/test-record-graph-fp' [...] Fixes: 7248e308a5758761 ("perf tools: Record ARM64 LR register automatically") Signed-off-by: German Gomez <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Truong <[email protected]> Cc: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Song Liu <[email protected]> Cc: Yonghong Song <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-16perf cs-etm: Fix corrupt inject files when only last branch option is enabledJames Clark1-0/+2
'perf inject' with Coresight data generates files that cannot be opened when only the last branch option is specified: perf inject -i perf.data --itrace=l -o inject.data perf script -i inject.data 0x33faa8 [0x8]: failed to process type: 9 [Bad address] This is because cs_etm__synth_instruction_sample() is called even when the sample type for instructions hasn't been setup. Last branch records are attached to instruction samples so it doesn't make sense to generate them when --itrace=i isn't specified anyway. This change disables all calls of cs_etm__synth_instruction_sample() unless --itrace=i is specified, resulting in a file with no samples if only --itrace=l is provided, rather than a bad file. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-16perf cs-etm: No-op refactor of synth opt usageJames Clark1-9/+5
sample_branches and sample_instructions are already saved in the synth_opts struct. Other usages like synth_opts.last_branch don't save a value, so make this more consistent by always going through synth_opts and not saving duplicate values. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-16perf trace: Avoid early exit due SIGCHLD from non-workload processesChangbin Du1-5/+18
The function trace__symbols_init() runs "perf-read-vdso32" and that ends up with a SIGCHLD delivered to 'perf'. And this SIGCHLD make perf exit early. 'perf trace' should exit only if the SIGCHLD is from our workload process. So let's use sigaction() instead of signal() to match such condition. Committer notes: Use memset to zero the 'struct sigaction' variable as the '= { 0 }' method isn't accepted in many compiler versions, e.g.: 4 34.02 alpine:3.6 : FAIL clang version 4.0.0 (tags/RELEASE_400/final) builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct sigaction sigchld_act = { 0 }; ^ {} builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers] struct sigaction sigchld_act = { 0 }; ^ 2 errors generated. 6 32.60 alpine:3.8 : FAIL gcc version 6.4.0 (Alpine 6.4.0) builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct sigaction sigchld_act = { 0 }; ^ {} builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers] struct sigaction sigchld_act = { 0 }; ^ 2 errors generated. 7 34.82 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0) builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct sigaction sigchld_act = { 0 }; ^ {} builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers] struct sigaction sigchld_act = { 0 }; ^ 2 errors generated. Signed-off-by: Changbin Du <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski11-36/+64
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-09Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski2-5/+7
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-02-09 We've added 126 non-merge commits during the last 16 day(s) which contain a total of 201 files changed, 4049 insertions(+), 2215 deletions(-). The main changes are: 1) Add custom BPF allocator for JITs that pack multiple programs into a huge page to reduce iTLB pressure, from Song Liu. 2) Add __user tagging support in vmlinux BTF and utilize it from BPF verifier when generating loads, from Yonghong Song. 3) Add per-socket fast path check guarding from cgroup/BPF overhead when used by only some sockets, from Pavel Begunkov. 4) Continued libbpf deprecation work of APIs/features and removal of their usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko and various others. 5) Improve BPF instruction set documentation by adding byte swap instructions and cleaning up load/store section, from Christoph Hellwig. 6) Switch BPF preload infra to light skeleton and remove libbpf dependency from it, from Alexei Starovoitov. 7) Fix architecture-agnostic macros in libbpf for accessing syscall arguments from BPF progs for non-x86 architectures, from Ilya Leoshkevich. 8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be of 16-bit field with anonymous zero padding, from Jakub Sitnicki. 9) Add new bpf_copy_from_user_task() helper to read memory from a different task than current. Add ability to create sleepable BPF iterator progs, from Kenny Yu. 10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski. 11) Generate temporary netns names for BPF selftests to avoid naming collisions, from Hangbin Liu. 12) Implement bpf_core_types_are_compat() with limited recursion for in-kernel usage, from Matteo Croce. 13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5 to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor. 14) Misc minor fixes to libbpf and selftests from various folks. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits) selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide libbpf: Fix compilation warning due to mismatched printf format selftests/bpf: Test BPF_KPROBE_SYSCALL macro libbpf: Add BPF_KPROBE_SYSCALL macro libbpf: Fix accessing the first syscall argument on s390 libbpf: Fix accessing the first syscall argument on arm64 libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390 libbpf: Fix accessing syscall arguments on riscv libbpf: Fix riscv register names libbpf: Fix accessing syscall arguments on powerpc selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro libbpf: Add PT_REGS_SYSCALL_REGS macro selftests/bpf: Fix an endianness issue in bpf_syscall_macro test bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE bpf: Fix leftover header->pages in sparc and powerpc code. libbpf: Fix signedness bug in btf_dump_array_data() selftests/bpf: Do not export subtest as standalone test bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-02-06s390: remove invalid email address of Heiko CarstensHeiko Carstens1-2/+1
Remove my old invalid email address which can be found in a couple of files. Instead of updating it, just remove my contact data completely from source files. We have git and other tools which allow to figure out who is responsible for what with recent contact data. Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2022-02-06perf ftrace: system_wide collection is not effective by defaultChangbin Du1-21/+24
The ftrace.target.system_wide must be set before invoking evlist__create_maps(), otherwise it has no effect. Fixes: 53be50282269b46c ("perf ftrace: Add 'latency' subcommand") Signed-off-by: Changbin Du <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf stat: Fix display of grouped aliased eventsIan Rogers1-9/+10
An event may have a number of uncore aliases that when added to the evlist are consecutive. If there are multiple uncore events in a group then parse_events__set_leader_for_uncore_aliase will reorder the evlist so that events on the same PMU are adjacent. The collect_all_aliases function assumes that aliases are in blocks so that only the first counter is printed and all others are marked merged. The reordering for groups breaks the assumption and so all counts are printed. This change removes the assumption from collect_all_aliases that the events are in blocks and instead processes the entire evlist. Before: ``` $ perf stat -e '{UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE,UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE},duration_time' -a -A -- sleep 1 Performance counter stats for 'system wide': CPU0 256,866 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 494,413 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 967 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,738 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 285,161 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 429,920 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 955 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,443 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 310,753 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 416,657 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,231 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,573 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 416,067 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 405,966 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,481 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,447 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 312,911 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 408,154 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,086 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,380 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 333,994 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 370,349 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,287 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,335 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 188,107 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 302,423 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 701 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,070 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 307,221 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 383,642 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,036 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,158 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 318,479 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 821,545 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,028 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 2,550 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 227,618 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 372,272 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 903 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,456 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 376,783 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 419,827 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,406 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,453 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 286,583 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 429,956 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 999 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,436 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 313,867 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 370,159 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,114 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,291 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 342,083 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 409,111 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,399 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,684 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 365,828 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 376,037 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,378 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,411 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 382,456 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 621,743 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,232 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,955 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 342,316 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 385,067 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,176 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,268 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 373,588 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 386,163 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,394 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,464 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 381,206 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 546,891 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,266 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,712 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 221,176 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 392,069 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 831 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,456 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 355,401 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 705,595 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,235 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 2,216 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 371,436 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 428,103 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,306 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,442 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 384,352 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 504,200 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,468 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,860 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 228,856 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 287,976 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 832 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,060 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 215,121 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 334,162 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 681 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,026 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 296,179 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 436,083 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,084 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,525 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 262,296 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 416,573 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 986 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,533 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 285,852 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 359,842 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,073 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,326 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 303,379 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 367,222 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,008 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,156 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 273,487 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 425,449 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 932 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,367 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 297,596 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 414,793 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,140 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,601 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 342,365 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 360,422 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,291 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,342 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 327,196 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 580,858 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,122 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 2,014 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 296,564 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 452,817 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,087 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,694 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 375,002 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 389,393 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,478 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 1,540 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 365,213 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 594,685 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 1,401 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 2,222 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 1,000,749,060 ns duration_time 1.000749060 seconds time elapsed ``` After: ``` Performance counter stats for 'system wide': CPU0 20,547,434 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU36 45,202,862 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE CPU0 82,001 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU36 159,688 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE CPU0 1,000,464,828 ns duration_time 1.000464828 seconds time elapsed ``` Fixes: 3cdc5c2cb924acb4 ("perf parse-events: Handle uncore event aliases in small groups properly") Reviewed-by: Andi Kleen <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Asaf Yaffe <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Vineet Singh <[email protected]> Cc: Zhengjun Xing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf tools: Apply correct label to user/kernel symbols in branch modeGerman Gomez3-2/+5
In branch mode, the branch symbols were being displayed with incorrect cpumode labels. So fix this. For example, before: # perf record -b -a -- sleep 1 # perf report -b Overhead Command Source Shared Object Source Symbol Target Symbol 0.08% swapper [kernel.kallsyms] [k] rcu_idle_enter [k] cpuidle_enter_state ==> 0.08% cmd0 [kernel.kallsyms] [.] psi_group_change [.] psi_group_change 0.08% cmd1 [kernel.kallsyms] [k] psi_group_change [k] psi_group_change After: # perf report -b Overhead Command Source Shared Object Source Symbol Target Symbol 0.08% swapper [kernel.kallsyms] [k] rcu_idle_enter [k] cpuidle_enter_state 0.08% cmd0 [kernel.kallsyms] [k] psi_group_change [k] pei_group_change 0.08% cmd1 [kernel.kallsyms] [k] psi_group_change [k] psi_group_change Reviewed-by: James Clark <[email protected]> Signed-off-by: German Gomez <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf bpf: Fix a typo in bpf_counter_cgroup.cMasanari Iida1-1/+1
This patch fixes a spelling typo in error message. Signed-off-by: Masanari Iida <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf synthetic-events: Return error if procfs isn't mounted for PID namespacesLeo Yan1-0/+19
For perf recording, it retrieves process info by iterating nodes in proc fs. If we run perf in a non-root PID namespace with command: # unshare --fork --pid perf record -e cycles -a -- test_program ... in this case, unshare command creates a child PID namespace and launches perf tool in it, but the issue is the proc fs is not mounted for the non-root PID namespace, this leads to the perf tool gathering process info from its parent PID namespace. We can use below command to observe the process nodes under proc fs: # unshare --pid --fork ls /proc 1 137 1968 2128 3 342 48 62 78 crypto kcore net uptime 10 138 2 2142 30 35 49 63 8 devices keys pagetypeinfo version 11 139 20 2143 304 36 50 64 82 device-tree key-users partitions vmallocinfo 12 14 2011 22 305 37 51 65 83 diskstats kmsg self vmstat 128 140 2038 23 307 39 52 656 84 driver kpagecgroup slabinfo zoneinfo 129 15 2074 24 309 4 53 67 9 execdomains kpagecount softirqs 13 16 2094 241 31 40 54 68 asound fb kpageflags stat 130 164 2096 242 310 41 55 69 buddyinfo filesystems loadavg swaps 131 17 2098 25 317 42 56 70 bus fs locks sys 132 175 21 26 32 43 57 71 cgroups interrupts meminfo sysrq-trigger 133 179 2102 263 329 44 58 75 cmdline iomem misc sysvipc 134 1875 2103 27 330 45 59 76 config.gz ioports modules thread-self 135 19 2117 29 333 46 6 77 consoles irq mounts timer_list 136 1941 2121 298 34 47 60 773 cpuinfo kallsyms mtd tty So it shows many existed tasks, since unshared command has not mounted the proc fs for the new created PID namespace, it still accesses the proc fs of the root PID namespace. This leads to two prominent issues: - Firstly, PID values are mismatched between thread info and samples. The gathered thread info are coming from the proc fs of the root PID namespace, but samples record its PID from the child PID namespace. - The second issue is profiled program 'test_program' returns its forked PID number from the child PID namespace, perf tool wrongly uses this PID number to retrieve the process info via the proc fs of the root PID namespace. To avoid issues, we need to mount proc fs for the child PID namespace with the option '--mount-proc' when use unshare command: # unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program Conversely, when the proc fs of the root PID namespace is used by child namespace, perf tool can detect the multiple PID levels and nsinfo__is_in_root_namespace() returns false, this patch reports error for this case: # unshare --fork --pid perf record -e cycles -a -- test_program Couldn't synthesize bpf events. Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace. Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command. Reviewed-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf session: Check for NULL pointer before dereferenceAmeer Hamza1-1/+2
Move NULL pointer check before dereferencing the variable. Addresses-Coverity: 1497622 ("Derereference before null check") Reviewed-by: James Clark <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf annotate: Set error stream of objdump process for TUINamhyung Kim1-0/+1
The stderr should be set to a pipe when using TUI. Otherwise it'd print to stdout and break TUI windows with an error message. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-06perf tools: Add missing branch_sample_type to perf_event_attr__fprintf()Anshuman Khandual1-1/+1
This updates branch sample type with missing PERF_SAMPLE_BRANCH_TYPE_SAVE. Suggested-by: James Clark <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: James Clark <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-01perf beauty: Make the prctl arg regexp more strict to cope with PR_SET_VMAArnaldo Carvalho de Melo1-1/+1
This new PR_SET_VMA value isn't in sequence with all the other prctl arguments and instead uses a big, 0x prefixed hex number: 0x53564d41 (S V M A). This makes it harder to generate a string table as it would be rather sparse, so make the regexp more stricter to avoid catching those. A followup patch for 'perf trace' to cope with such oddities will be needed, but then its a matter for the next merge window. The next patch will update the prctl.h file to cope with this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h' diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Here is the output of this script: $ tools/perf/trace/beauty/prctl_option.sh static const char *prctl_options[] = { [1] = "SET_PDEATHSIG", [2] = "GET_PDEATHSIG", [3] = "GET_DUMPABLE", [4] = "SET_DUMPABLE", [5] = "GET_UNALIGN", [6] = "SET_UNALIGN", [7] = "GET_KEEPCAPS", [8] = "SET_KEEPCAPS", [9] = "GET_FPEMU", [10] = "SET_FPEMU", [11] = "GET_FPEXC", [12] = "SET_FPEXC", [13] = "GET_TIMING", [14] = "SET_TIMING", [15] = "SET_NAME", [16] = "GET_NAME", [19] = "GET_ENDIAN", [20] = "SET_ENDIAN", [21] = "GET_SECCOMP", [22] = "SET_SECCOMP", [23] = "CAPBSET_READ", [24] = "CAPBSET_DROP", [25] = "GET_TSC", [26] = "SET_TSC", [27] = "GET_SECUREBITS", [28] = "SET_SECUREBITS", [29] = "SET_TIMERSLACK", [30] = "GET_TIMERSLACK", [31] = "TASK_PERF_EVENTS_DISABLE", [32] = "TASK_PERF_EVENTS_ENABLE", [33] = "MCE_KILL", [34] = "MCE_KILL_GET", [35] = "SET_MM", [36] = "SET_CHILD_SUBREAPER", [37] = "GET_CHILD_SUBREAPER", [38] = "SET_NO_NEW_PRIVS", [39] = "GET_NO_NEW_PRIVS", [40] = "GET_TID_ADDRESS", [41] = "SET_THP_DISABLE", [42] = "GET_THP_DISABLE", [43] = "MPX_ENABLE_MANAGEMENT", [44] = "MPX_DISABLE_MANAGEMENT", [45] = "SET_FP_MODE", [46] = "GET_FP_MODE", [47] = "CAP_AMBIENT", [50] = "SVE_SET_VL", [51] = "SVE_GET_VL", [52] = "GET_SPECULATION_CTRL", [53] = "SET_SPECULATION_CTRL", [54] = "PAC_RESET_KEYS", [55] = "SET_TAGGED_ADDR_CTRL", [56] = "GET_TAGGED_ADDR_CTRL", [57] = "SET_IO_FLUSHER", [58] = "GET_IO_FLUSHER", [59] = "SET_SYSCALL_USER_DISPATCH", [60] = "PAC_SET_ENABLED_KEYS", [61] = "PAC_GET_ENABLED_KEYS", [62] = "SCHED_CORE", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", [2] = "END_CODE", [3] = "START_DATA", [4] = "END_DATA", [5] = "START_STACK", [6] = "START_BRK", [7] = "BRK", [8] = "ARG_START", [9] = "ARG_END", [10] = "ENV_START", [11] = "ENV_END", [12] = "AUXV", [13] = "EXE_FILE", [14] = "MAP", [15] = "MAP_SIZE", }; $ Cc: Adrian Hunter <[email protected]> Cc: Colin Cross <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kees Cook <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski48-134/+334
No conflicts. Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-25perf: use generic bpf_program__set_type() to set BPF prog typeAndrii Nakryiko1-2/+2
bpf_program__set_<type>() APIs are deprecated, use generic bpf_program__set_type() APIs instead. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-01-24perf: Stop using bpf_object__open_buffer() APIChristy Lee2-3/+5
bpf_object__open_buffer() API is deprecated, use the unified opts bpf_object__open_mem() API in perf instead. This requires at least libbpf 0.0.6. Signed-off-by: Christy Lee <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-01-24Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski2-51/+41
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-01-24 We've added 80 non-merge commits during the last 14 day(s) which contain a total of 128 files changed, 4990 insertions(+), 895 deletions(-). The main changes are: 1) Add XDP multi-buffer support and implement it for the mvneta driver, from Lorenzo Bianconi, Eelco Chaudron and Toke Høiland-Jørgensen. 2) Add unstable conntrack lookup helpers for BPF by using the BPF kfunc infra, from Kumar Kartikeya Dwivedi. 3) Extend BPF cgroup programs to export custom ret value to userspace via two helpers bpf_get_retval() and bpf_set_retval(), from YiFei Zhu. 4) Add support for AF_UNIX iterator batching, from Kuniyuki Iwashima. 5) Complete missing UAPI BPF helper description and change bpf_doc.py script to enforce consistent & complete helper documentation, from Usama Arif. 6) Deprecate libbpf's legacy BPF map definitions and streamline XDP APIs to follow tc-based APIs, from Andrii Nakryiko. 7) Support BPF_PROG_QUERY for BPF programs attached to sockmap, from Di Zhu. 8) Deprecate libbpf's bpf_map__def() API and replace users with proper getters and setters, from Christy Lee. 9) Extend libbpf's btf__add_btf() with an additional hashmap for strings to reduce overhead, from Kui-Feng Lee. 10) Fix bpftool and libbpf error handling related to libbpf's hashmap__new() utility function, from Mauricio Vásquez. 11) Add support to BTF program names in bpftool's program dump, from Raman Shukhau. 12) Fix resolve_btfids build to pick up host flags, from Connor O'Brien. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (80 commits) selftests, bpf: Do not yet switch to new libbpf XDP APIs selftests, xsk: Fix rx_full stats test bpf: Fix flexible_array.cocci warnings xdp: disable XDP_REDIRECT for xdp frags bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest net: xdp: introduce bpf_xdp_pointer utility routine bpf: generalise tail call map compatibility check libbpf: Add SEC name for xdp frags programs bpf: selftests: update xdp_adjust_tail selftest to include xdp frags bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature bpf: introduce frags support to bpf_prog_test_run_xdp() bpf: move user_size out of bpf_test_init bpf: add frags support to xdp copy helpers bpf: add frags support to the bpf_xdp_adjust_tail() API bpf: introduce bpf_xdp_get_buff_len helper net: mvneta: enable jumbo frames if the loaded XDP program support frags bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program net: mvneta: add frags support to XDP_TX xdp: add frags support to xdp_return_{buff/frame} ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-01-23perf/tests: Add AVX512-FP16 instructions to x86 instruction decoder testAdrian Hunter3-0/+3426
The x86 instruction decoder is used for both kernel instructions and user space instructions (e.g. uprobes, perf tools Intel PT), so it is good to update it with new instructions. Add AVX512-FP16 instructions to x86 instruction decoder test. A subsequent patch adds the instructions to the instruction decoder. Reference: Intel AVX512-FP16 Architecture Specification June 2021 Revision 1.0 Document Number: 347407-001US Example: $ perf test -v "x86 instruction decoder" |& grep vfcmaddcph | head -2 Failed to decode: 62 f6 6f 48 56 cb vfcmaddcph %zmm3,%zmm2,%zmm1 Failed to decode: 62 f6 6f 48 56 8c c8 78 56 34 12 vfcmaddcph 0x12345678(%eax,%ecx,8),%zmm2,%zmm1 Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-01-23perf/tests: Add misc instructions to the x86 instruction decoder testAdrian Hunter3-0/+50
The x86 instruction decoder is used for both kernel instructions and user space instructions (e.g. uprobes, perf tools Intel PT), so it is good to update it with new instructions. Add the following instructions to the x86 instruction decoder test: User Interrupt clui senduipi stui testui uiret Prediction history reset hreset Serialize instruction execution serialize TSX suspend load address tracking xresldtrk xsusldtrk A subsequent patch adds the instructions to the instruction decoder. Reference: Intel Architecture Instruction Set Extensions and Future Features Programming Reference May 2021 Document Number: 319433-044 Example: $ perf test -v "x86 instruction decoder" |& grep -i hreset Failed to decode length (4 vs expected 6): f3 0f 3a f0 c0 00 hreset $0x0 Failed to decode length (4 vs expected 6): f3 0f 3a f0 c0 00 hreset $0x0 Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-01-23perf/tests: Add AMX instructions to x86 instruction decoder testAdrian Hunter2-0/+57
The x86 instruction decoder is used for both kernel instructions and user space instructions (e.g. uprobes, perf tools Intel PT), so it is good to update it with new instructions. Add AMX instructions to the x86 instruction decoder test. A subsequent patch adds the instructions to the instruction decoder. Reference: Intel Architecture Instruction Set Extensions and Future Features Programming Reference May 2021 Document Number: 319433-044 Example: $ INSN='ldtilecfg\|sttilecfg\|tdpbf16ps\|tdpbssd\|' $ INSN+='tdpbsud\|tdpbusd\|'tdpbuud\|tileloadd\|' $ INSN+='tileloaddt1\|tilerelease\|tilestored\|tilezero' $ perf test -v "x86 instruction decoder" |& grep -i $INSN Failed to decode: c4 e2 78 49 04 c8 ldtilecfg (%rax,%rcx,8) Failed to decode: c4 c2 78 49 04 c8 ldtilecfg (%r8,%rcx,8) Failed to decode: c4 e2 79 49 04 c8 sttilecfg (%rax,%rcx,8) Failed to decode: c4 c2 79 49 04 c8 sttilecfg (%r8,%rcx,8) Failed to decode: c4 e2 7a 5c d1 tdpbf16ps %tmm0,%tmm1,%tmm2 Failed to decode: c4 e2 7b 5e d1 tdpbssd %tmm0,%tmm1,%tmm2 Failed to decode: c4 e2 7a 5e d1 tdpbsud %tmm0,%tmm1,%tmm2 Failed to decode: c4 e2 79 5e d1 tdpbusd %tmm0,%tmm1,%tmm2 Failed to decode: c4 e2 78 5e d1 tdpbuud %tmm0,%tmm1,%tmm2 Failed to decode: c4 e2 7b 4b 0c c8 tileloadd (%rax,%rcx,8),%tmm1 Failed to decode: c4 c2 7b 4b 14 c8 tileloadd (%r8,%rcx,8),%tmm2 Failed to decode: c4 e2 79 4b 0c c8 tileloaddt1 (%rax,%rcx,8),%tmm1 Failed to decode: c4 c2 79 4b 14 c8 tileloaddt1 (%r8,%rcx,8),%tmm2 Failed to decode: c4 e2 78 49 c0 tilerelease Failed to decode: c4 e2 7a 4b 0c c8 tilestored %tmm1,(%rax,%rcx,8) Failed to decode: c4 c2 7a 4b 14 c8 tilestored %tmm2,(%r8,%rcx,8) Failed to decode: c4 e2 7b 49 c0 tilezero %tmm0 Failed to decode: c4 e2 7b 49 f8 tilezero %tmm7 Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-01-22perf tools: Remove redundant err variableMinghao Chi1-4/+1
Return value from perf_event__process_tracing_data() directly instead of taking this in another redundant variable. Reported-by: Zeal Robot <[email protected]> Signed-off-by: Minghao Chi <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: CGEL ZTE <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-22perf test: Add parse-events test for aliases with hyphensJohn Garry2-9/+82
Add a test which allows us to test parsing an event alias with hyphens. Since these events typically do not exist on most host systems, add the alias to the fake pmu. Function perf_pmu__test_parse_init() has terms added to match known test aliases. Signed-off-by: John Garry <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Qi Liu <[email protected]> Cc: Shaokun Zhang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-22perf test: Add pmu-events test for aliases with hyphensJohn Garry2-0/+48
Add a test for aliases with hyphens in the name to ensure that the pmu-events tables are as expects. There should be no reason why these sort of aliases would be treated differently, but no harm in checking. Signed-off-by: John Garry <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Qi Liu <[email protected]> Cc: Shaokun Zhang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>