aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-04-29Merge tag 'kconfig-v5.13' of ↵Linus Torvalds29-668/+578
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kconfig updates from Masahiro Yamada: - Change 'option defconfig' to the environment variable KCONFIG_DEFCONFIG_LIST - Refactor tinyconfig without using allnoconfig_y - Remove 'option allnoconfig_y' syntax - Change 'option modules' to 'modules' - Do not use /boot/config-* etc. as base config for cross-compilation - Fix a search bug in nconf - Various code cleanups * tag 'kconfig-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits) kconfig: refactor .gitignore kconfig: highlight xconfig 'comment' lines with '***' kconfig: highlight gconfig 'comment' lines with '***' kconfig: gconf: remove unused code kconfig: remove unused PACKAGE definition kconfig: nconf: stop endless search loops kconfig: split menu.c out of parser.y kconfig: nconf: refactor in print_in_middle() kconfig: nconf: remove meaningless wattrset() call from show_menu() kconfig: nconf: change set_config_filename() to void function kconfig: nconf: refactor attributes setup code kconfig: nconf: remove unneeded default for menu prompt kconfig: nconf: get rid of (void) casts from wattrset() calls kconfig: nconf: fix NORMAL attributes kconfig: mconf,nconf: remove unneeded '\0' termination after snprintf() kconfig: use /boot/config-* etc. as DEFCONFIG_LIST only for native build kconfig: change sym_change_count to a boolean flag kconfig: nconf: fix core dump when searching in empty menu kconfig: lxdialog: A spello fix and a punctuation added kconfig: streamline_config.pl: Couple of typo fixes ...
2021-04-29Merge tag 'kbuild-v5.13' of ↵Linus Torvalds54-596/+635
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Evaluate $(call cc-option,...) etc. only for build targets - Add CONFIG_VMLINUX_MAP to generate .map file when linking vmlinux - Remove unnecessary --gcc-toolchains Clang flag because the --prefix flag finds the toolchains - Do not pass Clang's --prefix flag when using the integrated as - Check the assembler version in Kconfig time - Add new CONFIG options, AS_VERSION, AS_IS_GNU, AS_IS_LLVM to clean up some dependencies in Kconfig - Fix invalid Module.symvers creation when building only modules without vmlinux - Fix false-positive modpost warnings when CONFIG_TRIM_UNUSED_KSYMS is set, but there is no module to build - Refactor module installation Makefile - Support zstd for module compression - Convert alpha and ia64 to use generic shell scripts to generate the syscall headers - Add a new elfnote to indicate if the kernel was built with LTO, which will be used by pahole - Flatten the directory structure under include/config/ so CONFIG options and filenames match - Change the deb source package name from linux-$(KERNELRELEASE) to linux-upstream * tag 'kbuild-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (42 commits) kbuild: Add $(KBUILD_HOSTLDFLAGS) to 'has_libelf' test kbuild: deb-pkg: change the source package name to linux-upstream tools: do not include scripts/Kbuild.include kbuild: redo fake deps at include/config/*.h kbuild: remove TMPO from try-run MAINTAINERS: add pattern for dummy-tools kbuild: add an elfnote for whether vmlinux is built with lto ia64: syscalls: switch to generic syscallhdr.sh ia64: syscalls: switch to generic syscalltbl.sh alpha: syscalls: switch to generic syscallhdr.sh alpha: syscalls: switch to generic syscalltbl.sh sysctl: use min() helper for namecmp() kbuild: add support for zstd compressed modules kbuild: remove CONFIG_MODULE_COMPRESS kbuild: merge scripts/Makefile.modsign to scripts/Makefile.modinst kbuild: move module strip/compression code into scripts/Makefile.modinst kbuild: refactor scripts/Makefile.modinst kbuild: rename extmod-prefix to extmod_prefix kbuild: check module name conflict for external modules as well kbuild: show the target directory for depmod log ...
2021-04-29Merge branch 'for-5.13/warnings' into for-linusJiri Kosina12-31/+30
- plethora of fixes for valid gcc warnings from Lee Jones
2021-04-29Merge branch 'for-5.13/wacom' into for-linusJiri Kosina2-24/+27
- Wacom pen handling fix from Ping Cheng
2021-04-29Merge branch 'for-5.13/thrustmaster' into for-linusJiri Kosina4-4/+380
- support for initialization of some newer Thrustmaster wheels from Dario Pagani
2021-04-29Merge branch 'for-5.13/surface-system-aggregator-intergration' into for-linusJiri Kosina1834-31531/+64385
- Surface Aggregator Module support from Maximilian Luz
2021-04-29Merge branch 'for-5.13/plantronics' into for-linusJiri Kosina3-2/+61
- Workaround for broken behavior of Plantronics Blackwire from Maxim Mikityanskiy
2021-04-29Merge branch 'for-5.13/magicmouse' into for-linusJiri Kosina5-38/+147
- Apple Magic Mouse 2 support from John Chen
2021-04-29Merge branch 'for-5.13/logitech' into for-linusJiri Kosina4-48/+120
- Support for newer Quad/BT 2.0 Logitech receivers in HID proxy mode from Hans de Goede
2021-04-29Merge branch 'for-5.13/lenovo' into for-linusJiri Kosina1-33/+114
- LED fixes and Thinkpad X1 Tablet keyboard support, from Hans de Goede
2021-04-29Merge branch 'for-5.13/i2c-hid' into for-linusJiri Kosina1-27/+25
- Cleanups to ACPI handling in i2c-hid driver from Andy Shevchenko
2021-04-29Merge branch 'for-5.13/ft260' into for-linusJiri Kosina5-0/+1074
- Support for FTDI FT260 I2C host adapter from Michael Zaidman
2021-04-29Merge branch 'for-5.13/elan' into for-linusJiri Kosina1-12/+5
- Mute LED support fixes for Elantech driver from Hans de Goede
2021-04-29Merge branch 'for-5.13/core' into for-linusJiri Kosina1-3/+3
- hiddev_connect() return value fix from Jason Gerecke
2021-04-29Merge tag 'net-next-5.13' of ↵Linus Torvalds1895-34814/+120846
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - bpf: - allow bpf programs calling kernel functions (initially to reuse TCP congestion control implementations) - enable task local storage for tracing programs - remove the need to store per-task state in hash maps, and allow tracing programs access to task local storage previously added for BPF_LSM - add bpf_for_each_map_elem() helper, allowing programs to walk all map elements in a more robust and easier to verify fashion - sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT redirection - lpm: add support for batched ops in LPM trie - add BTF_KIND_FLOAT support - mostly to allow use of BTF on s390 which has floats in its headers files - improve BPF syscall documentation and extend the use of kdoc parsing scripts we already employ for bpf-helpers - libbpf, bpftool: support static linking of BPF ELF files - improve support for encapsulation of L2 packets - xdp: restructure redirect actions to avoid a runtime lookup, improving performance by 4-8% in microbenchmarks - xsk: build skb by page (aka generic zerocopy xmit) - improve performance of software AF_XDP path by 33% for devices which don't need headers in the linear skb part (e.g. virtio) - nexthop: resilient next-hop groups - improve path stability on next-hops group changes (incl. offload for mlxsw) - ipv6: segment routing: add support for IPv4 decapsulation - icmp: add support for RFC 8335 extended PROBE messages - inet: use bigger hash table for IP ID generation - tcp: deal better with delayed TX completions - make sure we don't give up on fast TCP retransmissions only because driver is slow in reporting that it completed transmitting the original - tcp: reorder tcp_congestion_ops for better cache locality - mptcp: - add sockopt support for common TCP options - add support for common TCP msg flags - include multiple address ids in RM_ADDR - add reset option support for resetting one subflow - udp: GRO L4 improvements - improve 'forward' / 'frag_list' co-existence with UDP tunnel GRO, allowing the first to take place correctly even for encapsulated UDP traffic - micro-optimize dev_gro_receive() and flow dissection, avoid retpoline overhead on VLAN and TEB GRO - use less memory for sysctls, add a new sysctl type, to allow using u8 instead of "int" and "long" and shrink networking sysctls - veth: allow GRO without XDP - this allows aggregating UDP packets before handing them off to routing, bridge, OvS, etc. - allow specifing ifindex when device is moved to another namespace - netfilter: - nft_socket: add support for cgroupsv2 - nftables: add catch-all set element - special element used to define a default action in case normal lookup missed - use net_generic infra in many modules to avoid allocating per-ns memory unnecessarily - xps: improve the xps handling to avoid potential out-of-bound accesses and use-after-free when XPS change race with other re-configuration under traffic - add a config knob to turn off per-cpu netdev refcnt to catch underflows in testing Device APIs: - add WWAN subsystem to organize the WWAN interfaces better and hopefully start driving towards more unified and vendor- independent APIs - ethtool: - add interface for reading IEEE MIB stats (incl. mlx5 and bnxt support) - allow network drivers to dump arbitrary SFP EEPROM data, current offset+length API was a poor fit for modern SFP which define EEPROM in terms of pages (incl. mlx5 support) - act_police, flow_offload: add support for packet-per-second policing (incl. offload for nfp) - psample: add additional metadata attributes like transit delay for packets sampled from switch HW (and corresponding egress and policy-based sampling in the mlxsw driver) - dsa: improve support for sandwiched LAGs with bridge and DSA - netfilter: - flowtable: use direct xmit in topologies with IP forwarding, bridging, vlans etc. - nftables: counter hardware offload support - Bluetooth: - improvements for firmware download w/ Intel devices - add support for reading AOSP vendor capabilities - add support for virtio transport driver - mac80211: - allow concurrent monitor iface and ethernet rx decap - set priority and queue mapping for injected frames - phy: add support for Clause-45 PHY Loopback - pci/iov: add sysfs MSI-X vector assignment interface to distribute MSI-X resources to VFs (incl. mlx5 support) New hardware/drivers: - dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit interfaces. - dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and BCM63xx switches - Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches - ath11k: support for QCN9074 a 802.11ax device - Bluetooth: Broadcom BCM4330 and BMC4334 - phy: Marvell 88X2222 transceiver support - mdio: add BCM6368 MDIO mux bus controller - r8152: support RTL8153 and RTL8156 (USB Ethernet) chips - mana: driver for Microsoft Azure Network Adapter (MANA) - Actions Semi Owl Ethernet MAC - can: driver for ETAS ES58X CAN/USB interfaces Pure driver changes: - add XDP support to: enetc, igc, stmmac - add AF_XDP support to: stmmac - virtio: - page_to_skb() use build_skb when there's sufficient tailroom (21% improvement for 1000B UDP frames) - support XDP even without dedicated Tx queues - share the Tx queues with the stack when necessary - mlx5: - flow rules: add support for mirroring with conntrack, matching on ICMP, GTP, flex filters and more - support packet sampling with flow offloads - persist uplink representor netdev across eswitch mode changes - allow coexistence of CQE compression and HW time-stamping - add ethtool extended link error state reporting - ice, iavf: support flow filters, UDP Segmentation Offload - dpaa2-switch: - move the driver out of staging - add spanning tree (STP) support - add rx copybreak support - add tc flower hardware offload on ingress traffic - ionic: - implement Rx page reuse - support HW PTP time-stamping - octeon: support TC hardware offloads - flower matching on ingress and egress ratelimitting. - stmmac: - add RX frame steering based on VLAN priority in tc flower - support frame preemption (FPE) - intel: add cross time-stamping freq difference adjustment - ocelot: - support forwarding of MRP frames in HW - support multiple bridges - support PTP Sync one-step timestamping - dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like learning, flooding etc. - ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350, SC7280 SoCs) - mt7601u: enable TDLS support - mt76: - add support for 802.3 rx frames (mt7915/mt7615) - mt7915 flash pre-calibration support - mt7921/mt7663 runtime power management fixes" * tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits) net: selftest: fix build issue if INET is disabled net: netrom: nr_in: Remove redundant assignment to ns net: tun: Remove redundant assignment to ret net: phy: marvell: add downshift support for M88E1240 net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255 net/sched: act_ct: Remove redundant ct get and check icmp: standardize naming of RFC 8335 PROBE constants bpf, selftests: Update array map tests for per-cpu batched ops bpf: Add batched ops support for percpu array bpf: Implement formatted output helpers with bstr_printf seq_file: Add a seq_bprintf function sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues net:nfc:digital: Fix a double free in digital_tg_recv_dep_req net: fix a concurrency bug in l2tp_tunnel_register() net/smc: Remove redundant assignment to rc mpls: Remove redundant assignment to err llc2: Remove redundant assignment to rc net/tls: Remove redundant initialization of record rds: Remove redundant assignment to nr_sig dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0 ...
2021-04-29Merge tag 'x86-mm-2021-04-29' of ↵Linus Torvalds16-295/+287
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 tlb updates from Ingo Molnar: "The x86 MM changes in this cycle were: - Implement concurrent TLB flushes, which overlaps the local TLB flush with the remote TLB flush. In testing this improved sysbench performance measurably by a couple of percentage points, especially if TLB-heavy security mitigations are active. - Further micro-optimizations to improve the performance of TLB flushes" * tag 'x86-mm-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: smp: Micro-optimize smp_call_function_many_cond() smp: Inline on_each_cpu_cond() and on_each_cpu() x86/mm/tlb: Remove unnecessary uses of the inline keyword cpumask: Mark functions as pure x86/mm/tlb: Do not make is_lazy dirty for no reason x86/mm/tlb: Privatize cpu_tlbstate x86/mm/tlb: Flush remote and local TLBs concurrently x86/mm/tlb: Open-code on_each_cpu_cond_mask() for tlb_is_not_lazy() x86/mm/tlb: Unify flush_tlb_func_local() and flush_tlb_func_remote() smp: Run functions concurrently in smp_call_function_many_cond()
2021-04-29Merge tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblazeLinus Torvalds9-82/+18
Pull Microblaze updates from Michal Simek: "No new features, just about cleaning up some code and moving to generic syscall solution used by other architectures: - Switch to generic syscall scripts - Some small fixes" * tag 'microblaze-v5.13' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: add 'fallthrough' to memcpy/memset/memmove microblaze: Fix a typo microblaze: tag highmem_setup() with __meminit microblaze: syscalls: switch to generic syscallhdr.sh microblaze: syscalls: switch to generic syscalltbl.sh
2021-04-29Merge tag 'mips_5.13' of ↵Linus Torvalds152-6022/+2682
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - removed get_fs/set_fs - removed broken/unmaintained MIPS KVM trap and emulate support - added support for Loongson-2K1000 - fixes and cleanups * tag 'mips_5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits) MIPS: BCM63XX: Use BUG_ON instead of condition followed by BUG. MIPS: select ARCH_KEEP_MEMBLOCK unconditionally mips: Do not include hi and lo in clobber list for R6 MIPS:DTS:Correct the license for Loongson-2K MIPS:DTS:Fix label name and interrupt number of ohci for Loongson-2K MIPS: Avoid handcoded DIVU in `__div64_32' altogether lib/math/test_div64: Correct the spelling of "dividend" lib/math/test_div64: Fix error message formatting mips/bootinfo:correct some comments of fw_arg MIPS: Avoid DIVU in `__div64_32' is result would be zero MIPS: Reinstate platform `__div64_32' handler div64: Correct inline documentation for `do_div' lib/math: Add a `do_div' test module MIPS: Makefile: Replace -pg with CC_FLAGS_FTRACE MIPS: pci-legacy: revert "use generic pci_enable_resources" MIPS: Loongson64: Add kexec/kdump support MIPS: pci-legacy: use generic pci_enable_resources MIPS: pci-legacy: remove busn_resource field MIPS: pci-legacy: remove redundant info messages MIPS: pci-legacy: stop using of_pci_range_to_resource ...
2021-04-29Merge tag 'fsnotify_for_v5.13-rc1' of ↵Linus Torvalds20-161/+472
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: - support for limited fanotify functionality for unpriviledged users - faster merging of fanotify events - a few smaller fsnotify improvements * tag 'fsnotify_for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: shmem: allow reporting fanotify events with file handles on tmpfs fs: introduce a wrapper uuid_to_fsid() fanotify_user: use upper_32_bits() to verify mask fanotify: support limited functionality for unprivileged users fanotify: configurable limits via sysfs fanotify: limit number of event merge attempts fsnotify: use hash table for faster events merge fanotify: mix event info and pid into merge key hash fanotify: reduce event objectid to 29-bit hash fsnotify: allow fsnotify_{peek,remove}_first_event with empty queue
2021-04-29Merge tag 'for_v5.13-rc1' of ↵Linus Torvalds28-62/+168
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull quota, ext2, reiserfs updates from Jan Kara: - support for path (instead of device) based quotactl syscall (quotactl_path(2)) - ext2 conversion to kmap_local() - other minor cleanups & fixes * tag 'for_v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs/reiserfs/journal.c: delete useless variables fs/ext2: Replace kmap() with kmap_local_page() ext2: Match up ext2_put_page() with ext2_dotdot() and ext2_find_entry() fs/ext2/: fix misspellings using codespell tool quota: report warning limits for realtime space quotas quota: wire up quotactl_path quota: Add mountpath based quota support
2021-04-29Merge tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds92-1303/+1459
Pull xfs updates from Darrick Wong: "The notable user-visible addition this cycle is ability to remove space from the last AG in a filesystem. This is the first of many changes needed for full-fledged support for shrinking a filesystem. Still needed are (a) the ability to reorganize files and metadata away from the end of the fs; (b) the ability to remove entire allocation groups; (c) shrink support for realtime volumes; and (d) thorough testing of (a-c). There are a number of performance improvements in this code drop: Dave streamlined various parts of the buffer logging code and reduced the cost of various debugging checks, and added the ability to pre-create the xattr structures while creating files. Brian eliminated transaction reservations that were being held across writeback (thus reducing livelock potential. Other random pieces: Pavel fixed the repetitve warnings about deprecated mount options, I fixed online fsck to behave itself when a readonly remount comes in during scrub, and refactored various other parts of that code, Christoph contributed a lot of refactoring this cycle. The xfs_icdinode structure has been absorbed into the (incore) xfs_inode structure, and the format and flags handling around xfs_inode_fork structures has been simplified. Chandan provided a number of fixes for extent count overflow related problems that have been shaken out by debugging knobs added during 5.12. Summary: - Various minor fixes in online scrub. - Prevent metadata files from being automatically inactivated. - Validate btree heights by the computed per-btree limits. - Don't warn about remounting with deprecated mount options. - Initialize attr forks at create time if we suspect we're going to need to store them. - Reduce memory reallocation workouts in the logging code. - Fix some theoretical math calculation errors in logged buffers that span multiple discontig memory ranges but contiguous ondisk regions. - Speedups in dirty buffer bitmap handling. - Make type verifier functions more inline-happy to reduce overhead. - Reduce debug overhead in directory checking code. - Many many typo fixes. - Begin to handle the permanent loss of the very end of a filesystem. - Fold struct xfs_icdinode into xfs_inode. - Deprecate the long defunct BMV_IF_NO_DMAPI_READ from the bmapx ioctl. - Remove a broken directory block format check from online scrub. - Fix a bug where we could produce an unnecessarily tall data fork btree when creating an attr fork. - Fix scrub and readonly remounts racing. - Fix a writeback ioend log deadlock problem by dropping the behavior where we could preallocate a setfilesize transaction. - Fix some bugs in the new extent count checking code. - Fix some bugs in the attr fork preallocation code. - Refactor if_flags out of the incore inode fork data structure" * tag 'xfs-5.13-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (77 commits) xfs: remove xfs_quiesce_attr declaration xfs: remove XFS_IFEXTENTS xfs: remove XFS_IFINLINE xfs: remove XFS_IFBROOT xfs: only look at the fork format in xfs_idestroy_fork xfs: simplify xfs_attr_remove_args xfs: rename and simplify xfs_bmap_one_block xfs: move the XFS_IFEXTENTS check into xfs_iread_extents xfs: drop unnecessary setfilesize helper xfs: drop unused ioend private merge and setfilesize code xfs: open code ioend needs workqueue helper xfs: drop submit side trans alloc for append ioends xfs: fix return of uninitialized value in variable error xfs: get rid of the ip parameter to xchk_setup_* xfs: fix scrub and remount-ro protection when running scrub xfs: move the check for post-EOF mappings into xfs_can_free_eofblocks xfs: move the xfs_can_free_eofblocks call under the IOLOCK xfs: precalculate default inode attribute offset xfs: default attr fork size does not handle device inodes xfs: inode fork allocation depends on XFS_IFEXTENT flag ...
2021-04-29Merge tag 'gfs2-for-5.13' of ↵Linus Torvalds23-243/+312
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Fix some compiler and kernel-doc warnings - Various minor cleanups and optimizations - Add a new sysfs gfs2 status file with some filesystem wide information * tag 'gfs2-for-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix fall-through warnings for Clang gfs2: Fix a number of kernel-doc warnings gfs2: Make gfs2_setattr_simple static gfs2: Add new sysfs file for gfs2 status gfs2: Silence possible null pointer dereference warning gfs2: Turn gfs2_meta_indirect_buffer into gfs2_meta_buffer gfs2: Replace gfs2_lblk_to_dblk with gfs2_get_extent gfs2: Turn gfs2_extent_map into gfs2_{get,alloc}_extent gfs2: Add new gfs2_iomap_get helper gfs2: Remove unused variable sb_format gfs2: Fix dir.c function parameter descriptions gfs2: Eliminate gh parameter from go_xmote_bh func gfs2: don't create empty buffers for NO_CREATE
2021-04-29Merge tag 'exfat-for-5.13-rc1' of ↵Linus Torvalds8-35/+206
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Improve write performance with dirsync mount option - Improve lookup performance - Add support for FITRIM ioctl - Fix a bug with discard option * tag 'exfat-for-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: speed up iterate/lookup by fixing start point of traversing cluster chain exfat: improve write performance when dirsync enabled exfat: add support ioctl and FITRIM function exfat: introduce bitmap_lock for cluster bitmap access exfat: fix erroneous discard when clear cluster bit
2021-04-29perf build: Defer printing detected features to the end of all feature checksArnaldo Carvalho de Melo1-0/+7
We were doing it in tools/build/Makefile.feature, after running the feature checks, but then in tools/perf/Makefile.config we can call more feature checks when we notice that some feature check failed, like when libbfd wasn't detected and we add libraries to the LDFLAGS of its feature check to try again, etc. Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29tools build: Allow deferring printing the results of feature detectionArnaldo Carvalho de Melo1-10/+17
By setting FEATURE_DISPLAY_DEFERRED=1 a tool may ask for the printout of the detected features in tools/build/Makefile.feature to be done later adter extra feature checks are done that are tool specific. The perf tool will do it via its tools/perf/Makefile.config, as it performs such extra feature checks. Acked-by: Jiri Olsa <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf build: Regenerate the FEATURE_DUMP file after extra feature checksJiri Olsa1-0/+6
Feature detection is done in tools/build/Makefile.feature, we may exit there with some features not detected and then, in tools/perf/Makefile.config try adding extra libraries to link and then do extra feature checks to see if we now find the feature. This is the case with the disassembler-four-args that checks if the diassembler() function in libopcodes (binutils) has a signature with one or with four arguments, as this is not ABI and they changed it at some point. This is not a problem when doing normal builds, for instance: $ make -C tools/perf O=/tmp/build/perf As we don't use what is in FEATURE-DUMP at that point, but is a problem if we pass FEATURE_DUMP=/previously-detected-features as we do in 'make -C tools/perf build-test' to reuse the feature detection in the many build combinations we test there. When that is done feature-disassembler-four-args will be set to 0, but opensuse 15.1 has the four arguments function signature in disassembler(). The build thus fails. Fix it by rewriting the FEATURE-DUMP file at the end of tools/perf/Makefile.config to register features we retested in that make file. Signed-off-by: Jiri Olsa <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf session: Dump PERF_RECORD_TIME_CONV eventLeo Yan3-1/+46
Now perf tool uses the common stub function process_event_op2_stub() for dumping TIME_CONV event, thus it doesn't output the clock parameters contained in the event. This patch adds the callback function for dumping the hardware clock parameters in TIME_CONV event. Before: # perf report -D 0x978 [0x38]: event: 79 . . ... raw event: size 56 bytes . 0000: 4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00 O.....8......... . 0010: 00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff ..@........<BF><DF><FF><FF><FF> . 0020: d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00 <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>. . 0030: 01 01 00 00 00 00 00 00 ........ 0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV : unhandled! [...] After: # perf report -D 0x978 [0x38]: event: 79 . . ... raw event: size 56 bytes . 0000: 4f 00 00 00 00 00 38 00 15 00 00 00 00 00 00 00 O.....8......... . 0010: 00 00 40 01 00 00 00 00 86 89 0b bf df ff ff ff ..@........<BF><DF><FF><FF><FF> . 0020: d1 c1 b2 39 03 00 00 00 ff ff ff ff ff ff ff 00 <D1><C1><B2>9....<FF><FF><FF><FF><FF><FF><FF>. . 0030: 01 01 00 00 00 00 00 00 ........ 0 0 0x978 [0x38]: PERF_RECORD_TIME_CONV ... Time Shift 21 ... Time Muliplier 20971520 ... Time Zero 18446743935180835206 ... Time Cycles 13852918225 ... Time Mask 0xffffffffffffff ... Cap Time Zero 1 ... Cap Time Short 1 : unhandled! [...] Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steve MacLean <[email protected]> Cc: Yonatan Goldschmidt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf session: Add swap operation for event TIME_CONVLeo Yan1-1/+14
Since commit d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data structure for clock parameters. To be backwards-compatible, this patch adds a dedicated swap operation for the event PERF_RECORD_TIME_CONV, based on checking if the event contains field "time_cycles", it can support both for the old and new event formats. Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steve MacLean <[email protected]> Cc: Yonatan Goldschmidt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf jit: Let convert_timestamp() to be backwards-compatibleLeo Yan2-10/+22
Commit d110162cafc80dad ("perf tsc: Support cap_user_time_short for event TIME_CONV") supports the extended parameters for event TIME_CONV, but it broke the backwards compatibility, so any perf data file with old event format fails to convert timestamp. This patch introduces a helper event_contains() to check if an event contains a specific member or not. For the backwards-compatibility, if the event size confirms the extended parameters are supported in the event TIME_CONV, then copies these parameters. Committer notes: To make this compiler backwards compatible add this patch: - struct perf_tsc_conversion tc = { 0 }; + struct perf_tsc_conversion tc = { .time_shift = 0, }; Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steve MacLean <[email protected]> Cc: Yonatan Goldschmidt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tools: Change fields type in perf_record_time_convLeo Yan1-2/+3
C standard claims "An object declared as type _Bool is large enough to store the values 0 and 1", bool type size can be 1 byte or larger than 1 byte. Thus it's uncertian for bool type size with different compilers. This patch changes the bool type in structure perf_record_time_conv to __u8 type, and pads extra bytes for 8-byte alignment; this can give reliable structure size. Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Suggested-by: Adrian Hunter <[email protected]> Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steve MacLean <[email protected]> Cc: Yonatan Goldschmidt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tools: Enable libtraceevent dynamic linkingMichael Petlan5-2/+32
Currently we support only static linking with kernel's libtraceevent (tools/lib/traceevent). This patch adds libtraceevent package detection and support to link perf with it dynamically. The libtraceevent package status is displayed with: $ make VF=1 LIBTRACEEVENT_DYNAMIC=1 ... ... libtraceevent: [ on ] Default behavior remains the same (static linking). Committer testing: $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel. Stop. $ Signed-off-by: Michael Petlan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> LPU-Reference: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf Documentation: Document intel-hybrid supportJin Yao3-0/+217
Add some words and examples to help understanding of Intel hybrid perf support. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Skip 'perf stat metrics (shadow stat) test' for hybridJin Yao1-0/+3
Currently we don't support shadow stat for hybrid. root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1 Performance counter stats for 'system wide': 12,883,109,591 cpu_core/cycles/ 6,405,163,221 cpu_atom/cycles/ 555,553,778 cpu_core/instructions/ 841,158,734 cpu_atom/instructions/ 1.002644773 seconds time elapsed Now there is no shadow stat 'insn per cycle' reported. We will support it later and now just skip the 'perf stat metrics (shadow stat) test'. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Support 'Convert perf time to TSC' test for hybridJin Yao1-0/+12
Since for "cycles:u' on hybrid platform, it creates two "cycles". So the second evsel in evlist also needs initialization. With this patch, # ./perf test 71 71: Convert perf time to TSC : Ok Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Support 'Session topology' test for hybridJin Yao1-2/+11
Force to create one event "cpu_core/cycles/" by default, otherwise in evlist__valid_sample_type, the checking of 'if (evlist->core.nr_entries == 1)' would be failed. # ./perf test 41 41: Session topology : Ok Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Support 'Parse and process metrics' test for hybridJin Yao1-2/+6
Some events are not supported. Only pick up some cases for hybrid. # ./perf test 68 68: Parse and process metrics : Ok Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Support 'Track with sched_switch' test for hybridJin Yao1-1/+5
Since for "cycles:u' on hybrid platform, it creates two "cycles". So the number of events in evlist is not expected in next test steps. Now we just use one event "cpu_core/cycles:u/" for hybrid. # ./perf test 35 35: Track with sched_switch : Ok Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Skip 'Setup struct perf_event_attr' test for hybridJin Yao1-0/+4
For hybrid, the attr.type consists of pmu type id + original type. There will be much changes for this test. Now we temporarily skip this test case and TODO in future. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Add hybrid cases for 'Roundtrip evsel->name' testJin Yao1-7/+12
Since for one hw event, two hybrid events are created. For example, evsel->idx evsel__name(evsel) 0 cycles 1 cycles 2 instructions 3 instructions ... So for comparing the evsel name on hybrid, the evsel->idx needs to be divided by 2. # ./perf test 14 14: Roundtrip evsel->name : Ok Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tests: Add hybrid cases for 'Parse event definition strings' testJin Yao1-0/+171
Add basic hybrid test cases for 'Parse event definition strings' test. # perf test 6 6: Parse event definition strings : Ok Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf record: Uniquify hybrid event nameJin Yao1-0/+28
For perf-record, it would be useful to tell user the pmu which the event belongs to. For example, # perf record -a -- sleep 1 # perf report # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 106 of event 'cpu_core/cycles/' # Event count (approx.): 22043448 # # Overhead Command Shared Object Symbol # ........ ............ ....................... ............................ # ... Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf stat: Warn group events from different hybrid PMUJin Yao6-0/+62
If a group has events which are from different hybrid PMUs, shows a warning: "WARNING: events in group from different hybrid PMUs!" This is to remind the user not to put the core event and atom event into one group. Next, just disable grouping. # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1 WARNING: events in group from different hybrid PMUs! WARNING: grouped events cpus do not match, disabling group: anon group { cpu_core/cycles/, cpu_atom/cycles/ } Performance counter stats for 'system wide': 5,438,125 cpu_core/cycles/ 3,914,586 cpu_atom/cycles/ 1.004250966 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf stat: Filter out unmatched aggregation for hybrid eventJin Yao1-0/+3
perf-stat has supported some aggregation modes, such as --per-core, --per-socket and etc. While for hybrid event, it may only available on part of cpus. So for --per-core, we need to filter out the unavailable cores, for --per-socket, filter out the unavailable sockets, and so on. Before: # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1 Performance counter stats for 'system wide': S0-D0-C0 2 479,530 cpu_core/cycles/ S0-D0-C4 2 175,007 cpu_core/cycles/ S0-D0-C8 2 166,240 cpu_core/cycles/ S0-D0-C12 2 704,673 cpu_core/cycles/ S0-D0-C16 2 865,835 cpu_core/cycles/ S0-D0-C20 2 2,958,461 cpu_core/cycles/ S0-D0-C24 2 163,988 cpu_core/cycles/ S0-D0-C28 2 164,729 cpu_core/cycles/ S0-D0-C32 0 <not counted> cpu_core/cycles/ S0-D0-C33 0 <not counted> cpu_core/cycles/ S0-D0-C34 0 <not counted> cpu_core/cycles/ S0-D0-C35 0 <not counted> cpu_core/cycles/ S0-D0-C36 0 <not counted> cpu_core/cycles/ S0-D0-C37 0 <not counted> cpu_core/cycles/ S0-D0-C38 0 <not counted> cpu_core/cycles/ S0-D0-C39 0 <not counted> cpu_core/cycles/ 1.003597211 seconds time elapsed After: # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1 Performance counter stats for 'system wide': S0-D0-C0 2 210,428 cpu_core/cycles/ S0-D0-C4 2 444,830 cpu_core/cycles/ S0-D0-C8 2 435,241 cpu_core/cycles/ S0-D0-C12 2 423,976 cpu_core/cycles/ S0-D0-C16 2 859,350 cpu_core/cycles/ S0-D0-C20 2 1,559,589 cpu_core/cycles/ S0-D0-C24 2 163,924 cpu_core/cycles/ S0-D0-C28 2 376,610 cpu_core/cycles/ 1.003621290 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Co-developed-by: Jiri Olsa <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf stat: Add default hybrid eventsJin Yao1-0/+28
Previously if '-e' is not specified in perf stat, some software events and hardware events are added to evlist by default. Before: # perf stat -a -- sleep 1 Performance counter stats for 'system wide': 24,044.40 msec cpu-clock # 23.946 CPUs utilized 99 context-switches # 4.117 /sec 24 cpu-migrations # 0.998 /sec 3 page-faults # 0.125 /sec 7,000,244 cycles # 0.000 GHz 2,955,024 instructions # 0.42 insn per cycle 608,941 branches # 25.326 K/sec 31,991 branch-misses # 5.25% of all branches 1.004106859 seconds time elapsed Among the events, cycles, instructions, branches and branch-misses are hardware events. One hybrid platform, two hardware events are created for one hardware event. cpu_core/cycles/, cpu_atom/cycles/, cpu_core/instructions/, cpu_atom/instructions/, cpu_core/branches/, cpu_atom/branches/, cpu_core/branch-misses/, cpu_atom/branch-misses/ These events would be added to evlist on hybrid platform. Since parse_events() has been supported to create two hardware events for one event on hybrid platform, so we just use parse_events(evlist, "cycles,instructions,branches,branch-misses") to create the default events and add them to evlist. After: # perf stat -a -- sleep 1 Performance counter stats for 'system wide': 24,043.99 msec cpu-clock # 23.991 CPUs utilized 139 context-switches # 5.781 /sec 25 cpu-migrations # 1.040 /sec 6 page-faults # 0.250 /sec 10,381,751 cpu_core/cycles/ # 431.782 K/sec 1,264,216 cpu_atom/cycles/ # 52.579 K/sec 3,406,958 cpu_core/instructions/ # 141.697 K/sec 414,588 cpu_atom/instructions/ # 17.243 K/sec 705,149 cpu_core/branches/ # 29.327 K/sec 82,358 cpu_atom/branches/ # 3.425 K/sec 40,821 cpu_core/branch-misses/ # 1.698 K/sec 9,086 cpu_atom/branch-misses/ # 377.891 /sec 1.002228863 seconds time elapsed We can see two events are created for one hardware event. One TODO is, the shadow stats looks a bit different, now it's just 'M/sec'. The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats need to be improved in future if we want to get the original shadow stats. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf record: Create two hybrid 'cycles' events by defaultJin Yao7-9/+77
When evlist is empty, for example no '-e' specified in perf record, one default 'cycles' event is added to evlist. While on hybrid platform, it needs to create two default 'cycles' events. One is for cpu_core, the other is for cpu_atom. This patch actually calls evsel__new_cycles() two times to create two 'cycles' events. # ./perf record -vv -a -- sleep 1 ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 6 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 7 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 9 sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 10 sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 11 sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 12 sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 13 sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 14 sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 15 sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 16 sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 17 sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 18 sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 19 sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 20 sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 21 ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|ID|CPU|PERIOD read_format ID disabled 1 inherit 1 freq 1 precise_ip 3 sample_id_all 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 22 sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 23 sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 24 sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 25 sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 26 sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 27 sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 28 sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 29 ------------------------------------------------------------ We have to create evlist-hybrid.c otherwise due to the symbol dependency the perf test python would be failed. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf parse-events: Support event inside hybrid pmuJin Yao1-0/+63
On hybrid platform, user may want to enable events on one pmu. Following syntax are supported: cpu_core/<event>/ cpu_atom/<event>/ But the syntax doesn't work for cache event. Before: # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1 event syntax error: 'cpu_core/LLC-loads/' \___ unknown term 'LLC-loads' for pmu 'cpu_core' Cache events are a bit complex. We can't create aliases for them. We use another solution. For example, if we use "cpu_core/LLC-loads/", in parse_events_add_pmu(), term->config is "LLC-loads". Then we create a new parser to scan "LLC-loads". The parse_events_add_cache() would be called during parsing. The parse_state->hybrid_pmu_name is used to identify the pmu where the event should be enabled on. After: # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1 Performance counter stats for 'system wide': 24,593 cpu_core/LLC-loads/ 1.003911601 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf parse-events: Compare with hybrid pmu nameJin Yao5-8/+34
On hybrid platform, user may want to enable event only on one pmu. Following syntax will be supported: cpu_core/<event>/ cpu_atom/<event>/ For hardware event, hardware cache event and raw event, two events are created by default. We pass the specified pmu name in parse_state and it would be checked before event creation. So next only the event with the specified pmu would be created. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf parse-events: Create two hybrid raw eventsJin Yao1-1/+37
On hybrid platform, same raw event is possible to be available on both cpu_core pmu and cpu_atom pmu. It's supported to create two raw events for one event encoding. For raw events, the attr.type is PMU type. # perf stat -e r3c -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 4 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 4 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: type 8 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 8 size 120 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 r3c: 0: 434449 1001412521 1001412521 r3c: 1: 173162 1001482031 1001482031 r3c: 2: 231710 1001524974 1001524974 r3c: 3: 110012 1001563523 1001563523 r3c: 4: 191517 1001593221 1001593221 r3c: 5: 956458 1001628147 1001628147 r3c: 6: 416969 1001715626 1001715626 r3c: 7: 1047527 1001596650 1001596650 r3c: 8: 103877 1001633520 1001633520 r3c: 9: 70571 1001637898 1001637898 r3c: 10: 550284 1001714398 1001714398 r3c: 11: 1257274 1001738349 1001738349 r3c: 12: 107797 1001801432 1001801432 r3c: 13: 67471 1001836281 1001836281 r3c: 14: 286782 1001923161 1001923161 r3c: 15: 815509 1001952550 1001952550 r3c: 0: 95994 1002071117 1002071117 r3c: 1: 105570 1002142438 1002142438 r3c: 2: 115921 1002189147 1002189147 r3c: 3: 72747 1002238133 1002238133 r3c: 4: 103519 1002276753 1002276753 r3c: 5: 121382 1002315131 1002315131 r3c: 6: 80298 1002248050 1002248050 r3c: 7: 466790 1002278221 1002278221 r3c: 6821369 16026754282 16026754282 r3c: 1162221 8017758990 8017758990 Performance counter stats for 'system wide': 6,821,369 cpu_core/r3c/ 1,162,221 cpu_atom/r3c/ 1.002289965 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf parse-events: Create two hybrid cache eventsJin Yao3-1/+37
For cache events, they have pre-defined configs. The kernel needs to know where the cache event comes from (e.g. from cpu_core pmu or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE can't carry pmu information. Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type. The PMU type ID is stored at attr.config[63:32]. When enabling a hybrid cache event without specified pmu, such as, 'perf stat -e LLC-loads -a', two events are created automatically. One is for atom, the other is for core. # perf stat -e LLC-loads -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x400000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x400000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x800000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: type 3 size 120 config 0x800000002 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 LLC-loads: 0: 1507 1001800280 1001800280 LLC-loads: 1: 666 1001812250 1001812250 LLC-loads: 2: 3353 1001813453 1001813453 LLC-loads: 3: 514 1001848795 1001848795 LLC-loads: 4: 627 1001952832 1001952832 LLC-loads: 5: 4399 1001451154 1001451154 LLC-loads: 6: 1240 1001481052 1001481052 LLC-loads: 7: 478 1001520348 1001520348 LLC-loads: 8: 691 1001551236 1001551236 LLC-loads: 9: 310 1001578945 1001578945 LLC-loads: 10: 1018 1001594354 1001594354 LLC-loads: 11: 3656 1001622355 1001622355 LLC-loads: 12: 882 1001661416 1001661416 LLC-loads: 13: 506 1001693963 1001693963 LLC-loads: 14: 3547 1001721013 1001721013 LLC-loads: 15: 1399 1001734818 1001734818 LLC-loads: 0: 1314 1001793826 1001793826 LLC-loads: 1: 2857 1001752764 1001752764 LLC-loads: 2: 646 1001830694 1001830694 LLC-loads: 3: 1612 1001864861 1001864861 LLC-loads: 4: 2244 1001912381 1001912381 LLC-loads: 5: 1255 1001943889 1001943889 LLC-loads: 6: 4624 1002021109 1002021109 LLC-loads: 7: 2703 1001959302 1001959302 LLC-loads: 24793 16026838264 16026838264 LLC-loads: 17255 8015078826 8015078826 Performance counter stats for 'system wide': 24,793 cpu_core/LLC-loads/ 17,255 cpu_atom/LLC-loads/ 1.001970988 seconds time elapsed 0x4 in 0x400000002 indicates the cpu_core pmu. 0x8 in 0x800000002 indicates the cpu_atom pmu. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf parse-events: Create two hybrid hardware eventsJin Yao5-0/+141
Current hardware events has special perf types PERF_TYPE_HARDWARE. But it doesn't pass the PMU type in the user interface. For a hybrid system, the perf kernel doesn't know which PMU the events belong to. So now this type is extended to be PMU aware type. The PMU type ID is stored at attr.config[63:32]. PMU type ID is retrieved from sysfs. root@lkp-adl-d01:/sys/devices/cpu_atom# cat type 8 root@lkp-adl-d01:/sys/devices/cpu_core# cat type 4 When enabling a hybrid hardware event without specified pmu, such as, 'perf stat -e cycles -a', two events are created automatically. One is for atom, the other is for core. # perf stat -e cycles -a -vv -- sleep 1 Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x400000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 19 ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 20 ------------------------------------------------------------ ... ------------------------------------------------------------ perf_event_attr: size 120 config 0x800000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 27 cycles: 0: 836272 1001525722 1001525722 cycles: 1: 628564 1001580453 1001580453 cycles: 2: 872693 1001605997 1001605997 cycles: 3: 70417 1001641369 1001641369 cycles: 4: 88593 1001726722 1001726722 cycles: 5: 470495 1001752993 1001752993 cycles: 6: 484733 1001840440 1001840440 cycles: 7: 1272477 1001593105 1001593105 cycles: 8: 209185 1001608616 1001608616 cycles: 9: 204391 1001633962 1001633962 cycles: 10: 264121 1001661745 1001661745 cycles: 11: 826104 1001689904 1001689904 cycles: 12: 89935 1001728861 1001728861 cycles: 13: 70639 1001756757 1001756757 cycles: 14: 185266 1001784810 1001784810 cycles: 15: 171094 1001825466 1001825466 cycles: 0: 129624 1001854843 1001854843 cycles: 1: 122533 1001840421 1001840421 cycles: 2: 90055 1001882506 1001882506 cycles: 3: 139607 1001896463 1001896463 cycles: 4: 141791 1001907838 1001907838 cycles: 5: 530927 1001883880 1001883880 cycles: 6: 143246 1001852529 1001852529 cycles: 7: 667769 1001872626 1001872626 cycles: 6744979 16026956922 16026956922 cycles: 1965552 8014991106 8014991106 Performance counter stats for 'system wide': 6,744,979 cpu_core/cycles/ 1,965,552 cpu_atom/cycles/ 1.001882711 seconds time elapsed 0x4 in 0x400000000 indicates the cpu_core pmu. 0x8 in 0x800000000 indicates the cpu_atom pmu. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>