aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2024-02-22selftests: add eventfd selftestsWen Yang3-0/+195
This adds the promised selftest for eventfd. It will verify the flags of eventfd2, including EFD_CLOEXEC, EFD_NONBLOCK and EFD_SEMAPHORE. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Wen Yang <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Javier Martinez Canillas <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Pengfei Xu <[email protected]> Cc: Miklos Szeredi <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski75-338/+583
Cross-merge networking fixes after downstream PR. Conflicts: net/ipv4/udp.c f796feabb9f5 ("udp: add local "peek offset enabled" flag") 56667da7399e ("net: implement lockless setsockopt(SO_PEEK_OFF)") Adjacent changes: net/unix/garbage.c aa82ac51d633 ("af_unix: Drop oob_skb ref before purging queue in GC.") 11498715f266 ("af_unix: Remove io_uring code for GC.") Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-22selftests: add zswapin and no zswap testsNhat Pham1-2/+118
Add a selftest to cover the zswapin code path, allocating more memory than the cgroup limit to trigger swapout/zswapout, then reading the pages back in memory several times. This is inspired by a recently encountered kernel crash on the zswapin path in our internal kernel, which went undetected because of a lack of test coverage for this path. Add a selftest to verify that when memory.zswap.max = 0, no pages can go to the zswap pool for the cgroup. [[email protected]: remove redundant comment, add success checks] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nhat Pham <[email protected]> Suggested-by: Rik van Riel <[email protected]> Suggested-by: Yosry Ahmed <[email protected]> Acked-by: Yosry Ahmed <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Zefan Li <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests: fix the zswap invasive shrink testNhat Pham1-1/+1
The zswap no invasive shrink selftest breaks because we rename the zswap writeback counter (see [1]). Fix the test. [1]: https://patchwork.kernel.org/project/linux-kselftest/patch/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: a697dc2be925 ("selftests: cgroup: update per-memcg zswap writeback selftest") Signed-off-by: Nhat Pham <[email protected]> Acked-by: Yosry Ahmed <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Zefan Li <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/bpf: Test case for lacking CFI stub functions.Kui-Feng Lee6-3/+151
Ensure struct_ops rejects the registration of struct_ops types without proper CFI stub functions. bpf_test_no_cfi.ko is a module that attempts to register a struct_ops type called "bpf_test_no_cfi_ops" with cfi_stubs of NULL and non-NULL value. The NULL one should fail, and the non-NULL one should succeed. The module can only be loaded successfully if these registrations yield the expected results. Signed-off-by: Kui-Feng Lee <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-02-22Merge tag 'for-linus-iommufd' of ↵Linus Torvalds3-31/+91
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd fixes from Jason Gunthorpe: - Fix dirty tracking bitmap collection when using reporting bitmaps that are not neatly aligned to u64's or match the IO page table radix tree layout. - Add self tests to cover the cases that were found to be broken. - Add missing enforcement of invalidation type in the uapi. - Fix selftest config generation * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: selftests/iommu: fix the config fragment iommufd: Reject non-zero data_type if no data_len is provided iommufd/iova_bitmap: Consider page offset for the pages to be pinned iommufd/selftest: Add mock IO hugepages tests iommufd/selftest: Hugepage mock domain support iommufd/selftest: Refactor mock_domain_read_and_clear_dirty() iommufd/selftest: Refactor dirty bitmap tests iommufd/iova_bitmap: Handle recording beyond the mapped pages iommufd/selftest: Test u64 unaligned bitmaps iommufd/iova_bitmap: Switch iova_bitmap::bitmap to an u8 array iommufd/iova_bitmap: Bounds check mapped::pages access
2024-02-22tools/power x86_energy_perf_policy: Fix file leak in get_pkg_num()Samasth Norway Ananda1-0/+1
In function get_pkg_num() if fopen_or_die() succeeds it returns a file pointer to be used. But fclose() is never called before returning from the function. Signed-off-by: Samasth Norway Ananda <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2024-02-22selftests/mm: log a consistent test name for check_compactionMark Brown1-16/+19
Every test result report in the compaction test prints a distinct log messae, and some of the reports print a name that varies at runtime. This causes problems for automation since a lot of automation software uses the printed string as the name of the test, if the name varies from run to run and from pass to fail then the automation software can't identify that a test changed result or that the same tests are being run. Refactor the logging to use a consistent name when printing the result of the test, printing the existing messages as diagnostic information instead so they are still available for people trying to interpret the results. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Cc: Muhammad Usama Anjum <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: log skipped compaction test as a skipMark Brown1-1/+1
Patch series "selftests/mm: Output cleanups for the compaction test". A couple of small updates for the check_compaction selftest which make it play more nicely with test automation systems. This patch (of 2): When the compaction test is run it checks to make sure that prerequistives the test requires are available and skips the tests if not. When this happens we log the test as a pass rather than a skip, log as a skip so that the distinction is clear and automation can see unexpected skips. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Cc: Muhammad Usama Anjum <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon/_chk_dependency: get debugfs mount point from /proc/mountsSeongJae Park1-1/+8
DAMON debugfs selftests dependency checker assumes debugfs would be mounted at /sys/kernel/debug. That would be ok for many cases, but some systems might mounted the file system on some different places. Parse the real mount point using /proc/mounts file. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon: add a test for the pid leak of dbgfs_target_ids_write()SeongJae Park4-0/+93
Commit ebb3f994dd92 ("mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'") fixes a pid leak bug in DAMON debugfs interface, namely dbgfs_target_ids_write() function. Add a selftest for the issue to prevent the problem from mistakenly recurring. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon: add a test for a race between target_ids_read() and ↵SeongJae Park4-0/+97
dbgfs_before_terminate() commit 34796417964b ("mm/damon/dbgfs: protect targets destructions with kdamond_lock") fixed a race of DAMON debugfs interface. Specifically, the race was happening between target_ids_read() and dbgfs_before_terminate(). Add a test for the issue to prevent the problem from accidentally recurring. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon: add a test for DAMOS apply intervalsSeongJae Park2-1/+68
Add a selftest for DAMOS apply intervals. It runs two schemes having different apply interval agains an artificial memory access workload, and check if the scheme with smaller apply interval was applied more frequently. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon: add a test for DAMOS quotaSeongJae Park2-0/+68
Add a selftest for verifying the DAMOS quota feature. The test is very similar to sysfs_update_schemes_tried_regions_wss_estimation.py. It starts an artificial workload of 20 MiB working set, run DAMON to find the working set size, but with 1 MiB/100 ms size quota. Then, it collect the DAMON-found working set size every 100 ms and check if the quota was always applied as expected. For the confirmation, the tests shows the stat-applied region size and the qt_exceeds stat. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon/_damon_sysfs: support DAMOS apply intervalSeongJae Park1-1/+8
Update the test-purpose DAMON sysfs control Python module to support DAMOS apply interval. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon/_damon_sysfs: support DAMOS statsSeongJae Park1-0/+32
Update the test-purpose DAMON sysfs control Python module to support DAMOS stats. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/damon/_damon_sysfs: support DAMOS quotaSeongJae Park1-9/+33
Patch series "selftests/damon: add more tests for core functionalities and corner cases". Continue DAMON selftests' test coverage improvement works with a trivial improvement of the test code itself. The sequence of the patches in patchset is as follows. The first five patches add two DAMON core functionalities tests. Those begins with three patches (patches 1-3) that update the test-purpose DAMON sysfs interface wrapper to support DAMOS quota, stats, and apply interval features, respectively. The fourth patch implements and adds a selftest for DAMOS quota feature, using the DAMON sysfs interface wrapper's newly added support of the quota and the stats feature. The fifth patch further implements and adds a selftest for DAMOS apply interval using the DAMON sysfs interface wrapper's newly added support of the apply interval and the stats feature. Two patches (patches 6 and 7) for implementing and adding two corner cases handling selftests follow. Those try to avoid two previously fixed bugs from recurring. Finally, a patch for making DAMON debugfs selftests dependency checker to use /proc/mounts instead of the hard-coded mount point assumption follows. This patch (of 8): Update the test-purpose DAMON sysfs control Python module to support DAMOS quota. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: run_vmtests.sh: add hugetlb_madv_vs_mapBreno Leitao1-0/+1
hugetlb_madv_vs_map selftest was not part of the mm test-suite since we didn't have a fix for the problem it found. Now that the problem is already fixed (see previous commit), let's enable this selftest in the default test-suite. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Breno Leitao <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Muchun Song <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: run_vmtests.sh: add hugetlb test categoryBreno Leitao1-0/+2
The usage of run_vmtests.sh does not include hugetlb, which is a valid test category. Add the 'hugetlb' to the usage of run_vmtests.sh. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Breno Leitao <[email protected]> Reviewed-by: Muhammad Usama Anjum <[email protected]> Reviewed-by: Joel Savitz <[email protected]> Cc: Ryan Roberts <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: virtual_address_range: conform to TAP format outputMuhammad Usama Anjum1-22/+22
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: transhuge-stress: conform to TAP format outputMuhammad Usama Anjum2-17/+25
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: thuge-gen: conform to TAP format outputMuhammad Usama Anjum1-72/+75
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Also remove unneeded logging which isn't enabled. Skip a hugepage size if it has less free pages to avoid unnecessary failures. For examples, some systems may not have 1GB hugepage free. So skip 1GB for testing in this test instead of failing the entire test. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: split_huge_page_test: conform test to TAP format outputMuhammad Usama Anjum1-92/+69
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: mremap_dontunmap: conform test to TAP format outputMuhammad Usama Anjum1-12/+20
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: mrelease_test: conform test to TAP format outputMuhammad Usama Anjum1-47/+33
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: mlock2-tests: conform test to TAP format outputMuhammad Usama Anjum2-175/+118
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. I've done some cleanups as well. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: mlock-random-test: conform test to TAP format outputMuhammad Usama Anjum1-82/+54
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: map_populate: conform test to TAP format outputMuhammad Usama Anjum1-14/+23
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Minor cleanups have also been included. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: map_hugetlb: conform test to TAP format outputMuhammad Usama Anjum1-22/+20
Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: map_fixed_noreplace: conform test to TAP format outputMuhammad Usama Anjum1-65/+31
Patch series "conform tests to TAP format output", v2. This patch (of 12): Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. While at it, convert commenting style from // to /**/. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Muhammad Usama Anjum <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftets/damon: prepare for monitor_on file renamingSeongJae Park3-4/+26
Following change will rename 'monitor_on' DAMON debugfs file to 'monitor_on_DEPRECATED', to make the deprecation unignorable in runtime. Since it could make DAMON selftests fail and disturb future bisects, update DAMON selftests to support the change. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: SeongJae Park <[email protected]> Cc: Alex Shi <[email protected]> Cc: Hu Haowen <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Yanteng Si <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests/mm: new test that steals pagesBreno Leitao3-0/+126
This test stresses the race between of madvise(DONTNEED), a page fault and a parallel huge page mmap, which should fail due to lack of available page available for mapping. This test case must run on a system with one and only one huge page available. # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages During setup, the test allocates the only available page, and starts three threads: - thread 1: * madvise(MADV_DONTNEED) on the allocated huge page - thread 2: * Write to the allocated huge page - thread 3: * Tries to allocated (steal) an extra huge page (which is not available) thread 3 should never succeed in the allocation, since the only huge page was never unmapped, and should be reserved. Touching the old page after thread3 allocation will raise a SIGBUS. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Breno Leitao <[email protected]> Cc: Mike Rapoport (IBM) <[email protected]> Cc: Muchun Song <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vegard Nossum <[email protected]> Cc: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22selftests: mm: perform some system cleanup before using hugepagesNico Pache1-0/+9
When running with CATEGORY= (thp | hugetlb) we see a large numbers of tests failing. These failures are due to not being able to allocate a hugepage and normally occur on memory contrainted systems or when using large page sizes. drop_cache and compact_memory before the tests for a higher chance at a successful hugepage allocation. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Nico Pache <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22bpf: Clarify batch lookup/lookup_and_delete semanticsMartin Kelly2-6/+17
The batch lookup and lookup_and_delete APIs have two parameters, in_batch and out_batch, to facilitate iterative lookup/lookup_and_deletion operations for supported maps. Except NULL for in_batch at the start of these two batch operations, both parameters need to point to memory equal or larger than the respective map key size, except for various hashmaps (hash, percpu_hash, lru_hash, lru_percpu_hash) where the in_batch/out_batch memory size should be at least 4 bytes. Document these semantics to clarify the API. Signed-off-by: Martin Kelly <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-02-22selftests/memfd: delete unused declarationsGreg Thelen1-10/+0
Commit 32d118ad50a5 ("selftests/memfd: add tests for F_SEAL_EXEC"): - added several unused 'nbytes' local variables Commit 6469b66e3f5a ("selftests: improve vm.memfd_noexec sysctl tests"): - orphaned 'newpid_thread_fn2()' forward declaration - orphaned 'join_newpid_thread()' forward declaration - added unused 'pid' local in sysctl_simple_child() - orphaned 'fd' local in sysctl_simple_child() - added unused 'fd' in sysctl_nested_child() Delete the unused locals and forward declarations. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Thelen <[email protected]> Cc: Aleksa Sarai <[email protected]> Cc: Daniel Verkamp <[email protected]> Cc: Jeff Xu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22tools/mm: add thpmaps script to dump THP usage infoRyan Roberts2-4/+680
With the proliferation of large folios for file-backed memory, and more recently the introduction of multi-size THP for anonymous memory, it is becoming useful to be able to see exactly how large folios are mapped into processes. For some architectures (e.g. arm64), if most memory is mapped using contpte-sized and -aligned blocks, TLB usage can be optimized so it's useful to see where these requirements are and are not being met. thpmaps is a Python utility that reads /proc/<pid>/smaps, /proc/<pid>/pagemap and /proc/kpageflags to print information about how transparent huge pages (both file and anon) are mapped to a specified process or cgroup. It aims to help users debug and optimize their workloads. In future we may wish to introduce stats directly into the kernel (e.g. smaps or similar), but for now this provides a short term solution without the need to introduce any new ABI. Run with help option for a full listing of the arguments: # ./thpmaps --help --8<-- usage: thpmaps [-h] [--pid pid | --cgroup path] [--rollup] [--cont size[KMG]] [--inc-smaps] [--inc-empty] [--periodic sleep_ms] Prints information about how transparent huge pages are mapped, either system-wide, or for a specified process or cgroup. When run with --pid, the user explicitly specifies the set of pids to scan. e.g. "--pid 10 [--pid 134 ...]". When run with --cgroup, the user passes either a v1 or v2 cgroup and all pids that belong to the cgroup subtree are scanned. When run with neither --pid nor --cgroup, the full set of pids on the system is gathered from /proc and scanned as if the user had provided "--pid 1 --pid 2 ...". A default set of statistics is always generated for THP mappings. However, it is also possible to generate additional statistics for "contiguous block mappings" where the block size is user-defined. Statistics are maintained independently for anonymous and file-backed (pagecache) memory and are shown both in kB and as a percentage of either total anonymous or total file-backed memory as appropriate. THP Statistics -------------- Statistics are always generated for fully- and contiguously-mapped THPs whose mapping address is aligned to their size, for each <size> supported by the system. Separate counters describe THPs mapped by PTE vs those mapped by PMD. (Although note a THP can only be mapped by PMD if it is PMD-sized): - anon-thp-pte-aligned-<size>kB - file-thp-pte-aligned-<size>kB - anon-thp-pmd-aligned-<size>kB - file-thp-pmd-aligned-<size>kB Similarly, statistics are always generated for fully- and contiguously- mapped THPs whose mapping address is *not* aligned to their size, for each <size> supported by the system. Due to the unaligned mapping, it is impossible to map by PMD, so there are only PTE counters for this case: - anon-thp-pte-unaligned-<size>kB - file-thp-pte-unaligned-<size>kB Statistics are also always generated for mapped pages that belong to a THP but where the is THP is *not* fully- and contiguously- mapped. These "partial" mappings are all counted in the same counter regardless of the size of the THP that is partially mapped: - anon-thp-pte-partial - file-thp-pte-partial Contiguous Block Statistics --------------------------- An optional, additional set of statistics is generated for every contiguous block size specified with `--cont <size>`. These statistics show how much memory is mapped in contiguous blocks of <size> and also aligned to <size>. A given contiguous block must all belong to the same THP, but there is no requirement for it to be the *whole* THP. Separate counters describe contiguous blocks mapped by PTE vs those mapped by PMD: - anon-cont-pte-aligned-<size>kB - file-cont-pte-aligned-<size>kB - anon-cont-pmd-aligned-<size>kB - file-cont-pmd-aligned-<size>kB As an example, if monitoring 64K contiguous blocks (--cont 64K), there are a number of sources that could provide such blocks: a fully- and contiguously-mapped 64K THP that is aligned to a 64K boundary would provide 1 block. A fully- and contiguously-mapped 128K THP that is aligned to at least a 64K boundary would provide 2 blocks. Or a 128K THP that maps its first 100K, but contiguously and starting at a 64K boundary would provide 1 block. A fully- and contiguously-mapped 2M THP would provide 32 blocks. There are many other possible permutations. options: -h, --help show this help message and exit --pid pid Process id of the target process. Maybe issued multiple times to scan multiple processes. --pid and --cgroup are mutually exclusive. If neither are provided, all processes are scanned to provide system-wide information. --cgroup path Path to the target cgroup in sysfs. Iterates over every pid in the cgroup and its children. --pid and --cgroup are mutually exclusive. If neither are provided, all processes are scanned to provide system-wide information. --rollup Sum the per-vma statistics to provide a summary over the whole system, process or cgroup. --cont size[KMG] Adds stats for memory that is mapped in contiguous blocks of <size> and also aligned to <size>. May be issued multiple times to track multiple sized blocks. Useful to infer e.g. arm64 contpte and hpa mappings. Size must be a power-of-2 number of pages. --inc-smaps Include all numerical, additive /proc/<pid>/smaps stats in the output. --inc-empty Show all statistics including those whose value is 0. --periodic sleep_ms Run in a loop, polling every sleep_ms milliseconds. Requires root privilege to access pagemap and kpageflags. --8<-- Example command to summarise fully and partially mapped THPs and 64K contiguous blocks over all VMAs in all processes in the system (--inc-empty forces printing stats that are 0): # ./thpmaps --cont 64K --rollup --inc-empty --8<-- anon-thp-pmd-aligned-2048kB: 139264 kB ( 6%) file-thp-pmd-aligned-2048kB: 0 kB ( 0%) anon-thp-pte-aligned-16kB: 0 kB ( 0%) anon-thp-pte-aligned-32kB: 0 kB ( 0%) anon-thp-pte-aligned-64kB: 72256 kB ( 3%) anon-thp-pte-aligned-128kB: 0 kB ( 0%) anon-thp-pte-aligned-256kB: 0 kB ( 0%) anon-thp-pte-aligned-512kB: 0 kB ( 0%) anon-thp-pte-aligned-1024kB: 0 kB ( 0%) anon-thp-pte-aligned-2048kB: 0 kB ( 0%) anon-thp-pte-unaligned-16kB: 0 kB ( 0%) anon-thp-pte-unaligned-32kB: 0 kB ( 0%) anon-thp-pte-unaligned-64kB: 0 kB ( 0%) anon-thp-pte-unaligned-128kB: 0 kB ( 0%) anon-thp-pte-unaligned-256kB: 0 kB ( 0%) anon-thp-pte-unaligned-512kB: 0 kB ( 0%) anon-thp-pte-unaligned-1024kB: 0 kB ( 0%) anon-thp-pte-unaligned-2048kB: 0 kB ( 0%) anon-thp-pte-partial: 63232 kB ( 3%) file-thp-pte-aligned-16kB: 809024 kB (47%) file-thp-pte-aligned-32kB: 43168 kB ( 3%) file-thp-pte-aligned-64kB: 98496 kB ( 6%) file-thp-pte-aligned-128kB: 17536 kB ( 1%) file-thp-pte-aligned-256kB: 0 kB ( 0%) file-thp-pte-aligned-512kB: 0 kB ( 0%) file-thp-pte-aligned-1024kB: 0 kB ( 0%) file-thp-pte-aligned-2048kB: 0 kB ( 0%) file-thp-pte-unaligned-16kB: 21712 kB ( 1%) file-thp-pte-unaligned-32kB: 704 kB ( 0%) file-thp-pte-unaligned-64kB: 896 kB ( 0%) file-thp-pte-unaligned-128kB: 44928 kB ( 3%) file-thp-pte-unaligned-256kB: 0 kB ( 0%) file-thp-pte-unaligned-512kB: 0 kB ( 0%) file-thp-pte-unaligned-1024kB: 0 kB ( 0%) file-thp-pte-unaligned-2048kB: 0 kB ( 0%) file-thp-pte-partial: 9252 kB ( 1%) anon-cont-pmd-aligned-64kB: 139264 kB ( 6%) file-cont-pmd-aligned-64kB: 0 kB ( 0%) anon-cont-pte-aligned-64kB: 100672 kB ( 4%) file-cont-pte-aligned-64kB: 161856 kB ( 9%) --8<-- Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: Barry Song <[email protected]> Cc: Alistair Popple <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: John Hubbard <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: William Kucharski <[email protected]> Cc: Zenghui Yu <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-22Merge tag 'net-6.8.0-rc6' of ↵Linus Torvalds16-97/+345
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - af_unix: fix another unix GC hangup Previous releases - regressions: - core: fix a possible AF_UNIX deadlock - bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready() - netfilter: nft_flow_offload: release dst in case direct xmit path is used - bridge: switchdev: ensure MDB events are delivered exactly once - l2tp: pass correct message length to ip6_append_data - dccp/tcp: unhash sk from ehash for tb2 alloc failure after check_estalblished() - tls: fixes for record type handling with PEEK - devlink: fix possible use-after-free and memory leaks in devlink_init() Previous releases - always broken: - bpf: fix an oops when attempting to read the vsyscall page through bpf_probe_read_kernel - sched: act_mirred: use the backlog for mirred ingress - netfilter: nft_flow_offload: fix dst refcount underflow - ipv6: sr: fix possible use-after-free and null-ptr-deref - mptcp: fix several data races - phonet: take correct lock to peek at the RX queue Misc: - handful of fixes and reliability improvements for selftests" * tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits) l2tp: pass correct message length to ip6_append_data net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY selftests: ioam: refactoring to align with the fix Fix write to cloned skb in ipv6_hop_ioam() phonet/pep: fix racy skb_queue_empty() use phonet: take correct lock to peek at the RX queue net: sparx5: Add spinlock for frame transmission from CPU net/sched: flower: Add lock protection when remove filter handle devlink: fix port dump cmd type net: stmmac: Fix EST offset for dwmac 5.10 tools: ynl: don't leak mcast_groups on init error tools: ynl: make sure we always pass yarg to mnl_cb_run net: mctp: put sock on tag allocation failure netfilter: nf_tables: use kzalloc for hook allocation netfilter: nf_tables: register hooks last when adding new chain/flowtable netfilter: nft_flow_offload: release dst in case direct xmit path is used netfilter: nft_flow_offload: reset dst in route object after setting up flow netfilter: nf_tables: set dormant flag on hook register failure selftests: tls: add test for peeking past a record of a different type selftests: tls: add test for merging of same-type control messages ...
2024-02-22perf tests: Add option to run tests in parallelIan Rogers1-99/+215
By default tests are forked, add an option (-p or --parallel) so that the forked tests are all started in parallel and then their output gathered serially. This is opt-in as running in parallel can cause test flakes. Rather than fork within the code, the start_command/finish_command from libsubcmd are used. This changes how stderr and stdout are handled. The child stderr and stdout are always read to avoid the child blocking. If verbose is 1 (-v) then if the test fails the child stdout and stderr are displayed. If the verbose is >1 (e.g. -vv) then the stdout and stderr from the child are immediately displayed. An unscientific test on my laptop shows the wall clock time for perf test without parallel being 5 minutes 21 seconds and with parallel (-p) being 1 minute 50 seconds. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf tests: Run time generate shell test suitesIan Rogers3-149/+80
Rather than special shell test logic, do a single pass to create an array of test suites. Hold the shell test file name in the test suite priv field. This makes the special shell test logic in builtin-test.c redundant so remove it. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf tests: Use scandirat for shell script findingIan Rogers3-71/+95
Avoid filename appending buffers by using openat, faccessat and scandirat more widely. Turn the script's path back to a file name using readlink from /proc/<pid>/fd/<fd>. Read the script's description using api/io.h to avoid fdopen conversions. Whilst reading perform additional sanity checks on the script's contents. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf test: Rename builtin-test-list and add missed header guardIan Rogers4-3/+7
builtin-test-list is primarily concerned with shell script tests. Rename the file to better reflect this and add a missed header guard. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22tools subcmd: Add a no exec function call optionIan Rogers2-0/+4
Tools like perf fork tests in case they crash, but they don't want to exec a full binary. Add an option to call a function rather than do an exec. The child process exits with the result of the function call and is passed the struct of the run_command, things like container_of can then allow the child process function to determine additional arguments. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf tests: Avoid fork in perf_has_symbol testIan Rogers1-1/+1
perf test -vv Symbols is used to indentify symbols within the perf binary. Add the -F flag so that the test command doesn't fork the test before running. This removes a little overhead. Acked-by: Adrian Hunter <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf list: Add scandirat compatibility functionIan Rogers3-9/+31
scandirat is used during the printing of tracepoint events but may be missing from certain libcs. Add a compatibility implementation that uses the symlink of an fd in /proc as a path for the reliably present scandir. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf thread_map: Skip exited threads when scanning /procIan Rogers1-5/+4
Scanning /proc is inherently racy. Scanning /proc/pid/task within that is also racy as the pid can terminate. Rather than failing in __thread_map__new_all_cpus, skip pids for such failures. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Justin Stitt <[email protected]> Cc: Bill Wendling <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf list: fix short description for some cache eventsThomas Richter1-31/+31
Correct the short description of the following events: DCW_REQ, DCW_REQ_CHIP_HIT, DCW_REQ_DRAWER_HIT, DCW_REQ_IV, DCW_ON_CHIP, DCW_ON_CHIP_IV, DCW_ON_CHIP_CHIP_HIT, DCW_ON_CHIP_DRAWER_HIT, CW_ON_MODULE, DCW_ON_DRAWER, DCW_OFF_DRAWER, IDCW_ON_MODULE_IV, IDCW_ON_MODULE_CHIP_HIT, IDCW_ON_MODULE_DRAWER_HIT, IDCW_ON_DRAWER_IV, IDCW_ON_DRAWER_CHIP_HIT, IDCW_ON_DRAWER_DRAWER_HIT, IDCW_OFF_DRAWER_IV, IDCW_OFF_DRAWER_CHIP_HIT, IDCW_OFF_DRAWER_DRAWER_HIT, ICW_REQ, ICW_REQ_IV, CW_REQ_CHIP_HIT, ICW_REQ_DRAWER_HIT, ICW_ON_CHIP, ICW_ON_CHIP_IV, ICW_ON_CHIP_CHIP_HIT, ICW_ON_CHIP_DRAWER_HIT, ICW_ON_MODULE and ICW_OFF_DRAWER. The second Cache should be L2-Cache. Output before (display diff of the first four events) # perf list -d DCW_REQ [Directory Write Level 1 Data Cache from Cache. Unit: cpum_cf] DCW_REQ_CHIP_HIT [Directory Write Level 1 Data Cache from Cache with Chip HP \ Hit. Unit: cpum_cf] DCW_REQ_DRAWER_HIT [Directory Write Level 1 Data Cache from Cache with Drawer \ HP Hit. Unit: cpum_cf] DCW_REQ_IV [Directory Write Level 1 Data Cache from Cache with Intervention. \ Unit: cpum_cf] Output after: # perf list -d DCW_REQ [Directory Write Level 1 Data Cache from L2-Cache. Unit: cpum_cf] DCW_REQ_CHIP_HIT [Directory Write Level 1 Data Cache from L2-Cache with Chip HP \ Hit. Unit: cpum_cf] DCW_REQ_DRAWER_HIT [Directory Write Level 1 Data Cache from L2-Cache with Drawer \ HP Hit. Unit: cpum_cf] DCW_REQ_IV [Directory Write Level 1 Data Cache from L2-Cache with \ Intervention. Unit: cpum_cf] Fixes: 7f76b3113068 ("perf list: Add IBM z16 event description for s390") Reported-by: Andreas Krebbel <[email protected]> Signed-off-by: Thomas Richter <[email protected]> Acked-by: Andreas Krebbel <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf stat: Fix metric-only aggregation indexIan Rogers1-2/+7
Aggregation index was being computed using the evsel's cpumap which may have a different (typically the same or fewer) entries. Before: ``` $ perf stat --metric-only -A -M memory_bandwidth_total -a sleep 1 Performance counter stats for 'system wide': MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total CPU0 12.8 0.0 12.9 12.7 0.0 12.6 CPU1 1.007806367 seconds time elapsed ``` After: ``` $ perf stat --metric-only -A -M memory_bandwidth_total -a sleep 1 Performance counter stats for 'system wide': MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total MB/s memory_bandwidth_total CPU0 15.4 0.0 15.3 15.0 0.0 14.9 CPU18 0.0 0.0 13.5 5.2 0.0 11.9 1.007858736 seconds time elapsed ``` Signed-off-by: Ian Rogers <[email protected]> | Acked-by: Namhyung Kim <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Kaige Ye <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf metrics: Compute unmerged uncore metrics individuallyIan Rogers2-4/+29
When merging counts from multiple uncore PMUs the metric is only computed for the metric leader. When merging/aggregation is disabled, prior to this patch just the leader's metric would be computed. Fix this by computing the metric for each PMU. On a SkylakeX: Before: ``` $ perf stat -A -M memory_bandwidth_total -a sleep 1 Performance counter stats for 'system wide': CPU0 82,217 UNC_M_CAS_COUNT.RD [uncore_imc_0] # 9.2 MB/s memory_bandwidth_total CPU18 0 UNC_M_CAS_COUNT.RD [uncore_imc_0] # 0.0 MB/s memory_bandwidth_total CPU0 61,395 UNC_M_CAS_COUNT.WR [uncore_imc_0] CPU18 0 UNC_M_CAS_COUNT.WR [uncore_imc_0] CPU0 0 UNC_M_CAS_COUNT.RD [uncore_imc_1] CPU18 0 UNC_M_CAS_COUNT.RD [uncore_imc_1] CPU0 0 UNC_M_CAS_COUNT.WR [uncore_imc_1] CPU18 0 UNC_M_CAS_COUNT.WR [uncore_imc_1] CPU0 81,570 UNC_M_CAS_COUNT.RD [uncore_imc_2] CPU18 113,886 UNC_M_CAS_COUNT.RD [uncore_imc_2] CPU0 62,330 UNC_M_CAS_COUNT.WR [uncore_imc_2] CPU18 66,942 UNC_M_CAS_COUNT.WR [uncore_imc_2] CPU0 75,489 UNC_M_CAS_COUNT.RD [uncore_imc_3] CPU18 27,958 UNC_M_CAS_COUNT.RD [uncore_imc_3] CPU0 55,864 UNC_M_CAS_COUNT.WR [uncore_imc_3] CPU18 38,727 UNC_M_CAS_COUNT.WR [uncore_imc_3] CPU0 0 UNC_M_CAS_COUNT.RD [uncore_imc_4] CPU18 0 UNC_M_CAS_COUNT.RD [uncore_imc_4] CPU0 0 UNC_M_CAS_COUNT.WR [uncore_imc_4] CPU18 0 UNC_M_CAS_COUNT.WR [uncore_imc_4] CPU0 75,423 UNC_M_CAS_COUNT.RD [uncore_imc_5] CPU18 104,527 UNC_M_CAS_COUNT.RD [uncore_imc_5] CPU0 57,596 UNC_M_CAS_COUNT.WR [uncore_imc_5] CPU18 56,777 UNC_M_CAS_COUNT.WR [uncore_imc_5] CPU0 1,003,440,851 ns duration_time 1.003440851 seconds time elapsed ``` After: ``` $ perf stat -A -M memory_bandwidth_total -a sleep 1 Performance counter stats for 'system wide': CPU0 88,968 UNC_M_CAS_COUNT.RD [uncore_imc_0] # 9.5 MB/s memory_bandwidth_total CPU18 0 UNC_M_CAS_COUNT.RD [uncore_imc_0] # 0.0 MB/s memory_bandwidth_total CPU0 59,498 UNC_M_CAS_COUNT.WR [uncore_imc_0] CPU18 0 UNC_M_CAS_COUNT.WR [uncore_imc_0] CPU0 0 UNC_M_CAS_COUNT.RD [uncore_imc_1] # 0.0 MB/s memory_bandwidth_total CPU18 0 UNC_M_CAS_COUNT.RD [uncore_imc_1] # 0.0 MB/s memory_bandwidth_total CPU0 0 UNC_M_CAS_COUNT.WR [uncore_imc_1] CPU18 0 UNC_M_CAS_COUNT.WR [uncore_imc_1] CPU0 88,635 UNC_M_CAS_COUNT.RD [uncore_imc_2] # 9.5 MB/s memory_bandwidth_total CPU18 117,975 UNC_M_CAS_COUNT.RD [uncore_imc_2] # 11.5 MB/s memory_bandwidth_total CPU0 60,829 UNC_M_CAS_COUNT.WR [uncore_imc_2] CPU18 62,105 UNC_M_CAS_COUNT.WR [uncore_imc_2] CPU0 82,238 UNC_M_CAS_COUNT.RD [uncore_imc_3] # 8.7 MB/s memory_bandwidth_total CPU18 22,906 UNC_M_CAS_COUNT.RD [uncore_imc_3] # 3.6 MB/s memory_bandwidth_total CPU0 53,959 UNC_M_CAS_COUNT.WR [uncore_imc_3] CPU18 32,990 UNC_M_CAS_COUNT.WR [uncore_imc_3] CPU0 0 UNC_M_CAS_COUNT.RD [uncore_imc_4] # 0.0 MB/s memory_bandwidth_total CPU18 0 UNC_M_CAS_COUNT.RD [uncore_imc_4] # 0.0 MB/s memory_bandwidth_total CPU0 0 UNC_M_CAS_COUNT.WR [uncore_imc_4] CPU18 0 UNC_M_CAS_COUNT.WR [uncore_imc_4] CPU0 83,595 UNC_M_CAS_COUNT.RD [uncore_imc_5] # 8.9 MB/s memory_bandwidth_total CPU18 110,151 UNC_M_CAS_COUNT.RD [uncore_imc_5] # 10.5 MB/s memory_bandwidth_total CPU0 56,540 UNC_M_CAS_COUNT.WR [uncore_imc_5] CPU18 53,816 UNC_M_CAS_COUNT.WR [uncore_imc_5] CPU0 1,003,353,416 ns duration_time ``` Signed-off-by: Ian Rogers <[email protected]> | Acked-by: Namhyung Kim <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Kaige Ye <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22perf stat: Pass fewer metric argumentsIan Rogers1-20/+18
Pass metric_expr and evsel rather than specific variables from the struct, thereby reducing the number of arguments. This will enable later fixes. To reduce the size of the diff, local variables are added to match the previous parameter names. This isn't done in the case of "name" as evsel->name is more intention revealing. A whitespace issue is also addressed. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Kaige Ye <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-22selftests/bpf: update tcp_custom_syncookie to use scalar packet offsetEduard Zingerman1-30/+53
This commit updates tcp_custom_syncookie.c:tcp_parse_option() to use explicit packet offset (ctx->off) for packet access instead of ever moving pointer (ctx->ptr), this reduces verification complexity: - the tcp_parse_option() is passed as a callback to bpf_loop(); - suppose a checkpoint is created each time at function entry; - the ctx->ptr is tracked by verifier as PTR_TO_PACKET; - the ctx->ptr is incremented in tcp_parse_option(), thus umax_value field tracked for it is incremented as well; - on each next iteration of tcp_parse_option() checkpoint from a previous iteration can't be reused for state pruning, because PTR_TO_PACKET registers are considered equivalent only if old->umax_value >= cur->umax_value; - on the other hand, the ctx->off is a SCALAR, subject to widen_imprecise_scalars(); - it's exact bounds are eventually forgotten and it is tracked as unknown scalar at entry to tcp_parse_option(); - hence checkpoints created at the start of the function eventually converge. The change is similar to one applied in [0] to xdp_synproxy_kern.c. Comparing before and after with veristat yields following results: File Insns (A) Insns (B) Insns (DIFF) ------------------------------- --------- --------- ----------------- test_tcp_custom_syncookie.bpf.o 466657 12423 -454234 (-97.34%) [0] commit 977bc146d4eb ("selftests/bpf: track tcp payload offset as scalar in xdp_synproxy") Acked-by: Yonghong Song <[email protected]> Signed-off-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>