aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2022-11-11Merge tag 'perf-tools-fixes-for-v6.1-2-2022-11-10' of ↵Linus Torvalds4-4/+12
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix 'perf stat' crash with --per-node --metric-only in CSV mode, due to the AGGR_NODE slot in the 'aggr_header_csv' array not being set. - Fix printing prefix in CSV output of 'perf stat' metrics in interval mode (-I), where an extra separator was being added to the start of some lines. - Fix skipping branch stack sampling 'perf test' entry, that was using both --branch-any and --branch-filter, which can't be used together. * tag 'perf-tools-fixes-for-v6.1-2-2022-11-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf tools: Add the include/perf/ directory to .gitignore perf test: Fix skipping branch stack sampling test perf stat: Fix printing os->prefix in CSV metrics output perf stat: Fix crash with --per-node --metric-only in CSV mode
2022-11-11Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-15/+83
Pull kvm "This is a pretty large diffstat for this time of the release. The main culprit is a reorganization of the AMD assembly trampoline, allowing percpu variables to be accessed early. This is needed for the return stack depth tracking retbleed mitigation that will be in 6.2, but it also makes it possible to tighten the IBRS restore on vmexit. The latter change is a long tail of the spectrev2/retbleed patches (the corresponding Intel change was simpler and went in already last June), which is why I am including it right now instead of sharing a topic branch with tip. Being assembly and being rich in comments makes the line count balloon a bit, but I am pretty confident in the change (famous last words) because the reorganization actually makes everything simpler and more understandable than before. It has also had external review and has been tested on the aforementioned 6.2 changes, which explode quite brutally without the fix. Apart from this, things are pretty normal. s390: - PCI fix - PV clock fix x86: - Fix clash between PMU MSRs and other MSRs - Prepare SVM assembly trampoline for 6.2 retbleed mitigation and for... - ... tightening IBRS restore on vmexit, moving it before the first RET or indirect branch - Fix log level for VMSA dump - Block all page faults during kvm_zap_gfn_range() Tools: - kvm_stat: fix incorrect detection of debugfs - kvm_stat: update vmexit definitions" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: Block all page faults during kvm_zap_gfn_range() KVM: x86/pmu: Limit the maximum number of supported AMD GP counters KVM: x86/pmu: Limit the maximum number of supported Intel GP counters KVM: x86/pmu: Do not speculatively query Intel GP PMCs that don't exist yet KVM: SVM: Only dump VMSA to klog at KERN_DEBUG level tools/kvm_stat: update exit reasons for vmx/svm/aarch64/userspace tools/kvm_stat: fix incorrect detection of debugfs x86, KVM: remove unnecessary argument to x86_virt_spec_ctrl and callers KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly KVM: SVM: restore host save area from assembly KVM: SVM: move guest vmsave/vmload back to assembly KVM: SVM: do not allocate struct svm_cpu_data dynamically KVM: SVM: remove dead field from struct svm_cpu_data KVM: SVM: remove unused field from struct vcpu_svm KVM: SVM: retrieve VMCB from assembly KVM: SVM: adjust register allocation for __svm_vcpu_run() KVM: SVM: replace regs argument of __svm_vcpu_run() with vcpu_svm KVM: x86: use a separate asm-offsets.c file KVM: s390: pci: Fix allocation size of aift kzdev elements KVM: s390: pv: don't allow userspace to set the clock under PV
2022-11-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski13-37/+398
drivers/net/can/pch_can.c ae64438be192 ("can: dev: fix skb drop check") 1dd1b521be85 ("can: remove obsolete PCH CAN driver") https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-10Merge tag 'net-6.1-rc5' of ↵Linus Torvalds8-10/+99
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wifi, can and bpf. Current release - new code bugs: - can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet Previous releases - regressions: - bpf, sockmap: fix the sk->sk_forward_alloc warning - wifi: mac80211: fix general-protection-fault in ieee80211_subif_start_xmit() - can: af_can: fix NULL pointer dereference in can_rx_register() - can: dev: fix skb drop check, avoid o-o-b access - nfnetlink: fix potential dead lock in nfnetlink_rcv_msg() Previous releases - always broken: - bpf: fix wrong reg type conversion in release_reference() - gso: fix panic on frag_list with mixed head alloc types - wifi: brcmfmac: fix buffer overflow in brcmf_fweh_event_worker() - wifi: mac80211: set TWT Information Frame Disabled bit as 1 - eth: macsec offload related fixes, make sure to clear the keys from memory - tun: fix memory leaks in the use of napi_get_frags - tun: call napi_schedule_prep() to ensure we own a napi - tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent - ipv6: addrlabel: fix infoleak when sending struct ifaddrlblmsg to network - tipc: fix a msg->req tlv length check - sctp: clear out_curr if all frag chunks of current msg are pruned, avoid list corruption - mctp: fix an error handling path in mctp_init(), avoid leaks" * tag 'net-6.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits) eth: sp7021: drop free_netdev() from spl2sw_init_netdev() MAINTAINERS: Move Vivien to CREDITS net: macvlan: fix memory leaks of macvlan_common_newlink ethernet: tundra: free irq when alloc ring failed in tsi108_open() net: mv643xx_eth: disable napi when init rxq or txq failed in mv643xx_eth_open() ethernet: s2io: disable napi when start nic failed in s2io_card_up() net: atlantic: macsec: clear encryption keys from the stack net: phy: mscc: macsec: clear encryption keys when freeing a flow stmmac: dwmac-loongson: fix missing of_node_put() while module exiting stmmac: dwmac-loongson: fix missing pci_disable_device() in loongson_dwmac_probe() stmmac: dwmac-loongson: fix missing pci_disable_msi() while module exiting cxgb4vf: shut down the adapter when t4vf_update_port_info() failed in cxgb4vf_open() mctp: Fix an error handling path in mctp_init() stmmac: intel: Update PCH PTP clock rate from 200MHz to 204.8MHz net: cxgb3_main: disable napi when bind qsets failed in cxgb_up() net: cpsw: disable napi in cpsw_ndo_open() iavf: Fix VF driver counting VLAN 0 filters ice: Fix spurious interrupt during removal of trusted VF net/mlx5e: TC, Fix slab-out-of-bounds in parse_tc_actions net/mlx5e: E-Switch, Fix comparing termination table instance ...
2022-11-10KVM: selftests: aarch64: Add mix of tests into page_fault_testRicardo Koller1-0/+155
Add some mix of tests into page_fault_test: memory regions with all the pairwise combinations of read-only, userfaultfd, and dirty-logging. For example, writing into a read-only region which has a hole handled with userfaultfd. Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: aarch64: Add readonly memslot tests into page_fault_testRicardo Koller1-1/+101
Add some readonly memslot tests into page_fault_test. Mark the data and/or page-table memory regions as readonly, perform some accesses, and check that the right fault is triggered when expected (e.g., a store with no write-back should lead to an mmio exit). Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: aarch64: Add dirty logging tests into page_fault_testRicardo Koller1-0/+76
Add some dirty logging tests into page_fault_test. Mark the data and/or page-table memory regions for dirty logging, perform some accesses, and check that the dirty log bits are set or clean when expected. Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: aarch64: Add userfaultfd tests into page_fault_testRicardo Koller1-0/+187
Add some userfaultfd tests into page_fault_test. Punch holes into the data and/or page-table memslots, perform some accesses, and check that the faults are taken (or not taken) when expected. Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: aarch64: Add aarch64/page_fault_testRicardo Koller4-0/+604
Add a new test for stage 2 faults when using different combinations of guest accesses (e.g., write, S1PTW), backing source type (e.g., anon) and types of faults (e.g., read on hugetlbfs with a hole). The next commits will add different handling methods and more faults (e.g., uffd and dirty logging). This first commit starts by adding two sanity checks for all types of accesses: AF setting by the hw, and accessing memslots with holes. Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Use the right memslot for code, page-tables, and data ↵Ricardo Koller7-40/+65
allocations Now that kvm_vm allows specifying different memslots for code, page tables, and data, use the appropriate memslot when making allocations in common/libraty code. Change them accordingly: - code (allocated by lib/elf) use the CODE memslot - stacks, exception tables, and other core data pages (like the TSS in x86) use the DATA memslot - page tables and the PGD use the PT memslot - test data (anything allocated with vm_vaddr_alloc()) uses the TEST_DATA memslot No functional change intended. All allocators keep using memslot #0. Cc: Sean Christopherson <[email protected]> Cc: Andrew Jones <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Fix alignment in virt_arch_pgd_alloc() and vm_vaddr_alloc()Ricardo Koller2-24/+30
Refactor virt_arch_pgd_alloc() and vm_vaddr_alloc() in both RISC-V and aarch64 to fix the alignment of parameters in a couple of calls. This will make it easier to fix the alignment in a future commit that adds an extra parameter (that happens to be very long). No functional change intended. Suggested-by: Sean Christopherson <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Add vm->memslots[] and enum kvm_mem_region_typeRicardo Koller2-10/+34
The vm_create() helpers are hardcoded to place most page types (code, page-tables, stacks, etc) in the same memslot #0, and always backed with anonymous 4K. There are a couple of issues with that. First, tests willing to differ a bit, like placing page-tables in a different backing source type must replicate much of what's already done by the vm_create() functions. Second, the hardcoded assumption of memslot #0 holding most things is spread everywhere; this makes it very hard to change. Fix the above issues by having selftests specify how they want memory to be laid out. Start by changing ____vm_create() to not create memslot #0; a test (to come) will specify all memslots used by the VM. Then, add the vm->memslots[] array to specify the right memslot for different memory allocators, e.g.,: lib/elf should use the vm->[MEM_REGION_CODE] memslot. This will be used as a way to specify the page-tables memslots (to be backed by huge pages for example). There is no functional change intended. The current commit lays out memory exactly as before. A future commit will change the allocators to get the region they should be using, e.g.,: like the page table allocators using the pt memslot. Cc: Sean Christopherson <[email protected]> Cc: Andrew Jones <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Stash backing_src_type in struct userspace_mem_regionRicardo Koller2-0/+2
Add the backing_src_type into struct userspace_mem_region. This struct already stores a lot of info about memory regions, except the backing source type. This info will be used by a future commit in order to determine the method for punching a hole. Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10tools: Copy bitfield.h from the kernel sourcesRicardo Koller1-0/+176
Copy bitfield.h from include/linux/bitfield.h. A subsequent change will make use of some FIELD_{GET,PREP} macros defined in this header. The header was copied as-is, no changes needed. Cc: Jakub Kicinski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: aarch64: Construct DEFAULT_MAIR_EL1 using sysreg.h macrosRicardo Koller2-7/+20
Define macros for memory type indexes and construct DEFAULT_MAIR_EL1 with macros from asm/sysreg.h. The index macros can then be used when constructing PTEs (instead of using raw numbers). Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Add missing close and munmap in __vm_mem_region_delete()Ricardo Koller1-0/+6
Deleting a memslot (when freeing a VM) is not closing the backing fd, nor it's unmapping the alias mapping. Fix by adding the missing close and munmap. Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: aarch64: Add virt_get_pte_hva() library functionRicardo Koller2-3/+12
Add a library function to get the PTE (a host virtual address) of a given GVA. This will be used in a future commit by a test to clear and check the access flag of a particular page. Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Add a userfaultfd libraryRicardo Koller4-198/+262
Move the generic userfaultfd code out of demand_paging_test.c into a common library, userfaultfd_util. This library consists of a setup and a stop function. The setup function starts a thread for handling page faults using the handler callback function. This setup returns a uffd_desc object which is then used in the stop function (to wait and destroy the threads). Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Ben Gardon <[email protected]> Signed-off-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Test with every breakpoint/watchpointReiji Watanabe1-12/+42
Currently, the debug-exceptions test always uses only {break,watch}point#0 and the highest numbered context-aware breakpoint. Modify the test to use all {break,watch}points and context-aware breakpoints supported on the system. Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Add a test case for a linked watchpointReiji Watanabe1-0/+35
Currently, the debug-exceptions test doesn't have a test case for a linked watchpoint. Add a test case for the linked watchpoint to the test. The new test case uses the highest numbered context-aware breakpoint (for Context ID match), and the watchpoint#0, which is linked to the context-aware breakpoint. Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Add a test case for a linked breakpointReiji Watanabe1-6/+57
Currently, the debug-exceptions test doesn't have a test case for a linked breakpoint. Add a test case for the linked breakpoint to the test. The new test case uses a pair of breakpoints. One is the higiest numbered context-aware breakpoint (for Context ID match), and the other one is the breakpoint#0 (for Address Match), which is linked to the context-aware breakpoint. Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Change debug_version() to take ID_AA64DFR0_EL1Reiji Watanabe1-5/+4
Change debug_version() to take the ID_AA64DFR0_EL1 value instead of vcpu as an argument, and change its callsite to read ID_AA64DFR0_EL1 (and pass it to debug_version()). Subsequent patches will reuse the register value in the callsite. No functional change intended. Signed-off-by: Reiji Watanabe <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Stop unnecessary test stage tracking of debug-exceptionsReiji Watanabe1-37/+9
Currently, debug-exceptions test unnecessarily tracks some test stages using GUEST_SYNC(). The code for it needs to be updated as test cases are added or removed. Stop doing the unnecessary stage tracking, as they are not so useful and are a bit pain to maintain. Signed-off-by: Reiji Watanabe <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Add helpers to enable debug exceptionsReiji Watanabe1-12/+13
Add helpers to enable breakpoint and watchpoint exceptions. No functional change intended. Signed-off-by: Reiji Watanabe <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Remove the hard-coded {b,w}pn#0 from debug-exceptionsReiji Watanabe1-18/+32
Remove the hard-coded {break,watch}point #0 from the guest_code() in debug-exceptions to allow {break,watch}point number to be specified. Change reset_debug_state() to zeroing all dbg{b,w}{c,v}r_el0 registers so that guest_code() can use the function to reset those registers even when non-zero {break,watch}points are specified for guest_code(). Subsequent patches will add test cases for non-zero {break,watch}points. Signed-off-by: Reiji Watanabe <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Add write_dbg{b,w}{c,v}r helpers in debug-exceptionsReiji Watanabe1-4/+68
Introduce helpers in the debug-exceptions test to write to dbg{b,w}{c,v}r registers. Those helpers will be useful for test cases that will be added to the test in subsequent patches. No functional change intended. Signed-off-by: Reiji Watanabe <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: arm64: selftests: Use FIELD_GET() to extract ID register fieldsReiji Watanabe3-5/+8
Use FIELD_GET() macro to extract ID register fields for existing aarch64 selftests code. No functional change intended. Signed-off-by: Reiji Watanabe <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: memslot_perf_test: Report optimal memory slotsGavin Shan1-4/+41
The memory area in each slot should be aligned to host page size. Otherwise, the test will fail. For example, the following command fails with the following messages with 64KB-page-size-host and 4KB-pae-size-guest. It's not user friendly to abort the test. Lets do something to report the optimal memory slots, instead of failing the test. # ./memslot_perf_test -v -s 1000 Number of memory slots: 999 Testing map performance with 1 runs, 5 seconds each Adding slots 1..999, each slot with 8 pages + 216 extra pages last ==== Test Assertion Failure ==== lib/kvm_util.c:824: vm_adjust_num_guest_pages(vm->mode, npages) == npages pid=19872 tid=19872 errno=0 - Success 1 0x00000000004065b3: vm_userspace_mem_region_add at kvm_util.c:822 2 0x0000000000401d6b: prepare_vm at memslot_perf_test.c:273 3 (inlined by) test_execute at memslot_perf_test.c:756 4 (inlined by) test_loop at memslot_perf_test.c:994 5 (inlined by) main at memslot_perf_test.c:1073 6 0x0000ffff7ebb4383: ?? ??:0 7 0x00000000004021ff: _start at :? Number of guest pages is not compatible with the host. Try npages=16 Report the optimal memory slots instead of failing the test when the memory area in each slot isn't aligned to host page size. With this applied, the optimal memory slots is reported. # ./memslot_perf_test -v -s 1000 Number of memory slots: 999 Testing map performance with 1 runs, 5 seconds each Memslot count too high for this test, decrease the cap (max is 514) Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Maciej S. Szmigiero <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: memslot_perf_test: Consolidate memoryGavin Shan1-17/+26
The addresses and sizes passed to vm_userspace_mem_region_add() and madvise() should be aligned to host page size, which can be 64KB on aarch64. So it's wrong by passing additional fixed 4KB memory area to various tests. Fix it by passing additional fixed 64KB memory area to various tests. We also add checks to ensure that none of host/guest page size exceeds 64KB. MEM_TEST_MOVE_SIZE is fixed up to 192KB either. With this, the following command works fine on 64KB-page-size-host and 4KB-page-size-guest. # ./memslot_perf_test -v -s 512 Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Maciej S. Szmigiero <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: memslot_perf_test: Support variable guest page sizeGavin Shan1-81/+129
The test case is obviously broken on aarch64 because non-4KB guest page size is supported. The guest page size on aarch64 could be 4KB, 16KB or 64KB. This supports variable guest page size, mostly for aarch64. - The host determines the guest page size when virtual machine is created. The value is also passed to guest through the synchronization area. - The number of guest pages are unknown until the virtual machine is to be created. So all the related macros are dropped. Instead, their values are dynamically calculated based on the guest page size. - The static checks on memory sizes and pages becomes dependent on guest page size, which is unknown until the virtual machine is about to be created. So all the static checks are converted to dynamic checks, done in check_memory_sizes(). - As the address passed to madvise() should be aligned to host page, the size of page chunk is automatically selected, other than one page. - MEM_TEST_MOVE_SIZE has fixed and non-working 64KB. It will be consolidated in next patch. However, the comments about how it's calculated has been correct. - All other changes included in this patch are almost mechanical replacing '4096' with 'guest_page_size'. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Maciej S. Szmigiero <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: memslot_perf_test: Probe memory slots for onceGavin Shan1-13/+19
prepare_vm() is called in every iteration and run. The allowed memory slots (KVM_CAP_NR_MEMSLOTS) are probed for multiple times. It's not free and unnecessary. Move the probing logic for the allowed memory slots to parse_args() for once, which is upper layer of prepare_vm(). No functional change intended. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Maciej S. Szmigiero <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: memslot_perf_test: Consolidate loop conditions in prepare_vm()Gavin Shan1-6/+5
There are two loops in prepare_vm(), which have different conditions. 'slot' is treated as meory slot index in the first loop, but index of the host virtual address array in the second loop. It makes it a bit hard to understand the code. Change the usage of 'slot' in the second loop, to treat it as the memory slot index either. No functional change intended. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Maciej S. Szmigiero <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: memslot_perf_test: Use data->nslots in prepare_vm()Gavin Shan1-5/+5
In prepare_vm(), 'data->nslots' is assigned with 'max_mem_slots - 1' at the beginning, meaning they are interchangeable. Use 'data->nslots' isntead of 'max_mem_slots - 1'. With this, it becomes easier to move the logic of probing number of slots into upper layer in subsequent patches. No functional change intended. Signed-off-by: Gavin Shan <[email protected]> Reviewed-by: Maciej S. Szmigiero <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10perf vendor events: Add Arm Neoverse V2 PMU eventsJames Clark10-1/+2
Rename the neoverse-n2 folder to make it clear that it includes V2, and add V2 to mapfile.csv. V2 has the same events as N2, visible by running the following command in the ARM-software/data github repo [1]: diff pmu/neoverse-v2.json pmu/neoverse-n2.json | grep code Testing: $ perf test pmu 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok [1]: https://github.com/ARM-software/data Reviewed-by: Nick Forrington <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-11-10perf print-events: Remove redundant comparison with zeroKang Minchul1-4/+2
Since variable npmus is unsigned int, comparing with 0 is unnecessary. Signed-off-by: Kang Minchul <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-11-10perf data: Add tracepoint fields when converting to JSONDmitrii Dolgov1-0/+20
When converting recorded data into JSON format, perf data omits probe variables. Add them to the output in the format "field name": "field value" using tep_print_field: $ perf data convert --to-json output.json // output.json { "linux-perf-json-version": 1, "headers": { ... }, "samples": [ { "timestamp": 29182079082999, "pid": 309194, [...] "__probe_ip": "0x93ee35", "query_string_string": "select 2;", "nxids": "0" } ] } Signed-off-by: Dmitrii Dolgov <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-11-10perf lock: Allow concurrent record and reportNamhyung Kim2-23/+60
To support live monitoring of kernel lock contention without BPF, it should support something like below: # perf lock record -a -o- sleep 1 | perf lock contention -i- contended total wait max wait avg wait type caller 2 10.27 us 6.17 us 5.13 us spinlock load_balance+0xc03 1 5.29 us 5.29 us 5.29 us rwlock:W ep_scan_ready_list+0x54 1 4.12 us 4.12 us 4.12 us spinlock smpboot_thread_fn+0x116 1 3.28 us 3.28 us 3.28 us mutex pipe_read+0x50 To do that, it needs to handle HEAD_ATTR, HEADER_EVENT_UPDATE and HEADER_TRACING_DATA which are generated only for the pipe mode. And setting event handler also should be delayed until it gets the event information. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-11-10perf trace: Add augmenter for clock_gettime's rqtp timespec argArnaldo Carvalho de Melo5-0/+55
One more before going the BTF way: # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o,*nanosleep ? pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0 ? gpm/1042 ... [continued]: clock_nanosleep()) = 0 1.232 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ... 1.232 pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0 327.329 gpm/1042 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffddfd1cf20) ... 1002.482 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) = 0 327.329 gpm/1042 ... [continued]: clock_nanosleep()) = 0 2003.947 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ... 2003.947 pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0 2327.858 gpm/1042 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffddfd1cf20) ... ? crond/1384 ... [continued]: clock_nanosleep()) = 0 3005.382 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ... 3005.382 pool-gsd-smart/2893 ... [continued]: clock_nanosleep()) = 0 3675.633 crond/1384 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffcc02b66b0) ... ^C# Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-11-10KVM: selftests: Automate choosing dirty ring size in dirty_log_testGavin Shan1-4/+22
In the dirty ring case, we rely on vcpu exit due to full dirty ring state. On ARM64 system, there are 4096 host pages when the host page size is 64KB. In this case, the vcpu never exits due to the full dirty ring state. The similar case is 4KB page size on host and 64KB page size on guest. The vcpu corrupts same set of host pages, but the dirty page information isn't collected in the main thread. This leads to infinite loop as the following log shows. # ./dirty_log_test -M dirty-ring -c 65536 -m 5 Setting log mode to: 'dirty-ring' Test iterations: 32, interval: 10 (ms) Testing guest mode: PA-bits:40, VA-bits:48, 4K pages guest physical test memory offset: 0xffbffe0000 vcpu stops because vcpu is kicked out... Notifying vcpu to continue vcpu continues now. Iteration 1 collected 576 pages <No more output afterwards> Fix the issue by automatically choosing the best dirty ring size, to ensure vcpu exit due to full dirty ring state. The option '-c' becomes a hint to the dirty ring count, instead of the value of it. Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Clear dirty ring states between two modes in dirty_log_testGavin Shan1-10/+17
There are two states, which need to be cleared before next mode is executed. Otherwise, we will hit failure as the following messages indicate. - The variable 'dirty_ring_vcpu_ring_full' shared by main and vcpu thread. It's indicating if the vcpu exit due to full ring buffer. The value can be carried from previous mode (VM_MODE_P40V48_4K) to current one (VM_MODE_P40V48_64K) when VM_MODE_P40V48_16K isn't supported. - The current ring buffer index needs to be reset before next mode (VM_MODE_P40V48_64K) is executed. Otherwise, the stale value is carried from previous mode (VM_MODE_P40V48_4K). # ./dirty_log_test -M dirty-ring Setting log mode to: 'dirty-ring' Test iterations: 32, interval: 10 (ms) Testing guest mode: PA-bits:40, VA-bits:48, 4K pages guest physical test memory offset: 0xffbfffc000 : Dirtied 995328 pages Total bits checked: dirty (1012434), clear (7114123), track_next (966700) Testing guest mode: PA-bits:40, VA-bits:48, 64K pages guest physical test memory offset: 0xffbffc0000 vcpu stops because vcpu is kicked out... vcpu continues now. Notifying vcpu to continue Iteration 1 collected 0 pages vcpu stops because dirty ring is full... vcpu continues now. vcpu stops because dirty ring is full... vcpu continues now. vcpu stops because dirty ring is full... ==== Test Assertion Failure ==== dirty_log_test.c:369: cleared == count pid=10541 tid=10541 errno=22 - Invalid argument 1 0x0000000000403087: dirty_ring_collect_dirty_pages at dirty_log_test.c:369 2 0x0000000000402a0b: log_mode_collect_dirty_pages at dirty_log_test.c:492 3 (inlined by) run_test at dirty_log_test.c:795 4 (inlined by) run_test at dirty_log_test.c:705 5 0x0000000000403a37: for_each_guest_mode at guest_modes.c:100 6 0x0000000000401ccf: main at dirty_log_test.c:938 7 0x0000ffff9ecd279b: ?? ??:0 8 0x0000ffff9ecd286b: ?? ??:0 9 0x0000000000401def: _start at ??:? Reset dirty pages (0) mismatch with collected (35566) Fix the issues by clearing 'dirty_ring_vcpu_ring_full' and the ring buffer index before next new mode is to be executed. Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10KVM: selftests: Use host page size to map ring buffer in dirty_log_testGavin Shan1-1/+1
In vcpu_map_dirty_ring(), the guest's page size is used to figure out the offset in the virtual area. It works fine when we have same page sizes on host and guest. However, it fails when the page sizes on host and guest are different on arm64, like below error messages indicates. # ./dirty_log_test -M dirty-ring -m 7 Setting log mode to: 'dirty-ring' Test iterations: 32, interval: 10 (ms) Testing guest mode: PA-bits:40, VA-bits:48, 64K pages guest physical test memory offset: 0xffbffc0000 vcpu stops because vcpu is kicked out... Notifying vcpu to continue vcpu continues now. ==== Test Assertion Failure ==== lib/kvm_util.c:1477: addr == MAP_FAILED pid=9000 tid=9000 errno=0 - Success 1 0x0000000000405f5b: vcpu_map_dirty_ring at kvm_util.c:1477 2 0x0000000000402ebb: dirty_ring_collect_dirty_pages at dirty_log_test.c:349 3 0x00000000004029b3: log_mode_collect_dirty_pages at dirty_log_test.c:478 4 (inlined by) run_test at dirty_log_test.c:778 5 (inlined by) run_test at dirty_log_test.c:691 6 0x0000000000403a57: for_each_guest_mode at guest_modes.c:105 7 0x0000000000401ccf: main at dirty_log_test.c:921 8 0x0000ffffb06ec79b: ?? ??:0 9 0x0000ffffb06ec86b: ?? ??:0 10 0x0000000000401def: _start at ??:? Dirty ring mapped private Fix the issue by using host's page size to map the ring buffer. Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-10tools/vm/slabinfo: indicates the cause of the EACCES errorRong Tao1-2/+4
If you don't run slabinfo with a superuser, return 0 when read_slab_dir() reads get_obj_and_str("slabs", &t), because fopen() fails (sometimes EACCES), causing slabcache() to return directly, without any error during this time, we should tell the user about the EACCES problem instead of running successfully($?=0) without any error printing. For example: $ ./slabinfo Permission denied, Try using superuser <== What this submission did $ sudo ./slabinfo Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg Acpi-Namespace 5950 48 286.7K 65/0/5 85 0 0 99 Acpi-Operand 13664 72 999.4K 231/0/13 56 0 0 98 ... Signed-off-by: Rong Tao <[email protected]> Signed-off-by: Vlastimil Babka <[email protected]>
2022-11-09selftests: Fix test group SKIPPED resultDomenico Cerasuolo1-16/+22
When showing the result of a test group, if one of the subtests was skipped, while still having passing subtests, the group result was marked as SKIP. E.g.: 223/1 usdt/basic:SKIP 223/2 usdt/multispec:OK 223/3 usdt/urand_auto_attach:OK 223/4 usdt/urand_pid_attach:OK 223 usdt:SKIP The test result of usdt in the example above should be OK instead of SKIP, because the test group did have passing tests and it would be considered in "normal" state. With this change, only if all of the subtests were skipped, the group test is marked as SKIP. When only some of the subtests are skipped, a more detailed result is given, stating how many of the subtests were skipped. E.g: 223/1 usdt/basic:SKIP 223/2 usdt/multispec:OK 223/3 usdt/urand_auto_attach:OK 223/4 usdt/urand_pid_attach:OK 223 usdt:OK (SKIP: 1/4) Signed-off-by: Domenico Cerasuolo <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-11-09selftests/bpf: Tests for btf_dedup_resolve_fwdsEduard Zingerman2-15/+206
Tests to verify the following behavior of `btf_dedup_resolve_fwds`: - remapping for struct forward declarations; - remapping for union forward declarations; - no remapping if forward declaration kind does not match similarly named struct or union declaration; - no remapping if forward declaration name is ambiguous; - base ids are considered for fwd resolution in split btf scenario. Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Reviewed-by: Alan Maguire <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-11-09libbpf: Resolve unambigous forward declarationsEduard Zingerman1-4/+139
Resolve forward declarations that don't take part in type graphs comparisons if declaration name is unambiguous. Example: CU #1: struct foo; // standalone forward declaration struct foo *some_global; CU #2: struct foo { int x; }; struct foo *another_global; The `struct foo` from CU #1 is not a part of any definition that is compared against another definition while `btf_dedup_struct_types` processes structural types. The the BTF after `btf_dedup_struct_types` the BTF looks as follows: [1] STRUCT 'foo' size=4 vlen=1 ... [2] INT 'int' size=4 ... [3] PTR '(anon)' type_id=1 [4] FWD 'foo' fwd_kind=struct [5] PTR '(anon)' type_id=4 This commit adds a new pass `btf_dedup_resolve_fwds`, that maps such forward declarations to structs or unions with identical name in case if the name is not ambiguous. The pass is positioned before `btf_dedup_ref_types` so that types [3] and [5] could be merged as a same type after [1] and [4] are merged. The final result for the example above looks as follows: [1] STRUCT 'foo' size=4 vlen=1 'x' type_id=2 bits_offset=0 [2] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED [3] PTR '(anon)' type_id=1 For defconfig kernel with BTF enabled this removes 63 forward declarations. Examples of removed declarations: `pt_regs`, `in6_addr`. The running time of `btf__dedup` function is increased by about 3%. Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Reviewed-by: Alan Maguire <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-11-09libbpf: Hashmap interface update to allow both long and void* keys/valuesEduard Zingerman27-340/+410
An update for libbpf's hashmap interface from void* -> void* to a polymorphic one, allowing both long and void* keys and values. This simplifies many use cases in libbpf as hashmaps there are mostly integer to integer. Perf copies hashmap implementation from libbpf and has to be updated as well. Changes to libbpf, selftests/bpf and perf are packed as a single commit to avoid compilation issues with any future bisect. Polymorphic interface is acheived by hiding hashmap interface functions behind auxiliary macros that take care of necessary type casts, for example: #define hashmap_cast_ptr(p) \ ({ \ _Static_assert((p) == NULL || sizeof(*(p)) == sizeof(long),\ #p " pointee should be a long-sized integer or a pointer"); \ (long *)(p); \ }) bool hashmap_find(const struct hashmap *map, long key, long *value); #define hashmap__find(map, key, value) \ hashmap_find((map), (long)(key), hashmap_cast_ptr(value)) - hashmap__find macro casts key and value parameters to long and long* respectively - hashmap_cast_ptr ensures that value pointer points to a memory of appropriate size. This hack was suggested by Andrii Nakryiko in [1]. This is a follow up for [2]. [1] https://lore.kernel.org/bpf/CAEf4BzZ8KFneEJxFAaNCCFPGqp20hSpS2aCj76uRk3-qZUH5xg@mail.gmail.com/ [2] https://lore.kernel.org/bpf/[email protected]/T/#m65b28f1d6d969fcd318b556db6a3ad499a42607d Signed-off-by: Eduard Zingerman <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-11-09selftests: mlxsw: Add a test for invalid locked bridge port configurationsIdo Schimmel1-0/+31
Test that locked bridge port configurations that are not supported by mlxsw are rejected. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-09selftests: mlxsw: Add a test for locked port trapIdo Schimmel1-0/+105
Test that packets received via a locked bridge port whose {SMAC, VID} does not appear in the bridge's FDB or appears with a different port, trigger the "locked_port" packet trap. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-09selftests: mlxsw: Add a test for EAPOL trapIdo Schimmel1-0/+22
Test that packets with a destination MAC of 01:80:C2:00:00:03 trigger the "eapol" packet trap. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-11-09selftests: devlink_lib: Split out helperIdo Schimmel1-7/+12
Merely checking whether a trap counter incremented or not without logging a test result is useful on its own. Split this functionality to a helper which will be used by subsequent patches. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>