aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2022-09-26radix tree test suite: add kmem_cache_set_non_kernel()Liam R. Howlett1-2/+14
kmem_cache_set_non_kernel() is a mechanism to allow a certain number of kmem_cache_alloc requests to succeed even when GFP_KERNEL is not set in the flags. This functionality allows for testing different paths though the code. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Tested-by: Yu Zhao <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Howells <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: SeongJae Park <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-09-26radix tree test suite: add pr_err defineLiam R. Howlett1-0/+1
define pr_err to printk Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Tested-by: Yu Zhao <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Howells <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: SeongJae Park <[email protected]> Cc: Sven Schnelle <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-09-26Maple Tree: add new data structureLiam R. Howlett5-0/+74
Patch series "Introducing the Maple Tree" The maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently. There are a number of places in the kernel that a non-overlapping range-based tree would be beneficial, especially one with a simple interface. If you use an rbtree with other data structures to improve performance or an interval tree to track non-overlapping ranges, then this is for you. The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf nodes. With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations. The first user that is covered in this patch set is the vm_area_struct, where three data structures are replaced by the maple tree: the augmented rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The long term goal is to reduce or remove the mmap_lock contention. The plan is to get to the point where we use the maple tree in RCU mode. Readers will not block for writers. A single write operation will be allowed at a time. A reader re-walks if stale data is encountered. VMAs would be RCU enabled and this mode would be entered once multiple tasks are using the mm_struct. Davidlor said : Yes I like the maple tree, and at this stage I don't think we can ask for : more from this series wrt the MM - albeit there seems to still be some : folks reporting breakage. Fundamentally I see Liam's work to (re)move : complexity out of the MM (not to say that the actual maple tree is not : complex) by consolidating the three complimentary data structures very : much worth it considering performance does not take a hit. This was very : much a turn off with the range locking approach, which worst case scenario : incurred in prohibitive overhead. Also as Liam and Matthew have : mentioned, RCU opens up a lot of nice performance opportunities, and in : addition academia[1] has shown outstanding scalability of address spaces : with the foundation of replacing the locked rbtree with RCU aware trees. A similar work has been discovered in the academic press https://pdos.csail.mit.edu/papers/rcuvm:asplos12.pdf Sheer coincidence. We designed our tree with the intention of solving the hardest problem first. Upon settling on a b-tree variant and a rough outline, we researched ranged based b-trees and RCU b-trees and did find that article. So it was nice to find reassurances that we were on the right path, but our design choice of using ranges made that paper unusable for us. This patch (of 70): The maple tree is an RCU-safe range based B-tree designed to use modern processor cache efficiently. There are a number of places in the kernel that a non-overlapping range-based tree would be beneficial, especially one with a simple interface. If you use an rbtree with other data structures to improve performance or an interval tree to track non-overlapping ranges, then this is for you. The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf nodes. With the increased branching factor, it is significantly shorter than the rbtree so it has fewer cache misses. The removal of the linked list between subsequent entries also reduces the cache misses and the need to pull in the previous and next VMA during many tree alterations. The first user that is covered in this patch set is the vm_area_struct, where three data structures are replaced by the maple tree: the augmented rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The long term goal is to reduce or remove the mmap_lock contention. The plan is to get to the point where we use the maple tree in RCU mode. Readers will not block for writers. A single write operation will be allowed at a time. A reader re-walks if stale data is encountered. VMAs would be RCU enabled and this mode would be entered once multiple tasks are using the mm_struct. There is additional BUG_ON() calls added within the tree, most of which are in debug code. These will be replaced with a WARN_ON() call in the future. There is also additional BUG_ON() calls within the code which will also be reduced in number at a later date. These exist to catch things such as out-of-range accesses which would crash anyways. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Liam R. Howlett <[email protected]> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Tested-by: David Howells <[email protected]> Tested-by: Sven Schnelle <[email protected]> Tested-by: Yu Zhao <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: SeongJae Park <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-09-26rv/monitor: Add __init/__exit annotations to module init/exit funcsXiu Jianfeng3-6/+6
Add missing __init/__exit annotations to module init/exit funcs. Link: https://lkml.kernel.org/r/[email protected] Fixes: 24bce201d798 ("tools/rv: Add dot2k") Fixes: 8812d21219b9 ("rv/monitor: Add the wip monitor skeleton created by dot2k") Fixes: ccc319dcb450 ("rv/monitor: Add the wwnr monitor") Signed-off-by: Xiu Jianfeng <[email protected]> Acked-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-09-26Merge tag 'mm-hotfixes-stable-2022-09-26' of ↵Linus Torvalds2-20/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull last (?) hotfixes from Andrew Morton: "26 hotfixes. 8 are for issues which were introduced during this -rc cycle, 18 are for earlier issues, and are cc:stable" * tag 'mm-hotfixes-stable-2022-09-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (26 commits) x86/uaccess: avoid check_object_size() in copy_from_user_nmi() mm/page_isolation: fix isolate_single_pageblock() isolation behavior mm,hwpoison: check mm when killing accessing process mm/hugetlb: correct demote page offset logic mm: prevent page_frag_alloc() from corrupting the memory mm: bring back update_mmu_cache() to finish_fault() frontswap: don't call ->init if no ops are registered mm/huge_memory: use pfn_to_online_page() in split_huge_pages_all() mm: fix madivse_pageout mishandling on non-LRU page powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush mm: gup: fix the fast GUP race against THP collapse mm: fix dereferencing possible ERR_PTR vmscan: check folio_test_private(), not folio_get_private() mm: fix VM_BUG_ON in __delete_from_swap_cache() tools: fix compilation after gfp_types.h split mm/damon/dbgfs: fix memory leak when using debugfs_lookup() mm/migrate_device.c: copy pte dirty bit to page mm/migrate_device.c: add missing flush_cache_page() mm/migrate_device.c: flush TLB while holding PTL x86/mm: disable instrumentations of mm/pgprot.c ...
2022-09-26selftests: net: tsn_lib: run phc2sys in automatic modeVladimir Oltean2-6/+3
We can make the phc2sys helper not only synchronize a PHC to CLOCK_REALTIME, which is what it currently does, but also CLOCK_REALTIME to a PHC, which is going to be needed in distributed TSN tests. Instead of making the complexity of the arguments passed to phc2sys_start() explode, we can let it figure out the sync direction automatically, based on ptp4l's port states. Towards that goal, pass just the path to the desired ptp4l instance's UNIX domain socket, and remove the $if_name argument (from which it derives the PHC). Also adapt the one caller from the ocelot psfp.sh test. In the case of psfp.sh, phc2sys_start is able to properly figure out that CLOCK_REALTIME is the source clock and swp1's PHC is the destination, because of the way in which ptp4l_start for the UDS_ADDRESS_SWP1 was called: with slave_only=false, so it will always win the BMCA and always become the sync master between itself and $h1. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-26selftests: net: tsn_lib: allow multiple isochron receiversVladimir Oltean1-5/+11
Move the PID variable for the isochron receiver into a separate namespace per stats port, to allow multiple receivers (and/or orchestration daemons) to be instantiated by the same script. Preserve the existing behavior by making isochron_do() use the default stats TCP port of 5000. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-26selftests: net: tsn_lib: allow running ptp4l on multiple interfacesVladimir Oltean1-8/+19
Switch ports will want to act as Boundary Clocks, which are configured using ptp4l by specifying the "-i" argument multiple times. Since we track a log file and a pid file for each ptp4l instance, and we want to be compatible with the existing single-port callers of ptp4l_start and ptp4l_stop, pass the interface list as a single string of space-separated values. Based on this, we create a label for each ptp4l instance, where the spaces are replaced with underscores (ptp4l_start "eth0 eth1" generates "ptp4l_pid_eth0_eth1"). Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-26selftests: net: tsn_lib: don't overwrite isochron receiver extra args with UDSVladimir Oltean1-1/+1
The extra_args argument ($3) of isochron_recv_start is overwritten with uds ($2), if that argument exists. This is currently not a problem, because the only TSN selftest (ocelot/psfp.sh) omits remote sync so it does not specify to the receiver a UNIX domain socket for ptp4l. So $uds is currently an empty string. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-26Merge branch 'mm-hotfixes-stable' into mm-stableAndrew Morton2-20/+2
2022-09-26objtool: Disable CFI warningsSami Tolvanen1-1/+6
The __cfi_ preambles contain a mov instruction that embeds the KCFI type identifier in the following format: ; type preamble __cfi_function: mov <id>, %eax function: ... While the preamble symbols are STT_FUNC and contain valid instructions, they are never executed and always fall through. Skip the warning for them. .kcfi_traps sections point to CFI traps in text sections. Also skip the warning about them referencing !ENDBR instructions. Signed-off-by: Sami Tolvanen <[email protected]> Reviewed-by: Kees Cook <[email protected]> Tested-by: Kees Cook <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-09-26objtool: Preserve special st_shndx indexes in elf_update_symbolSami Tolvanen1-1/+6
elf_update_symbol fails to preserve the special st_shndx values between [SHN_LORESERVE, SHN_HIRESERVE], which results in it converting SHN_ABS entries into SHN_UNDEF, for example. Explicitly check for the special indexes and ensure these symbols are not marked undefined. Fixes: ead165fa1042 ("objtool: Fix symbol creation") Signed-off-by: Sami Tolvanen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-09-26rv/dot2K: add 'static' qualifier for local variableZeng Heng3-6/+6
Following Daniel's suggestion, fix similar warning in template files, which would prevent new monitors from such warning. Link: https://lkml.kernel.org/r/[email protected] Cc: <[email protected]> Fixes: 24bce201d798 ("tools/rv: Add dot2k") Suggested-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Zeng Heng <[email protected]> Acked-by: Daniel Bristot de Oliveira <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-09-26selftests/ftrace: Add eprobe syntax error testcaseMasami Hiramatsu (Google)1-0/+27
Add a syntax error test case for eprobe as same as kprobes. Link: https://lkml.kernel.org/r/165932115471.2850673.8014722990775242727.stgit@devnote2 Cc: Tzvetomir Stoyanov <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2022-09-26KVM: selftests: Add an x86-only test to verify nested exception queueingSean Christopherson3-0/+297
Add a test to verify that KVM_{G,S}ET_EVENTS play nice with pending vs. injected exceptions when an exception is being queued for L2, and that KVM correctly handles L1's exception intercept wants. Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2022-09-26KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codesSean Christopherson2-54/+4
Include the vmx.h and svm.h uapi headers that KVM so kindly provides instead of manually defining all the same exit reasons/code. Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Maxim Levitsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2022-09-26KVM: selftests: Switch to updated eVMCSv1 definitionVitaly Kuznetsov1-3/+42
Update Enlightened VMCS definition in selftests from KVM. Reviewed-by: Maxim Levitsky <[email protected]> Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2022-09-26KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fieldsVitaly Kuznetsov1-0/+2
The updated Enlightened VMCS definition has 'encls_exiting_bitmap' field which needs mapping to VMCS, add the missing encoding. Reviewed-by: Maxim Levitsky <[email protected]> Reviewed-by: Kai Huang <[email protected]> Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Bonzini <[email protected]>
2022-09-26KVM: selftests: Require DISABLE_NX_HUGE_PAGES cap for NX hugepage testOliver Upton1-18/+6
Require KVM_CAP_VM_DISABLE_NX_HUGE_PAGES for the entire NX hugepage test instead of skipping the "disable" subtest if the capability isn't supported by the host kernel. While the "enable" subtest does provide value when the capability isn't supported, silently providing only half the promised coveraged is undesirable, i.e. it's better to skip the test so that the user knows something. Alternatively, the test could print something to alert the user instead of silently skipping the subtest, but that would encourage other tests to follow suit, and it's not clear that it's desirable to take selftests in that direction. And if selftests do head down the path of skipping subtests, such behavior needs first-class support in the framework. Opportunistically convert other test preconditions to TEST_REQUIRE(). Signed-off-by: Oliver Upton <[email protected]> Reviewed-by: David Matlack <[email protected]> Link: https://lore.kernel.org/r/[email protected] [sean: rewrote changelog to capture discussion about skipping the test] Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2022-09-26selftests/bpf: Add wait send memory test for sockmap redirectLiu Jian1-0/+42
Add one test for wait redirect sock's send memory test for sockmap. Signed-off-by: Liu Jian <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-09-26perf tests powerpc: Fix branch stack sampling test to include sanity check ↵Athira Rajeev1-1/+2
for branch filter Commit b55878c90ab92a24 ("perf test: Add test for branch stack sampling") added test for branch stack sampling. There is a sanity check in the beginning to skip the test if the hardware doesn't support branch stack sampling. Snippet <<>> skip the test if the hardware doesn't support branch stack sampling perf record -b -o- -B true > /dev/null 2>&1 || exit 2 <<>> But the testcase also uses branch sample types: save_type, any. if any platform doesn't support the branch filters used in the test, the testcase will fail. In powerpc, currently mutliple branch filters are not supported and hence this test fails in powerpc. Fix the sanity check to look at the support for branch filters used in this test before proceeding with the test. Fixes: b55878c90ab92a24 ("perf test: Add test for branch stack sampling") Reported-by: Disha Goel <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Signed-off-by: Athira Jajeev <[email protected]> Cc: German Gomez <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nageswara R Sastry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-26perf parse-events: Remove "not supported" hybrid cache eventsZhengjun Xing4-43/+57
By default, we create two hybrid cache events, one is for cpu_core, and another is for cpu_atom. But Some hybrid hardware cache events are only available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only available on cpu_core, while the 'L1-icache-loads' is only available on cpu_atom. We need to remove "not supported" hybrid cache events. By extending is_event_supported() to global API and using it to check if the hybrid cache events are supported before being created, we can remove the "not supported" hybrid cache events. Before: # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1 Performance counter stats for 'system wide': 52,570 cpu_core/L1-dcache-load-misses/ <not supported> cpu_atom/L1-dcache-load-misses/ <not supported> cpu_core/L1-icache-loads/ 1,471,817 cpu_atom/L1-icache-loads/ 1.004915229 seconds time elapsed After: # ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1 Performance counter stats for 'system wide': 54,510 cpu_core/L1-dcache-load-misses/ 1,441,286 cpu_atom/L1-icache-loads/ 1.005114281 seconds time elapsed Fixes: 30def61f64bac5f5 ("perf parse-events: Create two hybrid cache events") Reported-by: Yi Ammy <[email protected]> Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Xing Zhengjun <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-26perf print-events: Fix "perf list" can not display the PMU prefix for some ↵Zhengjun Xing1-1/+1
hybrid cache events Some hybrid hardware cache events are only available on one CPU PMU. For example, 'L1-dcache-load-misses' is only available on cpu_core. We have supported in the perf list clearly reporting this info, the function works fine before but recently the argument "config" in API is_event_supported() is changed from "u64" to "unsigned int" which caused a regression, the "perf list" then can not display the PMU prefix for some hybrid cache events. For the hybrid systems, the PMU type ID is stored at config[63:32], define config to "unsigned int" will miss the PMU type ID information, then the regression happened, the config should be defined as "u64". Before: # ./perf list |grep "Hardware cache event" L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] node-load-misses [Hardware cache event] node-loads [Hardware cache event] After: # ./perf list |grep "Hardware cache event" L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] cpu_atom/L1-icache-loads/ [Hardware cache event] cpu_core/L1-dcache-load-misses/ [Hardware cache event] cpu_core/node-load-misses/ [Hardware cache event] cpu_core/node-loads/ [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] Fixes: 9b7c7728f4e4ba8d ("perf parse-events: Break out tracepoint and printing") Reported-by: Yi Ammy <[email protected]> Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Xing Zhengjun <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-26perf tools: Get a perf cgroup more portably in BPFNamhyung Kim2-5/+24
The perf_event_cgrp_id can be different on other configurations. To be more portable as CO-RE, it needs to get the cgroup subsys id using the bpf_core_enum_value() helper. Suggested-by: Ian Rogers <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-09-26powerpc/32: Remove powerpc select specialisationRohan McLure1-1/+1
Syscall #82 has been implemented for 32-bit platforms in a unique way on powerpc systems. This hack will in effect guess whether the caller is expecting new select semantics or old select semantics. It does so via a guess, based off the first parameter. In new select, this parameter represents the length of a user-memory array of file descriptors, and in old select this is a pointer to an arguments structure. The heuristic simply interprets sufficiently large values of its first parameter as being a call to old select. The following is a discussion on how this syscall should be handled. As discussed in this thread, the existence of such a hack suggests that for whatever powerpc binaries may predate glibc, it is most likely that they would have taken use of the old select semantics. x86 and arm64 both implement this syscall with oldselect semantics. Remove the powerpc implementation, and update syscall.tbl to refer to emit a reference to sys_old_select and compat_sys_old_select for 32-bit binaries, in keeping with how other architectures support syscall #82. Signed-off-by: Rohan McLure <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/r/[email protected]
2022-09-25Merge tag 'dax-and-nvdimm-fixes-v6.0-final' of ↵Linus Torvalds1-77/+0
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull NVDIMM and DAX fixes from Dan Williams: "A recently discovered one-line fix for devdax that further addresses a v5.5 regression, and (a bit embarrassing) a small batch of fixes that have been sitting in my fixes tree for weeks. The older fixes have soaked in linux-next during that time and address an fsdax infinite loop and some other minor fixups. - Fix a infinite loop bug in fsdax - Fix memory-type detection for devdax (EINJ regression) - Small cleanups" * tag 'dax-and-nvdimm-fixes-v6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: devdax: Fix soft-reservation memory description fsdax: Fix infinite loop in dax_iomap_rw() nvdimm/namespace: drop nested variable in create_namespace_pmem() ndtest: Cleanup all of blk namespace specific code pmem: fix a name collision
2022-09-24Merge branch 'for-6.0/dax' into libnvdimm-fixesDan Williams1030-26649/+221501
Pick up another "Soft Reservation" fix for v6.0-final on top of some straggling nvdimm fixes that missed v5.19.
2022-09-23iocost_monitor: reorder BlkgIteratorElijah Conners1-5/+5
In order to comply with PEP 8, the first parameter of a class should be __init__. Signed-off-by: Elijah Conners <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2022-09-23selftests/bpf: allow to adjust BPF verifier log level in veristatAndrii Nakryiko1-1/+13
Add -l (--log-level) flag to override default BPF verifier log lever. This only matters in verbose mode, which is the mode in which veristat emits verifier log for each processed BPF program. This is important because for successfully verified BPF programs log_level 1 is empty, as BPF verifier truncates all the successfully verified paths. So -l2 is the only way to actually get BPF verifier log in practice. It looks sometihng like this: [vmuser@archvm bpf]$ sudo ./veristat xdp_tx.bpf.o -vl2 Processing 'xdp_tx.bpf.o'... PROCESSING xdp_tx.bpf.o/xdp_tx, DURATION US: 19, VERDICT: success, VERIFIER LOG: func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 ; return XDP_TX; 0: (b4) w0 = 3 ; R0_w=3 1: (95) exit verification time 19 usec stack depth 0 processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 File Program Verdict Duration (us) Total insns Total states Peak states ------------ ------- ------- ------------- ----------- ------------ ----------- xdp_tx.bpf.o xdp_tx success 19 2 0 0 ------------ ------- ------- ------------- ----------- ------------ ----------- Done. Processed 1 files, 0 programs. Skipped 1 files, 0 programs. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-23selftests/bpf: emit processing progress and add quiet mode to veristatAndrii Nakryiko1-1/+14
Emit "Processing <filepath>..." for each BPF object file to be processed, to show progress. But also add -q (--quiet) flag to silence such messages. Doing something more clever (like overwriting same output line) is to cumbersome and easily breakable if there is any other console output (e.g., errors from libbpf). Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-23selftests/bpf: make veristat skip non-BPF and failing-to-open BPF objectsAndrii Nakryiko1-8/+70
Make veristat ignore non-BPF object files. This allows simpler mass-verification (e.g., `sudo ./veristat *.bpf.o` in selftests/bpf directory). Note that `sudo ./veristat *.o` would also work, but with selftests's multiple copies of BPF object files (.bpf.o and .bpf.linked{1,2,3}.o) it's 4x slower. Also, given some of BPF object files could be incomplete in the sense that they are meant to be statically linked into final BPF object file (like linked_maps, linked_funcs, linked_vars), note such instances in stderr, but proceed anyways. This seems like a better trade off between completely silently ignoring BPF object file and aborting mass-verification altogether. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-23selftests/bpf: make veristat's verifier log parsing faster and more robustAndrii Nakryiko1-9/+20
Make sure veristat doesn't spend ridiculous amount of time parsing verifier stats from verifier log, especially for very large logs or truncated logs (e.g., when verifier returns -ENOSPC due to too small buffer). For this, parse lines from the end of the log and make sure we parse only up to 100 last lines, where stats should be, if at all. Suggested-by: Alexei Starovoitov <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-23selftests/bpf: add sign-file to .gitignoreAndrii Nakryiko1-0/+1
Add sign-file to .gitignore to avoid accidentally checking it in. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-09-23libbpf: restore memory layout of bpf_object_open_optsAndrii Nakryiko1-1/+3
When attach_prog_fd field was removed in libbpf 1.0 and replaced with `long: 0` placeholder, it actually shifted all the subsequent fields by 8 byte. This is due to `long: 0` promising to adjust next field's offset to long-aligned offset. But in this case we were already long-aligned as pin_root_path is a pointer. So `long: 0` had no effect, and thus didn't feel the gap created by removed attach_prog_fd. Non-zero bitfield should have been used instead. I validated using pahole. Originally kconfig field was at offset 40. With `long: 0` it's at offset 32, which is wrong. With this change it's back at offset 40. While technically libbpf 1.0 is allowed to break backwards compatibility and applications should have been recompiled against libbpf 1.0 headers, but given how trivial it is to preserve memory layout, let's fix this. Reported-by: Grant Seltzer Richman <[email protected]> Fixes: 146bf811f5ac ("libbpf: remove most other deprecated high-level APIs") Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2022-09-23libbpf: Add pathname_concat() helperWang Yufen1-47/+29
Move snprintf and len check to common helper pathname_concat() to make the code simpler. Signed-off-by: Wang Yufen <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-09-23selftests/bpf: Simplify cgroup_hierarchical_stats selftestYosry Ahmed2-220/+131
The cgroup_hierarchical_stats selftest is complicated. It has to be, because it tests an entire workflow of recording, aggregating, and dumping cgroup stats. However, some of the complexity is unnecessary. The test now enables the memory controller in a cgroup hierarchy, invokes reclaim, measure reclaim time, THEN uses that reclaim time to test the stats collection and aggregation. We don't need to use such a complicated stat, as the context in which the stat is collected is orthogonal. Simplify the test by using a simple stat instead of reclaim time, the total number of times a process has ever entered a cgroup. This makes the test simpler and removes the dependency on the memory controller and the memory reclaim interface. Signed-off-by: Yosry Ahmed <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: KP Singh <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-09-23selftest/net: adjust io_uring sendzc notif handlingPavel Begunkov1-9/+13
It's not currently possible but in the future we may get IORING_CQE_F_MORE and so a notification even for a failed request, i.e. when cqe->res <= 0. That's precisely what the documentation says, so adjust the test and do IORING_CQE_F_MORE checks regardless of the main completion cqe->res. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/aac948ea753a8bfe1fa3b82fe45debcb54586369.1663953085.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2022-09-23Merge branch 'for-6.0-fixes' into for-6.1Tejun Heo59-620/+1685
for-6.0 has the following fix for cgroup_get_from_id(). 836ac87d ("cgroup: fix cgroup_get_from_id") which conflicts with the following two commits in for-6.1. 4534dee9 ("cgroup: cgroup: Honor caller's cgroup NS when resolving cgroup id") fa7e439c ("cgroup: Homogenize cgroup_get_from_id() return value") While the resolution is straightforward, the code ends up pretty ugly afterwards. Let's pull for-6.0-fixes into for-6.1 so that the code can be fixed up there. Signed-off-by: Tejun Heo <[email protected]>
2022-09-23Merge tag 'landlock-6.0-rc7' of ↵Linus Torvalds2-9/+14
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fix from Mickaël Salaün: "Fix out-of-tree builds for Landlock tests" * tag 'landlock-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Fix out-of-tree builds
2022-09-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
Pull kvm fixes from Paolo Bonzini: "As everyone back came back from conferences, here are the pending patches for Linux 6.0. ARM: - Fix for kmemleak with pKVM s390: - Fixes for VFIO with zPCI - smatch fix x86: - Ensure XSAVE-capable hosts always allow FP and SSE state to be saved and restored via KVM_{GET,SET}_XSAVE - Fix broken max_mmu_rmap_size stat - Fix compile error with old glibc that doesn't have gettid()" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Inject #UD on emulated XSETBV if XSAVES isn't enabled KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES KVM: x86: Reinstate kvm_vcpu_arch.guest_supported_xcr0 KVM: x86/mmu: add missing update to max_mmu_rmap_size selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c KVM: s390: pci: register pci hooks without interpretation KVM: s390: pci: fix GAIT physical vs virtual pointers usage KVM: s390: Pass initialized arg even if unused KVM: s390: pci: fix plain integer as NULL pointer warnings KVM: arm64: Use kmemleak_free_part_phys() to unregister hyp_mem_base
2022-09-23selftests/livepatch: add sysfs testSong Liu3-1/+122
Add a test for livepatch sysfs entries. Signed-off-by: Song Liu <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-09-23Merge tag 'kvm-s390-master-6.0-2' of ↵Paolo Bonzini21-159/+406
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD More pci fixes Fix for a code analyser warning
2022-09-23selftests/bonding: re-add lladdr target testMatthieu Baerts1-0/+1
It looks like this test has been accidentally dropped when resolving conflicts in this Makefile. Most probably because there were 3 different patches modifying this file in parallel: commit 152e8ec77640 ("selftests/bonding: add a test for bonding lladdr target") commit bbb774d921e2 ("net: Add tests for bonding and team address list management") commit 2ffd57327ff1 ("selftests: bonding: cause oops in bond_rr_gen_slave_id") The first one was applied in 'net-next' while the two other ones were recently applied in the 'net' tree. But that's alright, easy to fix by re-adding the missing one! Fixes: 0140a7168f8b ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net") Signed-off-by: Matthieu Baerts <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-23selftests/livepatch: normalize sysctl error messageJoe Lawrence1-1/+1
The livepatch kselftests rely on comparing expected and actual output from such commands as sysctl. A recent commit in procps-ng v4.0.0 [1] changed sysctl's output to emit key pathnames like: sysctl: setting key "/proc/sys/kernel/ftrace_enabled": Device or resource busy versus previous dotted output: sysctl: setting key "kernel.ftrace_enabled": Device or resource busy The modification in output was later reverted [2], but since the change has been tagged in procps-ng v4.0.0, update the livepatch kselftest to handle either case. [1] https://gitlab.com/procps-ng/procps/-/commit/6389deca5bf667f5fab5912acde78ba8e0febbc7 [2] https://gitlab.com/procps-ng/procps/-/commit/b159c198c9160a8eb13254e2b631d0035b9b542c Reported-by: Dennis(Zhuoheng) Li <[email protected]> Signed-off-by: Joe Lawrence <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-09-22selftests/tc-testing: add show class case for red qdiscZhengchao Shao1-0/+23
Test 290a: Show RED class Signed-off-by: Zhengchao Shao <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Tested-by: Victor Nogueira <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-22selftests/tc-testing: add show class case for prio qdiscZhengchao Shao1-0/+20
Test 2410: Show prio class Signed-off-by: Zhengchao Shao <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Tested-by: Victor Nogueira <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-22selftests/tc-testing: add show class case for mq qdiscZhengchao Shao1-1/+23
Test 1023: Show mq class Signed-off-by: Zhengchao Shao <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Tested-by: Victor Nogueira <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-22selftests/tc-testing: add show class case for ingress qdiscZhengchao Shao1-0/+20
Test 0521: Show ingress class Signed-off-by: Zhengchao Shao <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Tested-by: Victor Nogueira <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-22selftests/tc-testing: add selftests for qfq qdiscZhengchao Shao1-0/+145
Test 0582: Create QFQ with default setting Test c9a3: Create QFQ with class weight setting Test 8452: Create QFQ with class maxpkt setting Test d920: Create QFQ with multiple class setting Test 0548: Delete QFQ with handle Test 5901: Show QFQ class Signed-off-by: Zhengchao Shao <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Tested-by: Victor Nogueira <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2022-09-22selftests/tc-testing: add selftests for netem qdiscZhengchao Shao1-0/+372
Test cb28: Create NETEM with default setting Test a089: Create NETEM with limit flag Test 3449: Create NETEM with delay time Test 3782: Create NETEM with distribution and corrupt flag Test 2b82: Create NETEM with distribution and duplicate flag Test a932: Create NETEM with distribution and loss flag Test e01a: Create NETEM with distribution and loss state flag Test ba29: Create NETEM with loss gemodel flag Test 0492: Create NETEM with reorder flag Test 7862: Create NETEM with rate limit Test 7235: Create NETEM with multiple slot rate Test 5439: Create NETEM with multiple slot setting Test 5029: Change NETEM with loss state Test 3785: Replace NETEM with delay time Test 4502: Delete NETEM with handle Test 0785: Show NETEM class Signed-off-by: Zhengchao Shao <[email protected]> Reviewed-by: Victor Nogueira <[email protected]> Tested-by: Victor Nogueira <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>