aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2024-02-21clocksource: Scale the watchdog read retries automaticallyFeng Tang1-1/+1
On a 8-socket server the TSC is wrongly marked as 'unstable' and disabled during boot time on about one out of 120 boot attempts: clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns, wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable tsc: Marking TSC unstable due to clocksource watchdog TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'. sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152) clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896. clocksource: Switched to clocksource hpet The reason is that for platform with a large number of CPUs, there are sporadic big or huge read latencies while reading the watchog/clocksource during boot or when system is under stress work load, and the frequency and maximum value of the latency goes up with the number of online CPUs. The cCurrent code already has logic to detect and filter such high latency case by reading the watchdog twice and checking the two deltas. Due to the randomness of the latency, there is a low probabilty that the first delta (latency) is big, but the second delta is small and looks valid. The watchdog code retries the readouts by default twice, which is not necessarily sufficient for systems with a large number of CPUs. There is a command line parameter 'max_cswd_read_retries' which allows to increase the number of retries, but that's not user friendly as it needs to be tweaked per system. As the number of required retries is proportional to the number of online CPUs, this parameter can be calculated at runtime. Scale and enlarge the number of retries according to the number of online CPUs and remove the command line parameter completely. [ tglx: Massaged change log and comments ] Signed-off-by: Feng Tang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Jin Wang <[email protected]> Tested-by: Paul E. McKenney <[email protected]> Reviewed-by: Waiman Long <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-02-20Merge branch 'for-6.8/cxl-cper' into for-6.8/cxlDan Williams159-941/+1667
Pick up CXL CPER notification removal for v6.8-rc6, to return in a later merge window.
2024-02-20selftests: sched: Fix spelling mistake "hiearchy" -> "hierarchy"Colin Ian King1-1/+1
There is a spelling mistake in a printed message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests/mqueue: Set timeout to 180 secondsSeongJae Park1-0/+1
While mq_perf_tests runs with the default kselftest timeout limit, which is 45 seconds, the test takes about 60 seconds to complete on i3.metal AWS instances. Hence, the test always times out. Increase the timeout to 180 seconds. Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test") Cc: <[email protected]> # 5.4.x Signed-off-by: SeongJae Park <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests/ftrace: Add test to exercize function tracer across cpu hotplugNaveen N Rao1-0/+42
Add a test to exercize cpu hotplug with the function tracer active to ensure that sensitive functions in idle path are excluded from being traced. This helps catch issues such as the one fixed by commit 4b3338aaa74d ("powerpc/ftrace: Fix stack teardown in ftrace_no_trace"). Signed-off-by: Naveen N Rao <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Acked-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftest: ftrace: fix minor typo in logVincenzo Mezzela1-1/+1
Resolves a spelling error in the test log, preventing potential confusion. Signed-off-by: Vincenzo Mezzela <[email protected]> Acked-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: thermal: intel: workload_hint: add missing gitignoreJavier Carrasco1-0/+1
The 'workload_hint_test' test generates an object with the same name, but there is no .gitignore file in the directory to add the object as stated in the selftest documentation. Add the missing .gitignore file and include 'workload_hint_test'. Signed-off-by: Javier Carrasco <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: thermal: intel: power_floor: add missing gitignoreJavier Carrasco1-0/+1
The 'power_floor' test generates an object with the same name, but there is no .gitignore file in the directory to add the object as stated in the selftest documentation. Add the missing .gitignore file and include 'power_floor'. Signed-off-by: Javier Carrasco <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: uevent: add missing gitignoreJavier Carrasco1-0/+1
The 'uevent_filtering' test generates an object with the same name, but there is no .gitignore file in the directory to add the object as stated in the selftest documentation. Add the missing .gitignore file and include 'uevent_filtering'. Signed-off-by: Javier Carrasco <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: Add test to verify power supply propertiesNícolas F. R. A. Prado4-0/+297
Add a kselftest that verifies power supply properties from sysfs and uevent. It checks whether they are present, readable and return valid values. This initial set of properties is not comprehensive, but rather the ones that I was able to validate locally. Co-developed-by: Sebastian Reichel <[email protected]> Signed-off-by: Sebastian Reichel <[email protected]> Signed-off-by: Nícolas F. R. A. Prado <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: ktap_helpers: Add a helper to finish the testNícolas F. R. A. Prado1-2/+14
Similar to the C counterpart, keep track of the number of test cases in the test plan and add a helper function to be called at the end of the test to print the results and exit with the corresponding exit code. Signed-off-by: Nícolas F. R. A. Prado <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: ktap_helpers: Add a helper to abort the testNícolas F. R. A. Prado1-0/+7
Similar to the C counterpart, add a helper function to abort the remainder of the test. Signed-off-by: Nícolas F. R. A. Prado <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: ktap_helpers: Add helper to pass/fail test based on exit codeNícolas F. R. A. Prado1-0/+11
Similar to the C counterpart, add a helper function that runs a command and passes or fails the test based on the result. Signed-off-by: Nícolas F. R. A. Prado <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: ktap_helpers: Add helper to print diagnostic messagesNícolas F. R. A. Prado1-0/+5
Similar to the C counterpart, add a helper to print a diagnostic message. Signed-off-by: Nícolas F. R. A. Prado <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests: Move KTAP bash helpers to selftests common folderLaura Nao4-6/+9
Move bash helpers for outputting in KTAP format to the common selftests folder. This allows kselftests other than the dt one to source the file and make use of the helper functions. Define pass, fail and skip codes in the same file too. Signed-off-by: Laura Nao <[email protected]> Reviewed-by: Nícolas F. R. A. Prado <[email protected]> Tested-by: Nícolas F. R. A. Prado <[email protected]> Acked-by: Rob Herring <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftests/mm: uffd-unit-test check if huge page size is 0Terry Tritton1-0/+6
If HUGETLBFS is not enabled then the default_huge_page_size function will return 0 and cause a divide by 0 error. Add a check to see if the huge page size is 0 and skip the hugetlb tests if it is. Link: https://lkml.kernel.org/r/[email protected] Fixes: 16a45b57cbf2 ("selftests/mm: add framework for uffd-unit-test") Signed-off-by: Terry Tritton <[email protected]> Cc: Peter Griffin <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2024-02-20selftests: ftrace: fix typo in test descriptionAli Zahraee1-1/+1
The typo in the description shows up in test logs and output. This patch submission is part of my application to the Linux Foundation mentorship program: Linux kernel Bug Fixing Spring Unpaid 2024. Signed-off-by: Ali Zahraee <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-20selftest/ftrace: fix typo in ftracetest scriptKousik Sanagavarapu1-1/+1
Fix a typo in ftracetest script which is run when running the kselftests for ftrace. s/faii/fail Signed-off-by: Kousik Sanagavarapu <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-19selftests: fuxex: Report a unique test name per run of futex_requeue_piMark Brown1-1/+12
The futex_requeue_pi test program is run a number of times with different options to provide multiple test cases. Currently every time it runs it reports the result with a consistent string, meaning that automated systems parsing the TAP output from a test run have difficulty in distinguishing which test is which. The parameters used for the test are already logged as part of the test output, let's use the same format to roll them into the test name that we use with KTAP so that automated systems can follow the results of the individual cases that get run. Signed-off-by: Mark Brown <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-19selftests/bpf: Add negtive test cases for task iterYafang Shao2-1/+12
Incorporate a test case to assess the handling of invalid flags or task__nullable parameters passed to bpf_iter_task_new(). Prior to the preceding commit, this scenario could potentially trigger a kernel panic. However, with the previous commit, this test case is expected to function correctly. Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-02-19selftests/bpf: Test racing between bpf_timer_cancel_and_free and ↵Martin KaFai Lau2-2/+67
bpf_timer_cancel This selftest is based on a Alexei's test adopted from an internal user to troubleshoot another bug. During this exercise, a separate racing bug was discovered between bpf_timer_cancel_and_free and bpf_timer_cancel. The details can be found in the previous patch. This patch is to add a selftest that can trigger the bug. I can trigger the UAF everytime in my qemu setup with KASAN. The idea is to have multiple user space threads running in a tight loop to exercise both bpf_map_update_elem (which calls into bpf_timer_cancel_and_free) and bpf_timer_cancel. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-02-19selftests: bonding: set active slave to primary eth1 specificallyHangbin Liu1-0/+2
In bond priority testing, we set the primary interface to eth1 and add eth0,1,2 to bond in serial. This is OK in normal times. But when in debug kernel, the bridge port that eth0,1,2 connected would start slowly (enter blocking, forwarding state), which caused the primary interface down for a while after enslaving and active slave changed. Here is a test log from Jakub's debug test[1]. [ 400.399070][ T50] br0: port 1(s0) entered disabled state [ 400.400168][ T50] br0: port 4(s2) entered disabled state [ 400.941504][ T2791] bond0: (slave eth0): making interface the new active one [ 400.942603][ T2791] bond0: (slave eth0): Enslaving as an active interface with an up link [ 400.943633][ T2766] br0: port 1(s0) entered blocking state [ 400.944119][ T2766] br0: port 1(s0) entered forwarding state [ 401.128792][ T2792] bond0: (slave eth1): making interface the new active one [ 401.130771][ T2792] bond0: (slave eth1): Enslaving as an active interface with an up link [ 401.131643][ T69] br0: port 2(s1) entered blocking state [ 401.132067][ T69] br0: port 2(s1) entered forwarding state [ 401.346201][ T2793] bond0: (slave eth2): Enslaving as a backup interface with an up link [ 401.348414][ T50] br0: port 4(s2) entered blocking state [ 401.348857][ T50] br0: port 4(s2) entered forwarding state [ 401.519669][ T250] bond0: (slave eth0): link status definitely down, disabling slave [ 401.526522][ T250] bond0: (slave eth1): link status definitely down, disabling slave [ 401.526986][ T250] bond0: (slave eth2): making interface the new active one [ 401.629470][ T250] bond0: (slave eth0): link status definitely up [ 401.630089][ T250] bond0: (slave eth1): link status definitely up [...] # TEST: prio (active-backup ns_ip6_target primary_reselect 1) [FAIL] # Current active slave is eth2 but not eth1 Fix it by setting active slave to primary slave specifically before testing. [1] https://netdev-3.bots.linux.dev/vmksft-bonding-dbg/results/464301/1-bond-options-sh/stdout Fixes: 481b56e0391e ("selftests: bonding: re-format bond option tests") Signed-off-by: Hangbin Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: diag: unique 'cestab' subtest namesMatthieu Baerts (NGI0)1-6/+11
It is important to have a unique (sub)test name in TAP, because some CI environments drop tests with duplicated name. Some 'cestab' subtests from the diag selftest had the same names, e.g.: ....chk 0 cestab Now the previous value is taken, to have different names, e.g.: ....chk 2->0 cestab after flush While at it, the 'after flush' info is added, similar to what is done with the 'in use' subtests. Also inspired by these 'in use' subtests, 'many' is displayed instead of a large number: many msk socket present [ ok ] ....chk many msk in use [ ok ] ....chk many cestab [ ok ] ....chk many->0 msk in use after flush [ ok ] ....chk many->0 cestab after flush [ ok ] Fixes: 81ab772819da ("selftests: mptcp: diag: check CURRESTAB counters") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: diag: unique 'in use' subtest namesMatthieu Baerts (NGI0)1-8/+12
It is important to have a unique (sub)test name in TAP, because some CI environments drop tests with duplicated name. Some 'in use' subtests from the diag selftest had the same names, e.g.: chk 0 msk in use after flush Now the previous value is taken, to have different names, e.g.: chk 2->0 msk in use after flush While at it, avoid repeating the full message, declare it once in the helper. Fixes: ce9902573652 ("selftests: mptcp: diag: format subtests results in TAP") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: userspace_pm: unique subtest namesMatthieu Baerts (NGI0)1-2/+2
It is important to have a unique (sub)test name in TAP, because some CI environments drop tests with duplicated names. Some subtests from the userspace_pm selftest had the same names. That's because different subflows are created (and deleted) between the same pair of IP addresses. Simply adding the destination port in the name is then enough to have different names, because the destination port is always different. Note that adding such info takes a bit more space, so we need to increase a bit the width to print the name, simply to keep all the '[ OK ]' aligned as before. Fixes: f589234e1af0 ("selftests: mptcp: userspace_pm: format subtests results in TAP") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: simult flows: fix some subtest namesMatthieu Baerts (NGI0)1-1/+2
The selftest was correctly recording all the results, but the 'reverse direction' part was missing in the name when needed. It is important to have a unique (sub)test name in TAP, because some CI environments drop tests with duplicated name. Fixes: 675d99338e7a ("selftests: mptcp: simult flows: format subtests results in TAP") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: diag: fix bash warnings on older kernelsMatthieu Baerts (NGI0)1-2/+2
Since the 'Fixes' commit mentioned below, the command that is executed in __chk_nr() helper can return nothing if the feature is not supported. This is the case when the MPTCP CURRESTAB counter is not supported. To avoid this warning ... ./diag.sh: line 65: [: !=: unary operator expected ... we just need to surround '$nr' with double quotes, to support an empty string when the feature is not supported. Fixes: 81ab772819da ("selftests: mptcp: diag: check CURRESTAB counters") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: pm nl: avoid error msg on older kernelsMatthieu Baerts (NGI0)1-1/+1
Since the 'Fixes' commit mentioned below, and if the kernel being tested doesn't support the 'fullmesh' flag, this error will be printed: netlink error -22 (Invalid argument) ./pm_nl_ctl: bailing out due to netlink error[s] But that can be normal if the kernel doesn't support the feature, no need to print this worrying error message while everything else looks OK. So we can mute stderr. Failures will still be detected if any. Fixes: 1dc88d241f92 ("selftests: mptcp: pm_nl_ctl: always look for errors") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-18selftests: mptcp: pm nl: also list skipped testsMatthieu Baerts (NGI0)1-0/+6
If the feature is not supported by older kernels, and instead of just ignoring some tests, we should mark them as skipped, so we can still track them. Fixes: d85555ac11f9 ("selftests: mptcp: pm_netlink: format subtests results in TAP") Cc: [email protected] Reviewed-by: Geliang Tang <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-17Merge tag 'powerpc-6.8-3' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "This is a bit of a big batch for rc4, but just due to holiday hangover and because I didn't send any fixes last week due to a late revert request. I think next week should be back to normal. - Fix ftrace bug on boot caused by exit text sections with '-fpatchable-function-entry' - Fix accuracy of stolen time on pseries since the switch to VIRT_CPU_ACCOUNTING_GEN - Fix a crash in the IOMMU code when doing DLPAR remove - Set pt_regs->link on scv entry to fix BPF stack unwinding - Add missing PPC_FEATURE_BOOKE on 64-bit e5500/e6500, which broke gdb - Fix boot on some 6xx platforms with STRICT_KERNEL_RWX enabled - Fix build failures with KASAN enabled and 32KB stack size - Some other minor fixes Thanks to Arnd Bergmann, Benjamin Gray, Christophe Leroy, David Engraf, Gaurav Batra, Jason Gunthorpe, Jiangfeng Xiao, Matthias Schiffer, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A, R Nageswara Sastry, Shivaprasad G Bhat, Shrikanth Hegde, Spoorthy, Srikar Dronamraju, and Venkat Rao Bagalkote" * tag 'powerpc-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach powerpc/pseries: fix accuracy of stolen time powerpc/ftrace: Ignore ftrace locations in exit text sections powerpc/cputable: Add missing PPC_FEATURE_BOOKE on PPC64 Book-E powerpc/kasan: Limit KASAN thread size increase to 32KB Revert "powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add" powerpc: 85xx: mark local functions static powerpc: udbg_memcons: mark functions static powerpc/kasan: Fix addr error caused by page alignment powerpc/6xx: set High BAT Enable flag on G2_LE cores selftests/powerpc/papr_vpd: Check devfd before get_system_loc_code() powerpc/64: Set task pt_regs->link to the LR value on scv entry powerpc/pseries/iommu: Fix iommu initialisation during DLPAR add powerpc/pseries/papr-sysparm: use u8 arrays for payloads
2024-02-16cxl/test: Add support for qos_class checkingDave Jiang4-9/+70
Set a fake qos_class to a unique value in order to do simple testing of qos_class for root decoders and mem devs via user cxl_test. A mock function is added to set the fake qos_class values for memory device and overrides cxl_endpoint_parse_cdat() in cxl driver code. Signed-off-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jonathan Cameron <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2024-02-16Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds58-239/+236
Pull KVM fixes from Paolo Bonzini: "ARM: - Avoid dropping the page refcount twice when freeing an unlinked page-table subtree. - Don't source the VFIO Kconfig twice - Fix protected-mode locking order between kvm and vcpus RISC-V: - Fix steal-time related sparse warnings x86: - Cleanup gtod_is_based_on_tsc() to return "bool" instead of an "int" - Make a KVM_REQ_NMI request while handling KVM_SET_VCPU_EVENTS if and only if the incoming events->nmi.pending is non-zero. If the target vCPU is in the UNITIALIZED state, the spurious request will result in KVM exiting to userspace, which in turn causes QEMU to constantly acquire and release QEMU's global mutex, to the point where the BSP is unable to make forward progress. - Fix a type (u8 versus u64) goof that results in pmu->fixed_ctr_ctrl being incorrectly truncated, and ultimately causes KVM to think a fixed counter has already been disabled (KVM thinks the old value is '0'). - Fix a stack leak in KVM_GET_MSRS where a failed MSR read from userspace that is ultimately ignored due to ignore_msrs=true doesn't zero the output as intended. Selftests cleanups and fixes: - Remove redundant newlines from error messages. - Delete an unused variable in the AMX test (which causes build failures when compiling with -Werror). - Fail instead of skipping tests if open(), e.g. of /dev/kvm, fails with an error code other than ENOENT (a Hyper-V selftest bug resulted in an EMFILE, and the test eventually got skipped). - Fix TSC related bugs in several Hyper-V selftests. - Fix a bug in the dirty ring logging test where a sem_post() could be left pending across multiple runs, resulting in incorrect synchronization between the main thread and the vCPU worker thread. - Relax the dirty log split test's assertions on 4KiB mappings to fix false positives due to the number of mappings for memslot 0 (used for code and data that is NOT being dirty logged) changing, e.g. due to NUMA balancing" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits) KVM: arm64: Fix double-free following kvm_pgtable_stage2_free_unlinked() RISC-V: KVM: Use correct restricted types RISC-V: paravirt: Use correct restricted types RISC-V: paravirt: steal_time should be static KVM: selftests: Don't assert on exact number of 4KiB in dirty log split test KVM: selftests: Fix a semaphore imbalance in the dirty ring logging test KVM: x86: Fix KVM_GET_MSRS stack info leak KVM: arm64: Do not source virt/lib/Kconfig twice KVM: x86/pmu: Fix type length error when reading pmu->fixed_ctr_ctrl KVM: x86: Make gtod_is_based_on_tsc() return 'bool' KVM: selftests: Make hyperv_clock require TSC based system clocksource KVM: selftests: Run clocksource dependent tests with hyperv_clocksource_tsc_page too KVM: selftests: Use generic sys_clocksource_is_tsc() in vmx_nested_tsc_scaling_test KVM: selftests: Generalize check_clocksource() from kvm_clock_test KVM: x86: make KVM_REQ_NMI request iff NMI pending for vcpu KVM: arm64: Fix circular locking dependency KVM: selftests: Fail tests when open() fails with !ENOENT KVM: selftests: Avoid infinite loop in hyperv_features when invtsc is missing KVM: selftests: Delete superfluous, unused "stage" variable in AMX test KVM: selftests: x86_64: Remove redundant newlines ...
2024-02-16net/sched: act_mirred: use the backlog for mirred ingressJakub Kicinski1-3/+0
The test Davide added in commit ca22da2fbd69 ("act_mirred: use the backlog for nested calls to mirred ingress") hangs our testing VMs every 10 or so runs, with the familiar tcp_v4_rcv -> tcp_v4_rcv deadlock reported by lockdep. The problem as previously described by Davide (see Link) is that if we reverse flow of traffic with the redirect (egress -> ingress) we may reach the same socket which generated the packet. And we may still be holding its socket lock. The common solution to such deadlocks is to put the packet in the Rx backlog, rather than run the Rx path inline. Do that for all egress -> ingress reversals, not just once we started to nest mirred calls. In the past there was a concern that the backlog indirection will lead to loss of error reporting / less accurate stats. But the current workaround does not seem to address the issue. Fixes: 53592b364001 ("net/sched: act_mirred: Implement ingress actions") Cc: Marcelo Ricardo Leitner <[email protected]> Suggested-by: Davide Caratti <[email protected]> Link: https://lore.kernel.org/netdev/33dc43f587ec1388ba456b4915c75f02a8aae226.1663945716.git.dcaratti@redhat.com/ Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-15selftest/bpf: Test the read of vsyscall page under x86-64Hou Tao2-0/+102
Under x86-64, when using bpf_probe_read_kernel{_str}() or bpf_probe_read{_str}() to read vsyscall page, the read may trigger oops, so add one test case to ensure that the problem is fixed. Beside those four bpf helpers mentioned above, testing the read of vsyscall page by using bpf_probe_read_user{_str} and bpf_copy_from_user{_task}() as well. The test case passes the address of vsyscall page to these six helpers and checks whether the returned values are expected: 1) For bpf_probe_read_kernel{_str}()/bpf_probe_read{_str}(), the expected return value is -ERANGE as shown below: bpf_probe_read_kernel_common copy_from_kernel_nofault // false, return -ERANGE copy_from_kernel_nofault_allowed 2) For bpf_probe_read_user{_str}(), the expected return value is -EFAULT as show below: bpf_probe_read_user_common copy_from_user_nofault // false, return -EFAULT __access_ok 3) For bpf_copy_from_user(), the expected return value is -EFAULT: // return -EFAULT bpf_copy_from_user copy_from_user _copy_from_user // return false access_ok 4) For bpf_copy_from_user_task(), the expected return value is -EFAULT: // return -EFAULT bpf_copy_from_user_task access_process_vm // return 0 vma_lookup() // return 0 expand_stack() The occurrence of oops depends on the availability of CPU SMAP [1] feature and there are three possible configurations of vsyscall page in the boot cmd-line: vsyscall={xonly|none|emulate}, so there are a total of six possible combinations. Under all these combinations, the test case runs successfully. [1]: https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention Acked-by: Yonghong Song <[email protected]> Signed-off-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-02-15Merge tag 'net-6.8-rc5' of ↵Linus Torvalds13-41/+163
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from can, wireless and netfilter. Current release - regressions: - af_unix: fix task hung while purging oob_skb in GC - pds_core: do not try to run health-thread in VF path Current release - new code bugs: - sched: act_mirred: don't zero blockid when net device is being deleted Previous releases - regressions: - netfilter: - nat: restore default DNAT behavior - nf_tables: fix bidirectional offload, broken when unidirectional offload support was added - openvswitch: limit the number of recursions from action sets - eth: i40e: do not allow untrusted VF to remove administratively set MAC address Previous releases - always broken: - tls: fix races and bugs in use of async crypto - mptcp: prevent data races on some of the main socket fields, fix races in fastopen handling - dpll: fix possible deadlock during netlink dump operation - dsa: lan966x: fix crash when adding interface under a lag when some of the ports are disabled - can: j1939: prevent deadlock by changing j1939_socks_lock to rwlock Misc: - a handful of fixes and reliability improvements for selftests - fix sysfs documentation missing net/ in paths - finish the work of squashing the missing MODULE_DESCRIPTION() warnings in networking" * tag 'net-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (92 commits) net: fill in MODULE_DESCRIPTION()s for missing arcnet net: fill in MODULE_DESCRIPTION()s for mdio_devres net: fill in MODULE_DESCRIPTION()s for ppp net: fill in MODULE_DESCRIPTION()s for fddik/skfp net: fill in MODULE_DESCRIPTION()s for plip net: fill in MODULE_DESCRIPTION()s for ieee802154/fakelb net: fill in MODULE_DESCRIPTION()s for xen-netback net: ravb: Count packets instead of descriptors in GbEth RX path pppoe: Fix memory leak in pppoe_sendmsg() net: sctp: fix skb leak in sctp_inq_free() net: bcmasp: Handle RX buffer allocation failure net-timestamp: make sk_tskey more predictable in error path selftests: tls: increase the wait in poll_partial_rec_async ice: Add check for lport extraction to LAG init netfilter: nf_tables: fix bidirectional offload regression netfilter: nat: restore default DNAT behavior netfilter: nft_set_pipapo: fix missing : in kdoc igc: Remove temporary workaround igb: Fix string truncation warnings in igb_set_fw_version can: netlink: Fix TDCO calculation using the old data bittiming ...
2024-02-15Merge tag 'devicetree-fixes-for-6.8-1' of ↵Linus Torvalds1-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Improve devlink dependency parsing for DT graphs - Fix devlink handling of io-channels dependencies - Fix PCI addressing in marvell,prestera example - A few schema fixes for property constraints - Improve performance of DT unprobed devices kselftest - Fix regression in DT_SCHEMA_FILES handling - Fix compile error in unittest for !OF_DYNAMIC * tag 'devicetree-fixes-for-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: ufs: samsung,exynos-ufs: Add size constraints on "samsung,sysreg" of: property: Add in-ports/out-ports support to of_graph_get_port_parent() of: property: Improve finding the supplier of a remote-endpoint property of: property: Improve finding the consumer of a remote-endpoint property net: marvell,prestera: Fix example PCI bus addressing of: unittest: Fix compile in the non-dynamic case of: property: fix typo in io-channels dt-bindings: tpm: Drop type from "resets" dt-bindings: display: nxp,tda998x: Fix 'audio-ports' constraints dt-bindings: xilinx: replace Piyush Mehta maintainership kselftest: dt: Stop relying on dirname to improve performance dt-bindings: don't anchor DT_SCHEMA_FILES to bindings directory
2024-02-14selftests: tls: increase the wait in poll_partial_rec_asyncJakub Kicinski1-2/+2
Test runners on debug kernels occasionally fail with: # # RUN tls_err.13_aes_gcm.poll_partial_rec_async ... # # tls.c:1883:poll_partial_rec_async:Expected poll(&pfd, 1, 5) (0) == 1 (1) # # tls.c:1870:poll_partial_rec_async:Expected status (256) == 0 (0) # # poll_partial_rec_async: Test failed at step #17 # # FAIL tls_err.13_aes_gcm.poll_partial_rec_async # not ok 699 tls_err.13_aes_gcm.poll_partial_rec_async # # FAILED: 698 / 699 tests passed. This points to the second poll() in the test which is expected to wait for the sender to send the rest of the data. Apparently under some conditions that doesn't happen within 5ms, bump the timeout to 20ms. Fixes: 23fcb62bc19c ("selftests: tls: add tests for poll behavior") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-02-14Merge tag 'landlock-6.8-rc5' of ↵Linus Torvalds3-13/+59
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock test fixes from Mickaël Salaün: "Fix build issues for tests, and improve test compatibility" * tag 'landlock-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Fix capability for net_test selftests/landlock: Fix fs_test build with old libc selftests/landlock: Fix net_test build with old libc
2024-02-14Merge tag 'kvm-x86-selftests-6.8-rcN' of https://github.com/kvm-x86/linux ↵Paolo Bonzini58-239/+236
into HEAD KVM selftests fixes/cleanups (and one KVM x86 cleanup) for 6.8: - Remove redundant newlines from error messages. - Delete an unused variable in the AMX test (which causes build failures when compiling with -Werror). - Fail instead of skipping tests if open(), e.g. of /dev/kvm, fails with an error code other than ENOENT (a Hyper-V selftest bug resulted in an EMFILE, and the test eventually got skipped). - Fix TSC related bugs in several Hyper-V selftests. - Fix a bug in the dirty ring logging test where a sem_post() could be left pending across multiple runs, resulting in incorrect synchronization between the main thread and the vCPU worker thread. - Relax the dirty log split test's assertions on 4KiB mappings to fix false positives due to the number of mappings for memslot 0 (used for code and data that is NOT being dirty logged) changing, e.g. due to NUMA balancing. - Have KVM's gtod_is_based_on_tsc() return "bool" instead of an "int" (the function generates boolean values, and all callers treat the return value as a bool).
2024-02-13selftests/resctrl: Get domain id from cache idIlpo Järvinen3-12/+21
Domain id is acquired differently depending on CPU. AMD tests use id from L3 cache, whereas CPUs from other vendors base the id on topology package id. In order to support L2 CAT test, this has to be generalized. The driver side code seems to get the domain ids from cache ids so the approach used by the AMD branch seems to match the kernel-side code. It will also work with L2 domain IDs as long as the cache level is generalized. Using the topology id was always fragile due to mismatch with the kernel-side way to acquire the domain id. It got incorrect domain id, e.g., when Cluster-on-Die (CoD) is enabled for CPU (but CoD is not well suited for resctrl in the first place so it has not been a big issue if tests don't work correctly with it). Taking all the above into account, generalize acquiring the domain id by taking it from the cache id and do not hard-code the cache level. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Rename resource ID to domain IDIlpo Järvinen3-25/+25
Kernel-side calls the instances of a resource domains. Change the resource_id naming in the selftest code to domain_id to match the kernel side better. Suggested-by: Maciej Wieczór-Retman <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Add helper to convert L2/3 to integerIlpo Järvinen1-8/+20
"L2"/"L3" conversion to integer is embedded into get_cache_size() which prevents reuse. Create a helper for the cache string to integer conversion to make it reusable. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Pass write_schemata() resource instead of test nameIlpo Järvinen7-38/+38
write_schemata() takes the test name as an argument and determines the relevant resource based on the test name. Such mapping from name to resource does not really belong to resctrlfs.c that should provide only generic, test-independent functions. Pass the resource stored in the test information structure to write_schemata() instead of the test name. The new API is also more flexible as it enables to use write_schemata() for more than one resource within a test. While touching the sprintf(), move the unnecessary %c that is always '=' directly into the format string. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Introduce generalized test frameworkIlpo Järvinen7-121/+148
Each test currently has a "run test" function in per test file and another resctrl_tests.c. The functions in resctrl_tests.c are almost identical. Generalize the one in resctrl_tests.c such that it can be shared between all of the tests. It makes adding new tests easier and removes the per test if () forests. Also add comment to CPU vendor IDs that they must be defined as bits for a bitmask. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Create struct for input parametersIlpo Järvinen7-71/+95
resctrl_tests reads a set of parameters and passes them individually for each tests which causes variations in the call signature between the tests. Add struct input_params to hold all input parameters. It can be easily passed to every test without varying the call signature. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Restore the CPU affinity after CAT testIlpo Järvinen4-9/+42
CAT test does not reset the CPU affinity after the benchmark. This is relatively harmless as is because CAT test is the last benchmark to run, however, more tests may be added later. Store the CPU affinity the first time taskset_benchmark() is run and add taskset_restore() which the test can call to reset the CPU mask to its original value. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Rewrite Cache Allocation Technology (CAT) testIlpo Järvinen4-199/+139
CAT test spawns two processes into two different control groups with exclusive schemata. Both the processes alloc a buffer from memory matching their allocated LLC block size and flush the entire buffer out of caches. Since the processes are reading through the buffer only once during the measurement and initially all the buffer was flushed, the test isn't testing CAT. Rewrite the CAT test to allocate a buffer sized to half of LLC. Then perform a sequence of tests with different LLC alloc sizes starting from half of the CBM bits down to 1-bit CBM. Flush the buffer before each test and read the buffer twice. Observe the LLC misses on the second read through the buffer. As the allocated LLC block gets smaller and smaller, the LLC misses will become larger and larger giving a strong signal on CAT working properly. The new CAT test is using only a single process because it relies on measured effect against another run of itself rather than another process adding noise. The rest of the system is set to use the CBM bits not used by the CAT test to keep the test isolated. Replace count_bits() with count_contiguous_bits() to get the first bit position in order to be able to calculate masks based on it. This change has been tested with a number of systems from different generations. Suggested-by: Reinette Chatre <[email protected]> Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Read in less obvious order to defeat prefetch optimizationsIlpo Järvinen1-8/+30
When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results. Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo. Using a prime multiplier with modulo ensures the entire buffer is eventually read. 23 is small enough that the reads are spread out but wrapping does not occur very frequently (wrapping too often can trigger L2 hits more frequently which causes noise to the test because getting the data from LLC is not required). It was discovered that not all primes work equally well and some can cause wildly unstable results (e.g., in an earlier version of this patch, the reads were done in reversed order and 59 was used as the prime resulting in unacceptably high and unstable results in MBA and MBM test on some architectures). Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/ Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Replace file write with volatile variableIlpo Järvinen3-21/+16
The fill_buf code prevents compiler optimizating the entire read loop away by writing the final value of the variable into a file. While it achieves the goal, writing into a file requires significant amount of work within the innermost test loop and also error handling. A simpler approach is to take advantage of volatile. Writing through a pointer to a volatile variable is enough to prevent compiler from optimizing the write away, and therefore compiler cannot remove the read loop either. Add a volatile 'value_sink' into resctrl_tests.c and make fill_buf to write into it. As a result, the error handling in fill_buf.c can be simplified. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2024-02-13selftests/resctrl: Open perf fd before start & add error handlingIlpo Järvinen3-11/+23
Perf fd (pe_fd) is opened, reset, and enabled during every test the CAT selftest runs. Also, ioctl(pe_fd, ...) calls are not error checked even if ioctl() could return an error. Open perf fd only once before the tests and only reset and enable the counter within the test loop. Add error checking to pe_fd ioctl() calls. Signed-off-by: Ilpo Järvinen <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Signed-off-by: Shuah Khan <[email protected]>