aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-01-06rcu/nocb: Set SEGCBLIST_SOFTIRQ_ONLY at the very last stage of de-offloadingFrederic Weisbecker1-1/+8
This commit sets SEGCBLIST_SOFTIRQ_ONLY once toggling is otherwise fully complete, allowing further RCU callback manipulation to be carried out locklessly and locally. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Flush bypass before setting SEGCBLIST_SOFTIRQ_ONLYFrederic Weisbecker1-1/+10
This commit flushes the bypass queue and sets state to avoid its being refilled before switching to the final de-offloaded state. To avoid refilling, this commit sets SEGCBLIST_SOFTIRQ_ONLY before re-enabling IRQs. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Shutdown nocb timer on de-offloadingFrederic Weisbecker2-1/+12
This commit ensures that the nocb timer is shut down before reaching the final de-offloaded state. The key goal is to prevent the timer handler from manipulating the callbacks without the protection of the nocb locks. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Re-offload supportFrederic Weisbecker2-22/+138
To re-offload the callback processing off of a CPU, it is necessary to clear SEGCBLIST_SOFTIRQ_ONLY, set SEGCBLIST_OFFLOADED, and then notify both the CB and GP kthreads so that they both set their own bit flag and start processing the callbacks remotely. The re-offloading worker is then notified that it can stop the RCU_SOFTIRQ handler (or rcuc kthread, as the case may be) from processing the callbacks locally. Ordering must be carefully enforced so that the callbacks that used to be processed locally without locking will have the same ordering properties when they are invoked by the nocb CB and GP kthreads. This commit makes this change. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> [ paulmck: Export rcu_nocb_cpu_offload(). ] Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: De-offloading GP kthreadFrederic Weisbecker1-3/+51
To de-offload callback processing back onto a CPU, it is necessary to clear SEGCBLIST_OFFLOAD and notify the nocb GP kthread, which will then clear its own bit flag and ignore this CPU until further notice. Whichever of the nocb CB and nocb GP kthreads is last to clear its own bit notifies the de-offloading worker kthread. Once notified, this worker kthread can proceed safe in the knowledge that the nocb CB and GP kthreads will no longer be manipulating this CPU's RCU callback list. This commit makes this change. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Don't deoffload an offline CPU with pending workFrederic Weisbecker1-0/+9
Offloaded CPUs do not migrate their callbacks, instead relying on their rcuo kthread to invoke them. But if the CPU is offline, it will be running neither its RCU_SOFTIRQ handler nor its rcuc kthread. This means that de-offloading an offline CPU that still has pending callbacks will strand those callbacks. This commit therefore refuses to toggle offline CPUs having pending callbacks. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Suggested-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: De-offloading CB kthreadFrederic Weisbecker5-22/+123
To de-offload callback processing back onto a CPU, it is necessary to clear SEGCBLIST_OFFLOAD and notify the nocb CB kthread, which will then clear its own bit flag and go to sleep to stop handling callbacks. This commit makes that change. It will also be necessary to notify the nocb GP kthread in this same way, which is the subject of a follow-on commit. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> [ paulmck: Add export per kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Always init segcblist on CPU upFrederic Weisbecker1-3/+9
How the rdp->cblist enabled state is treated at CPU-hotplug time depends on whether or not that ->cblist is offloaded. 1) Not offloaded: The ->cblist is disabled when the CPU goes down. All its callbacks are migrated and none can to enqueued until after some later CPU-hotplug operation brings the CPU back up. 2) Offloaded: The ->cblist is not disabled on CPU down because the CB/GP kthreads must finish invoking the remaining callbacks. There is thus no need to re-enable it on CPU up. Since the ->cblist offloaded state is set in stone at boot, it cannot change between CPU down and CPU up. So 1) and 2) are symmetrical. However, given runtime toggling of the offloaded state, there are two additional asymmetrical scenarios: 3) The ->cblist is not offloaded when the CPU goes down. The ->cblist is later toggled to offloaded and then the CPU comes back up. 4) The ->cblist is offloaded when the CPU goes down. The ->cblist is later toggled to no longer be offloaded and then the CPU comes back up. Scenario 4) is currently handled correctly. The ->cblist remains enabled on CPU down and gets re-initialized on CPU up. The toggling operation will wait until ->cblist is empty, so ->cblist will remain empty until CPU-up time. The scenario 3) would run into trouble though, as the rdp is disabled on CPU down and not re-initialized/re-enabled on CPU up. Except that in this case, ->cblist is guaranteed to be empty because all its callbacks were migrated away at CPU-down time. And the CPU-up code already initializes and enables any empty ->cblist structures in order to handle the possibility of early-boot invocations of call_rcu() in the case where such invocations don't occur. So all that need be done is to adjust the locking. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Provide basic callback offloading state machine bitsFrederic Weisbecker4-3/+128
Offloading and de-offloading RCU callback processes must be done carefully. There must never be a time at which callback processing is disabled because the task driving the offloading or de-offloading might be preempted or otherwise stalled at that point in time, which would result in OOM due to calbacks piling up indefinitely. This implies that there will be times during which a given CPU's callbacks might be concurrently invoked by both that CPU's RCU_SOFTIRQ handler (or, equivalently, that CPU's rcuc kthread) and by that CPU's rcuo kthread. This situation could fatally confuse both rcu_barrier() and the CPU-hotplug offlining process, so these must be excluded during any concurrent-callback-invocation period. In addition, during times of concurrent callback invocation, changes to ->cblist must be protected both as needed for RCU_SOFTIRQ and as needed for the rcuo kthread. This commit therefore defines and documents the states for a state machine that coordinates offloading and deoffloading. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/nocb: Turn enabled/offload states into a common flagFrederic Weisbecker3-7/+28
This commit gathers the rcu_segcblist ->enabled and ->offloaded property field into a single ->flags bitmask to avoid further proliferation of individual u8 fields in the structure. This change prepares for the state formerly known as ->offloaded state to be modified at runtime. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Thomas Gleixner <[email protected]> Inspired-by: Paul E. McKenney <[email protected]> Tested-by: Boqun Feng <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/segcblist: Add debug checks for segment lengthsJoel Fernandes (Google)3-2/+21
This commit adds debug checks near the end of rcu_do_batch() that emit warnings if an empty rcu_segcblist structure has non-zero segment counts, or, conversely, if a non-empty structure has all-zero segment counts. Signed-off-by: Joel Fernandes (Google) <[email protected]> [ paulmck: Fix queue/segment-length checks. ] Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/trace: Add tracing for how segcb list changesJoel Fernandes (Google)2-0/+35
This commit adds tracing to track how the segcb list changes before/after acceleration, during queuing and during dequeuing. This tracing helped discover an optimization that avoided needless GP requests when no callbacks were accelerated. The tracing overhead is minimal as each segment's length is now stored in the respective segment. Reviewed-by: Frederic Weisbecker <[email protected]> Reviewed-by: Neeraj Upadhyay <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/tree: segcblist: Remove redundant smp_mb()sJoel Fernandes (Google)2-2/+0
The full memory barriers in rcu_segcblist_enqueue() and in rcu_do_batch() are not needed because rcu_segcblist_add_len(), and thus also rcu_segcblist_inc_len(), already includes a memory barrier *before* and *after* the length of the list is updated. This commit therefore removes these redundant smp_mb() invocations. Reviewed-by: Frederic Weisbecker <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/segcblist: Add counters to segcblist datastructureJoel Fernandes (Google)4-46/+82
Add counting of segment lengths of segmented callback list. This will be useful for a number of things such as knowing how big the ready-to-execute segment have gotten. The immediate benefit is ability to trace how the callbacks in the segmented callback list change. Also this patch remove hacks related to using donecbs's ->len field as a temporary variable to save the segmented callback list's length. This cannot be done anymore and is not needed. Also fix SRCU: The negative counting of the unsegmented list cannot be used to adjust the segmented one. To fix this, sample the unsegmented length in advance, and use it after CB execution to adjust the segmented list's length. Reviewed-by: Frederic Weisbecker <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06rcu/segcblist: Add additional comments to explain smp_mb()Joel Fernandes (Google)1-4/+64
One counter-intuitive property of RCU is the fact that full memory barriers are needed both before and after updates to the full (non-segmented) length. This patch therefore helps to assist the reader's intuition by adding appropriate comments. [ paulmck: Wordsmithing. ] Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06doc: Use CONFIG_PREEMPTIONSebastian Andrzej Siewior6-24/+24
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality which today depends on CONFIG_PREEMPT. Update the documents and mention CONFIG_PREEMPTION. Spell out CONFIG_PREEMPT_RT (instead PREEMPT_RT) since it is an option now. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-06doc: Update RCU's requirements page about the PREEMPT_RT wikiSebastian Andrzej Siewior1-1/+1
The PREEMPT_RT wiki moved from kernel.org to the Linux Foundation wiki. The kernel.org wiki is read only. This commit therefore updates the URL of the active PREEMPT_RT wiki. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-05torture: Do Kconfig analysis only once per scenarioPaul E. McKenney1-7/+15
Currently, if a scenario is repeated as in "--configs '4*TREE01'", the Kconfig analysis is performed for each occurrance (four times in this example) and each analysis places the exact same data into the exact same files. This is not really an issue in this repetition-four example, but it can needlessly consume tens of seconds of wallclock time for something like "--config '128*TINY01'". This commit therefore does Kconfig analysis only once per set of repeats of a given scenario, courtesy of the "sort -u" command and an automatically generated awk script. While in the area, this commit also wordsmiths a comment. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu: Make TASKS_TRACE_RCU select IRQ_WORKPaul E. McKenney1-0/+1
Tasks Trace RCU uses irq_work_queue() to safely awaken its grace-period kthread, so this commit therefore causes the TASKS_TRACE_RCU Kconfig option select the IRQ_WORK Kconfig option. Reported-by: kernel test robot <[email protected]> Acked-by: Randy Dunlap <[email protected]> # build-tested Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu-tasks: Add RCU-tasks self testsUladzislau Rezki (Sony)1-0/+79
This commit adds self tests for early-boot use of RCU-tasks grace periods. It tests all three variants (Rude, Tasks, and Tasks Trace) and covers both synchronous (e.g., synchronize_rcu_tasks()) and asynchronous (e.g., call_rcu_tasks()) grace-period APIs. Self-tests are run only in kernels built with CONFIG_PROVE_RCU=y. Signed-off-by: Uladzislau Rezki (Sony) <[email protected]> [ paulmck: Handle CONFIG_PROVE_RCU=n and identify test cases' callbacks. ] Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu: Add lockdep_assert_irqs_disabled() to raw_spin_unlock_rcu_node() macrosPaul E. McKenney1-3/+13
This commit adds a lockdep_assert_irqs_disabled() call to the helper macros that release the rcu_node structure's ->lock, namely to raw_spin_unlock_rcu_node(), raw_spin_unlock_irq_rcu_node() and raw_spin_unlock_irqrestore_rcu_node(). The point of this is to help track down a situation where lockdep appears to be insisting that interrupts are enabled while holding an rcu_node structure's ->lock. Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu: Add lockdep_assert_irqs_disabled() to rcu_sched_clock_irq() and calleesPaul E. McKenney3-0/+13
This commit adds a number of lockdep_assert_irqs_disabled() calls to rcu_sched_clock_irq() and a number of the functions that it calls. The point of this is to help track down a situation where lockdep appears to be insisting that interrupts are enabled within these functions, which should only ever be invoked from the scheduling-clock interrupt handler. Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04locking: Remove duplicate include of percpu-rwsem.hWang Qing1-1/+0
This commit removes an unnecessary #include. Signed-off-by: Wang Qing <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04sched/core: Allow try_invoke_on_locked_down_task() with irqs disabledPeter Zijlstra1-5/+4
The try_invoke_on_locked_down_task() function currently requires that interrupts be enabled, but it is called with interrupts disabled from rcu_print_task_stall(), resulting in an "IRQs not enabled as expected" diagnostic. This commit therefore updates try_invoke_on_locked_down_task() to use raw_spin_lock_irqsave() instead of raw_spin_lock_irq(), thus allowing use from either context. Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/lkml/[email protected]/ Reported-by: [email protected] Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Allow standalone kvm-recheck.sh run detect --trust-makePaul E. McKenney1-1/+1
Normally, kvm-recheck.sh is run from kvm.sh, which provides the TORTURE_TRUST_MAKE environment variable that, if a non-empty string, indicates that the --trust-make command-line parameter has been passed to kvm.sh. If there was no --trust-make, kvm-recheck.sh insists that the Make.out file contain at least one "CC" command. Thus, when kvm-recheck.sh is run standalone to evaluate a prior --trust-make run, it will incorrectly insist that a proper kernel build did not happen. This commit therefore causes kvm-recheck.sh to also search the "log" file in the top-level results directory for the string "--trust-make". Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Remove "Failed to add ttynull console" false positivePaul E. McKenney2-1/+2
Commit 757055ae8ded ("init/console: Use ttynull as a fallback when there is no console") results in the string "Warning: Failed to add ttynull console. No stdin, stdout, and stderr for the init process!" appearing on the console, which the rcutorture scripting interprets as a warning, which causes every rcutorture run to be flagged. However, the rcutorture init process never attempts to do any I/O, and thus does not care that it has no stdin, stdout, or stderr. This commit therefore causes the rcutorture scripting to ignore this message. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Simplify exit-code plumbing for kvm-recheck.sh and kvm-find-errors.shPaul E. McKenney2-1/+3
This commit simplifies exit-code plumbing. It makes kvm-recheck.sh return the value 1 for a build error and 2 for a runtime error. It also makes kvm-find-errors.sh avoid checking runtime files for --build-only runs. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: s/STOP/STOP.1/ to avoid scenario collisionPaul E. McKenney2-5/+5
This commit changes the "STOP" file that is used to cleanly halt a running rcutorture run to "STOP.1" because no scenario directory will ever end with ".1". If there really was a scenario named "STOP", its directories would instead be named "STOP", "STOP.2", "STOP.3", and so on. While in the area, the commit also changes the kernel-run-time checks for this file to look directly in the directory above $resdir, thus avoiding the need to pass the TORTURE_STOPFILE environment variable to remote systems. While in the area, move the STOP.1 file to the top-level directory covering all of the scenarios. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Add --dryrun batches to help schedule a distributed runPaul E. McKenney1-5/+17
When all of the remote systems have the same number of CPUs, one approach is to use one "--buildonly" run and one "--dryrun sched" run, and then distributing the batches out one per remote system. However, the output of "--dryrun sched" is not made for parsing, so this commit adds a "--dryrun batches" that provides the same information in easily parsed form. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Stop hanging on panicPaul E. McKenney1-0/+1
By default, the "panic" kernel parameter is zero, which causes the kernel to loop indefinitely after a panic(). The rcutorture scripting will eventually kill the corresponding qemu process, but only after waiting for the full run duration plus a few minutes. This works, but delays notifying the developer of the failure. This commit therefore causes the rcutorture scripting to pass the "panic=-1" kernel parameter, which caused the kernel to instead unceremoniously shut down immediately. This in turn causes qemu to terminate, so that if all of the runs in a given batch panic(), the rcutorture scripting can immediately proceed to the next batch. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Add kvm.sh test summary to end of log filePaul E. McKenney1-8/+11
This commit adds the test summary to the end of the log in the top-level directory containing the kvm.sh test artifacts. While in the area, it adds the kvm.sh exit code to this test summary. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make kvm.sh include --kconfig arguments in CPU calculationPaul E. McKenney1-1/+7
Currently, passing something like "--kconfig CONFIG_NR_CPUS=2" to kvm.sh has no effect on scenario scheduling. For scenarios that do not specify the number of CPUs, this can result in kvm.sh wastefully scheduling only one scenario at a time even when the --kconfig argument would allow a number to be run concurrently. This commit therefore makes kvm.sh consider the --kconfig arguments when scheduling scenarios across the available CPUs. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make kvm.sh return failure upon build failurePaul E. McKenney1-1/+7
The kvm.sh script uses kvm-find-errors.sh to evaluate whether or not a build failed. Unfortunately, kvm-find-errors.sh returns success if there are no failed runs (including when there are no runs at all) even if there are build failures. This commit therefore makes kvm-find-errors.sh return failure in response to build failures. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Print run duration at end of kvm.sh executionPaul E. McKenney2-0/+39
Yes, you can mentally subtract the timestamps, but this commit makes the computer do this work. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make kvm.sh arguments accumulatePaul E. McKenney1-6/+6
Given that kvm.sh in invoked from scripts, it is only natural for different levels of scripting to provide their own Kconfig option values, for example. Unfortunately, right now, the last such argument on the command line wins. This commit therefore makes the --bootargs, --configs, --kconfigs, --kmake-args, and --qemu-args argument values accumulate. For example, where "--configs TREE01 --configs TREE02" would previously have run only scenario TREE02, now it will run both scenarios. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make kvm.sh "Test Summary" date be end of testPaul E. McKenney1-1/+3
Currently, the "date" command producing the output on the kvm.sh "Test Summary" line is executed at the beginning of the test, which produces a date that is less than helpful to someone wanting to know the duration of the test. This commit therefore defers this command's execution to the end of the test. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04tools/rcutorture: Make identify_qemu_vcpus() independent of local languageFrederic Weisbecker1-1/+1
The rcutorture scripts' identify_qemu_vcpus() function expects `lscpu` to have a "CPU: " line, for example: CPU(s): 8 But different local language settings can give different results: Processeur(s) : 8 As a result, identify_qemu_vcpus() may return an empty string, resulting in the following warning (with the same local language settings): kvm-test-1-run.sh: ligne 138 : test: : nombre entier attendu comme expression This commit therefore changes identify_qemu_vcpus() to use getconf, which produces local-language-independend output. Cc: Josh Triplett <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: [email protected] Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Add config2csv.sh script to compare torture scenariosPaul E. McKenney1-0/+67
This commit adds a config2csv.sh script that converts the specified torture-test scenarios' Kconfig options and kernel-boot parameters to .csv format. This allows easier comparison of scenarios when one fails and another does not. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Prepare for splitting qemu execution from kvm-test-1-run.shPaul E. McKenney1-1/+3
Distributed execution of rcutorture is eased if the qemu execution can be split from the building of the kernel, as this allows target systems to be used that are not set up to build kernels. It also avoids issues with toolchain version skew across the cluster, aside of course from qemu and KVM version skew. This commit therefore records needed data as comments in the qemu-cmd file and moves recording of the starting time to just before qemu is launched. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Allow kvm.sh --datestamp to specify subdirectoriesPaul E. McKenney1-2/+2
Scripts like kvm-check-branches.sh group runs under a single directory in resdir in order to allow easier retrospective analysis. However, they do this by letting kvm.sh create a directory as usual and then moving it after the run. This can be very confusing when looking at the results while kvm-check-branches.sh is running. This commit therefore enables --datestamp to hand subdirectories to kvm.sh. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make kvm.sh "--dryrun sched" summarize number of buildsPaul E. McKenney1-0/+4
Knowing the number of builds that kvm.sh will split a run into allows estimation of the duration of a test, give or take build duration. This commit therefore adds a line of output to "--dryrun sched" that gives the number of builds that will be run. This excludes "builds" for repeated scenarios that reuse an earlier build. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make kvm.sh "--dryrun sched" summarize number of batchesPaul E. McKenney1-0/+2
Knowing the number of batches that kvm.sh will split a run into allows estimation of the duration of a test, give or take the number of builds. This commit therefore adds a line of output to "--dryrun sched" that gives the number of batches that will be run. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04torture: Make --kcsan specify lockdepPaul E. McKenney1-1/+1
The --kcsan argument to kvm.sh adds CONFIG_KCSAN_VERBOSE=y in order to get more detail from the KCSAN reports. However, this Kconfig option requires lockdep to be enabled. This commit therefore causes --kcsan to also enable lockdep. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu: Do not NMI offline CPUsPaul E. McKenney1-5/+12
Currently, RCU CPU stall warning messages will NMI whatever CPU looks like it is blocking either the current grace period or the grace-period kthread. This can produce confusing output if the target CPU is offline. This commit therefore checks for offline CPUs. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu: For RCU grace-period kthread starvation, dump last CPU it ran onPaul E. McKenney1-1/+8
When the RCU CPU stall-warning code detects that the RCU grace-period kthread is being starved, it dumps that kthread's stack. This can sometimes be useful, but it is also useful to know what is running on the CPU that this kthread is attempting to run on. This commit therefore adds a stack trace of this CPU in order to help track down whatever it is that might be preventing RCU's grace-period kthread from running. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcu: Mark obtuse portion of stall warning as internal debugPaul E. McKenney1-1/+1
There is a rather obtuse string that can be printed as part of an expedited RCU CPU stall-warning message that starts with "blocking rcu_node structures". Under normal conditions, most of this message is just repeating the list of CPUs blocking the current expedited grace period, but in a manner that is rather difficult to read. This commit therefore marks this message as "(internal RCU debug)" in an effort to give people the option of avoiding wasting time attempting to extract nonexistent additional meaning from this portion of the message. Reported-by: Jonathan Lemon <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04scftorture: Add debug output for wrong-CPU warningPaul E. McKenney1-1/+5
This commit adds the desired CPU, the actual CPU, and nr_cpu_ids to the wrong-CPU warning in scftorture_invoker(), the better to help with debugging. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcutorture: Add testing for RCU's global memory orderingPaul E. McKenney1-6/+92
RCU guarantees that anything seen by a given reader will also be seen after any grace period that must wait on that reader. This is very likely to hold based on inspection, but the advantage of having rcutorture do the inspecting is that rcutorture doesn't mind inspecting frequently and often. This commit therefore adds code to test RCU's global memory ordering. Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcutorture: Add reader-side tests of polling grace-period APIPaul E. McKenney1-0/+10
This commit adds reader-side testing of the polling grace-period API. This testing verifies that a cookie obtained in an SRCU read-side critical section does not get a true return from poll_state_synchronize_srcu() within that same critical section. Link: https://lore.kernel.org/rcu/[email protected]/ Reported-by: Kent Overstreet <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2021-01-04rcutorture: Add writer-side tests of polling grace-period APIPaul E. McKenney1-7/+72
This commit adds writer-side testing of the polling grace-period API. One test verifies that the polling API sees a grace period caused by some other mechanism. Another test verifies that using the polling API to wait for a grace period does not result in too-short grace periods. A third test verifies that the polling API does not report completion within a read-side critical section. A fourth and final test verifies that the polling API does report completion given an intervening grace period. Link: https://lore.kernel.org/rcu/[email protected]/ Reported-by: Kent Overstreet <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>