aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2022-06-21context_tracking: Rename __context_tracking_enter/exit() to ↵Frederic Weisbecker1-10/+10
__ct_user_enter/exit() The context tracking namespace is going to expand and some new functions will require even longer names. Start shrinking the context_tracking prefix to "ct" as is already the case for some existing macros, this will make the introduction of new functions easier. Acked-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicolas Saenz Julienne <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Xiongfeng Wang <[email protected]> Cc: Yu Liao <[email protected]> Cc: Phil Auld <[email protected]> Cc: Paul Gortmaker<[email protected]> Cc: Alex Belits <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Nicolas Saenz Julienne <[email protected]> Tested-by: Nicolas Saenz Julienne <[email protected]>
2022-06-21refscale: Convert test_lock spinlock to raw_spinlockZqiang1-9/+9
In kernels built with CONFIG_PREEMPT_RT=y, spinlocks are replaced by rt_mutex, which can sleep. This means that acquiring a non-raw spinlock in a critical section where preemption is disabled can trigger the following BUG: BUG: scheduling while atomic: ref_scale_reade/76/0x00000002 Preemption disabled at: ref_lock_section+0x16/0x80 Call Trace: <TASK> dump_stack_lvl+0x5b/0x82 dump_stack+0x10/0x12 __schedule_bug.cold+0x9c/0xad __schedule+0x839/0xc00 schedule_rtlock+0x22/0x40 rtlock_slowlock_locked+0x460/0x1350 rt_spin_lock+0x61/0xe0 ref_lock_section+0x29/0x80 rcu_scale_one_reader+0x52/0x60 ref_scale_reader+0x28d/0x490 kthread+0x128/0x150 ret_from_fork+0x22/0x30 </TASK> This commit therefore converts spinlock to raw_spinlock. Signed-off-by: Zqiang <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcutorture: Handle failure of memory allocation functionsLi Qiong1-0/+10
This commit adds warnings for allocation failure during the mem_dump_obj() tests. It also terminates these tests upon such failure. Signed-off-by: Li Qiong <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcutorture: Fix ksoftirqd boosting timing and iterationFrederic Weisbecker1-15/+13
The RCU priority boosting can fail in two situations: 1) If (nr_cpus= > maxcpus=), which means if the total number of CPUs is higher than those brought online at boot, then torture_onoff() may later bring up CPUs that weren't online on boot. Now since rcutorture initialization only boosts the ksoftirqds of the CPUs that have been set online on boot, the CPUs later set online by torture_onoff won't benefit from the boost, making RCU priority boosting fail. 2) The ksoftirqd kthreads are boosted after the creation of rcu_torture_boost() kthreads, which opens a window large enough for these rcu_torture_boost() kthreads to wait (despite running at FIFO priority) for ksoftirqds that are still running at SCHED_NORMAL priority. The issues can trigger for example with: ./kvm.sh --configs TREE01 --kconfig "CONFIG_RCU_BOOST=y" [ 34.968561] rcu-torture: !!! [ 34.968627] ------------[ cut here ]------------ [ 35.014054] WARNING: CPU: 4 PID: 114 at kernel/rcu/rcutorture.c:1979 rcu_torture_stats_print+0x5ad/0x610 [ 35.052043] Modules linked in: [ 35.069138] CPU: 4 PID: 114 Comm: rcu_torture_sta Not tainted 5.18.0-rc1 #1 [ 35.096424] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014 [ 35.154570] RIP: 0010:rcu_torture_stats_print+0x5ad/0x610 [ 35.198527] Code: 63 1b 02 00 74 02 0f 0b 48 83 3d 35 63 1b 02 00 74 02 0f 0b 48 83 3d 21 63 1b 02 00 74 02 0f 0b 48 83 3d 0d 63 1b 02 00 74 02 <0f> 0b 83 eb 01 0f 8e ba fc ff ff 0f 0b e9 b3 fc ff f82 [ 37.251049] RSP: 0000:ffffa92a0050bdf8 EFLAGS: 00010202 [ 37.277320] rcu: De-offloading 8 [ 37.290367] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001 [ 37.290387] RDX: 0000000000000000 RSI: 00000000ffffbfff RDI: 00000000ffffffff [ 37.290398] RBP: 000000000000007b R08: 0000000000000000 R09: c0000000ffffbfff [ 37.290407] R10: 000000000000002a R11: ffffa92a0050bc18 R12: ffffa92a0050be20 [ 37.290417] R13: ffffa92a0050be78 R14: 0000000000000000 R15: 000000000001bea0 [ 37.290427] FS: 0000000000000000(0000) GS:ffff96045eb00000(0000) knlGS:0000000000000000 [ 37.290448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 37.290460] CR2: 0000000000000000 CR3: 000000001dc0c000 CR4: 00000000000006e0 [ 37.290470] Call Trace: [ 37.295049] <TASK> [ 37.295065] ? preempt_count_add+0x63/0x90 [ 37.295095] ? _raw_spin_lock_irqsave+0x12/0x40 [ 37.295125] ? rcu_torture_stats_print+0x610/0x610 [ 37.295143] rcu_torture_stats+0x29/0x70 [ 37.295160] kthread+0xe3/0x110 [ 37.295176] ? kthread_complete_and_exit+0x20/0x20 [ 37.295193] ret_from_fork+0x22/0x30 [ 37.295218] </TASK> Fix this with boosting the ksoftirqds kthreads from the boosting hotplug callback itself and before the boosting kthreads are created. Fixes: ea6d962e80b6 ("rcutorture: Judge RCU priority boosting on grace periods, not callbacks") Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcuscale: Fix smp_processor_id()-in-preemptible warningsZqiang1-0/+1
Systems built with CONFIG_DEBUG_PREEMPT=y can trigger the following BUG while running the rcuscale performance test: BUG: using smp_processor_id() in preemptible [00000000] code: rcu_scale_write/69 CPU: 0 PID: 66 Comm: rcu_scale_write Not tainted 5.18.0-rc7-next-20220517-yoctodev-standard+ caller is debug_smp_processor_id+0x17/0x20 Call Trace: <TASK> dump_stack_lvl+0x49/0x5e dump_stack+0x10/0x12 check_preemption_disabled+0xdf/0xf0 debug_smp_processor_id+0x17/0x20 rcu_scale_writer+0x2b5/0x580 kthread+0x177/0x1b0 ret_from_fork+0x22/0x30 </TASK> Reproduction method: runqemu kvm slirp nographic qemuparams="-m 4096 -smp 8" bootparams="isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3 rcutree.dump_tree=1 rcuscale.shutdown=false rcuscale.gp_async=true" -d The problem is that the rcu_scale_writer() kthreads fail to set the PF_NO_SETAFFINITY flags, which causes is_percpu_thread() to assume that the kthread's affinity might change at any time, thus the BUG noted above. This commit therefore causes rcu_scale_writer() to set PF_NO_SETAFFINITY in its kthread's ->flags field, thus preventing this BUG. Signed-off-by: Zqiang <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcutorture: Make failure indication note reader-batch overflowPaul E. McKenney1-1/+1
The loop scanning the pipesummary[] array currently skips the last element, which means that the diagnostics ignore those rarest of situations, namely where some readers persist across more than ten grace periods, but all other readers avoid spanning a full grace period. This commit therefore adjusts the scan to include the last element of this array. Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcutorture: Fix memory leak in rcu_test_debug_objects()Zqiang1-0/+1
The kernel memory leak detector located the following: unreferenced object 0xffff95d941135b50 (size 16): comm "swapper/0", pid 1, jiffies 4294667610 (age 1367.451s) hex dump (first 16 bytes): f0 c6 c2 bd d9 95 ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000bc81d9b1>] kmem_cache_alloc_trace+0x2f6/0x500 [<00000000d28be229>] rcu_torture_init+0x1235/0x1354 [<0000000032c3acd9>] do_one_initcall+0x51/0x210 [<000000003c117727>] kernel_init_freeable+0x205/0x259 [<000000003961f965>] kernel_init+0x1a/0x120 [<000000001998f890>] ret_from_fork+0x22/0x30 This is caused by the rcu_test_debug_objects() function allocating an rcu_head structure, then failing to free it. This commit therefore adds the needed kfree() after the last use of this structure. Signed-off-by: Zqiang <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcutorture: Simplify rcu_torture_read_exit_child() loopPaul E. McKenney1-27/+20
The existing loop has an implicit manual loop that obscures the flow and requires an extra control variable. This commit makes this implicit loop explicit, thus saving several lines of code. Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcu/torture: Change order of warning and trace dumpAnna-Maria Behnsen1-1/+2
Dumping a big ftrace buffer could lead to a RCU stall. So there is the ftrace buffer and the stall information which needs to be printed. When there is additionally a WARN_ON() which describes the reason for the ftrace buffer dump and the WARN_ON() is executed _after_ ftrace buffer dump, the information get lost in the middle of the RCU stall information. Therefore print WARN_ON() message before dumping the ftrace buffer in rcu_torture_writer(). [ paulmck: Add tracing_off() to avoid cruft from WARN(). ] Signed-off-by: Anna-Maria Behnsen <[email protected]> Reviewed-by: Benedikt Spranger <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcu-tasks: Use delayed_work to delay rcu_tasks_verify_self_tests()Waiman Long1-5/+32
Commit 2585014188d5 ("rcu-tasks: Be more patient for RCU Tasks boot-time testing") fixes false positive rcu_tasks verification check failure by repeating the test once every second until timeout using schedule_timeout_uninterruptible(). Since rcu_tasks_verify_selft_tests() is called from do_initcalls() as a late_initcall, this has the undesirable side effect of delaying other late_initcall's queued after it by a second or more. Fix this by instead using delayed_work to repeat the verification check. Fixes: 2585014188d5 ("rcu-tasks: Be more patient for RCU Tasks boot-time testing") Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-21rcu-tasks: Be more patient for RCU Tasks boot-time testingPaul E. McKenney1-7/+21
The RCU-Tasks family of grace-period primitives can take some time to complete, and the amount of time can depend on the exact hardware and software configuration. Some configurations boot up fast enough that the RCU-Tasks verification process gets false-positive failures. This commit therefore allows up to 30 seconds for the grace periods to complete, with this value adjustable downwards using the rcupdate.rcu_task_stall_timeout kernel boot parameter. Reported-by: Matthew Wilcox <[email protected]> Reported-by: Zhouyi Zhou <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Tested-by: Zhouyi Zhou <[email protected]> Tested-by: Mark Rutland <[email protected]>
2022-06-21rcu-tasks: Update commentsPaul E. McKenney1-38/+33
This commit updates comments to reflect the changes in the series of commits that eliminated the full task-list scan. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-21rcu-tasks: Disable and enable CPU hotplug in same functionPaul E. McKenney1-3/+3
The rcu_tasks_trace_pregp_step() function invokes cpus_read_lock() to disable CPU hotplug, and a later call to the rcu_tasks_trace_postscan() function invokes cpus_read_unlock() to re-enable it. This was absolutely necessary in the past in order to protect the intervening scan of the full tasks list, but there is no longer such a scan. This commit therefore improves readability by moving the cpus_read_unlock() call to the end of the rcu_tasks_trace_pregp_step() function. This commit is a pure code-motion commit without any (intended) change in functionality. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-21rcu-tasks: Eliminate RCU Tasks Trace IPIs to online CPUsPaul E. McKenney2-17/+39
Currently, the RCU Tasks Trace grace-period kthread IPIs each online CPU using smp_call_function_single() in order to track any tasks currently in RCU Tasks Trace read-side critical sections during which the corresponding task has neither blocked nor been preempted. These IPIs are annoying and are also not strictly necessary because any task that blocks or is preempted within its current RCU Tasks Trace read-side critical section will be tracked on one of the per-CPU rcu_tasks_percpu structure's ->rtp_blkd_tasks list. So the only time that this is a problem is if one of the CPUs runs through a long-duration RCU Tasks Trace read-side critical section without a context switch. Note that the task_call_func() function cannot help here because there is no safe way to identify the target task. Of course, the task_call_func() function will be very useful later, when processing the list of tasks, but it needs to know the task. This commit therefore creates a cpu_curr_snapshot() function that returns a pointer the task_struct structure of some task that happened to be running on the specified CPU more or less during the time that the cpu_curr_snapshot() function was executing. If there was no context switch during this time, this function will return a pointer to the task_struct structure of the task that was running throughout. If there was a context switch, then the outgoing task will be taken care of by RCU's context-switch hook, and the incoming task was either already taken care during some previous context switch, or it is not currently within an RCU Tasks Trace read-side critical section. And in this latter case, the grace period already started, so there is no need to wait on this task. This new cpu_curr_snapshot() function is invoked on each CPU early in the RCU Tasks Trace grace-period processing, and the resulting tasks are queued for later quiescent-state inspection. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-21rcu-tasks: Maintain a count of tasks blocking RCU Tasks Trace grace periodPaul E. McKenney1-1/+5
This commit maintains a new n_trc_holdouts counter that tracks the number of tasks blocking the RCU Tasks grace period. This counter is useful for debugging, and its value has been added to a diagostic message. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-21rcu-tasks: Stop RCU Tasks Trace from scanning full tasks listPaul E. McKenney1-5/+6
This commit takes off the training wheels and relies only on scanning currently running tasks and tasks that have blocked or been preempted within their current RCU Tasks Trace read-side critical section. Before this commit, the time complexity of an RCU Tasks Trace grace period is O(T), where T is the number of tasks. After this commit, this time complexity is O(C+B), where C is the number of CPUs and B is the number of tasks that have blocked (or been preempted) at least once during their current RCU Tasks Trace read-side critical sections. Of course, if all tasks have blocked (or been preempted) at least once during their current RCU Tasks Trace read-side critical sections, this is still O(T), but current expectations are that RCU Tasks Trace read-side critical section will be short and that there will normally not be large numbers of tasks blocked within such a critical section. Dave Marchevsky kindly measured the effects of this commit on the RCU Tasks Trace grace-period latency and the rcu_tasks_trace_kthread task's CPU consumption per RCU Tasks Trace grace period over the course of a fixed test, all in milliseconds: Before After GP latency 22.3 ms stddev > 0.1 17.0 ms stddev < 0.1 GP CPU 2.3 ms stddev 0.3 1.1 ms stddev 0.2 This was on a system with 15,000 tasks, so it is reasonable to expect much larger savings on the systems on which this issue was first noted, given that they sport well in excess of 100,000 tasks. CPU consumption was measured using profiling techniques. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]> Tested-by: Dave Marchevsky <[email protected]>
2022-06-21PM: hibernate: Use kernel_can_power_off()Dmitry Osipenko1-1/+1
Use new kernel_can_power_off() API instead of legacy pm_power_off global variable to fix regressed hibernation to disk where machine no longer powers off when it should because ACPI power driver transitioned to the new sys-off based API and it doesn't use pm_power_off anymore. Fixes: 98f30d0ecf79 ("ACPI: power: Switch to sys-off handler API") Tested-by: Ken Moffat <[email protected]> Reported-by: Ken Moffat <[email protected]> Signed-off-by: Dmitry Osipenko <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-06-21rcutorture: Update rcutorture.fwd_progress help textPaul E. McKenney1-35/+18
This commit updates the rcutorture.fwd_progress help text to say that it is the number of forward-progress kthreads to spawn rather than the old enable/disable functionality. While in the area, make the list of torture-test parameters easier to read by taking advantage of 100 columns. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Neeraj Upadhyay <[email protected]>
2022-06-21bpf, x64: Add predicate for bpf2bpf with tailcalls support in JITTony Ambardar2-1/+8
The BPF core/verifier is hard-coded to permit mixing bpf2bpf and tail calls for only x86-64. Change the logic to instead rely on a new weak function 'bool bpf_jit_supports_subprog_tailcalls(void)', which a capable JIT backend can override. Update the x86-64 eBPF JIT to reflect this. Signed-off-by: Tony Ambardar <[email protected]> [jakub: drop MIPS bits and tweak patch subject] Signed-off-by: Jakub Sitnicki <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2022-06-20bpf: Inline calls to bpf_loop when callback is knownEduard Zingerman2-9/+180
Calls to `bpf_loop` are replaced with direct loops to avoid indirection. E.g. the following: bpf_loop(10, foo, NULL, 0); Is replaced by equivalent of the following: for (int i = 0; i < 10; ++i) foo(i, NULL); This transformation could be applied when: - callback is known and does not change during program execution; - flags passed to `bpf_loop` are always zero. Inlining logic works as follows: - During execution simulation function `update_loop_inline_state` tracks the following information for each `bpf_loop` call instruction: - is callback known and constant? - are flags constant and zero? - Function `optimize_bpf_loop` increases stack depth for functions where `bpf_loop` calls can be inlined and invokes `inline_bpf_loop` to apply the inlining. The additional stack space is used to spill registers R6, R7 and R8. These registers are used as loop counter, loop maximal bound and callback context parameter; Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op. Signed-off-by: Eduard Zingerman <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20uprobe: gate bpf call behind BPF_EVENTSDelyan Kratunov1-0/+2
The call into bpf from uprobes needs to be gated now that it doesn't use the trace_events.h helpers. Randy found this as a randconfig build failure on linux-next [1]. [1]: https://lore.kernel.org/linux-next/[email protected]/ Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Delyan Kratunov <[email protected]> Tested-by: Randy Dunlap <[email protected]> Acked-by: Randy Dunlap <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2022-06-20context_tracking: Add a note about noinstr VS unsafe context tracking functionsFrederic Weisbecker1-0/+34
Some context tracking functions enter or exit into/from RCU idle mode while using trace-able and lockdep-aware IRQs (un-)masking. As a result those functions can't get tagged as noinstr. This is unlikely to be fixed since these are obsolete APIs. Drop a note about this matter. [ paulmck: Apply Peter Zijlstra feedback. ] Reported-by: Peter Zijlstra <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicolas Saenz Julienne <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Xiongfeng Wang <[email protected]> Cc: Yu Liao <[email protected]> Cc: Phil Auld <[email protected]> Cc: Paul Gortmaker<[email protected]> Cc: Alex Belits <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Nicolas Saenz Julienne <[email protected]> Tested-by: Nicolas Saenz Julienne <[email protected]>
2022-06-20rcu: Apply noinstr to rcu_idle_enter() and rcu_idle_exit()Paul E. McKenney1-7/+7
This commit applies the "noinstr" tag to the rcu_idle_enter() and rcu_idle_exit() functions, which are invoked from portions of the idle loop that cannot be instrumented. These tags require reworking the rcu_eqs_enter() and rcu_eqs_exit() functions that these two functions invoke in order to cause them to use normal assertions rather than lockdep. In addition, within rcu_idle_exit(), the raw versions of local_irq_save() and local_irq_restore() are used, again to avoid issues with lockdep in uninstrumented code. This patch is based in part on an earlier patch by Jiri Olsa, discussions with Peter Zijlstra and Frederic Weisbecker, earlier changes by Thomas Gleixner, and off-list discussions with Yonghong Song. Link: https://lore.kernel.org/lkml/[email protected]/ Reported-by: Jiri Olsa <[email protected]> Reported-by: Alexei Starovoitov <[email protected]> Reported-by: Andrii Nakryiko <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Yonghong Song <[email protected]>
2022-06-20rcu: Dump rcuc kthread status for CPUs not reporting quiescent stateZqiang1-28/+21
If the rcutree.use_softirq kernel boot parameter is disabled, then it is possible that a RCU CPU stall is due to the rcuc kthreads being starved of CPU time. There is currently no easy way to infer this from the RCU CPU stall warning output. This commit therefore adds a string of the form " rcuc=%ld jiffies(starved)" to a given CPU's output if the corresponding rcuc kthread has been starved for more than two seconds. [ paulmck: Eliminate extraneous space characters. ] Signed-off-by: Zqiang <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-20rcu-tasks: Stop RCU Tasks Trace from scanning idle tasksPaul E. McKenney1-7/+1
Now that RCU scans both running tasks and tasks that have blocked within their current RCU Tasks Trace read-side critical section, there is no need for it to scan the idle tasks. After all, an idle loop should not be remain within an RCU Tasks Trace read-side critical section across exit from idle, and from a BPF viewpoint, functions invoked from the idle loop should not sleep. So only running idle tasks can be within RCU Tasks Trace read-side critical sections. This commit therefore removes the scan of the idle tasks from the rcu_tasks_trace_postscan() function. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Pull in tasks blocked within RCU Tasks Trace readersPaul E. McKenney1-0/+24
This commit scans each CPU's ->rtp_blkd_tasks list, adding them to the list of holdout tasks. This will cause the current RCU Tasks Trace grace period to wait until these tasks exit their RCU Tasks Trace read-side critical sections. This commit will enable later work omitting the scan of the full task list. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Scan running tasks for RCU Tasks Trace readersPaul E. McKenney1-11/+40
A running task might be within an RCU Tasks Trace read-side critical section for any length of time, but will not be placed on any of the per-CPU rcu_tasks_percpu structure's ->rtp_blkd_tasks lists. Therefore any RCU Tasks Trace grace-period processing that does not scan the full task list must interact with the running tasks. This commit therefore causes the rcu_tasks_trace_pregp_step() function to IPI each CPU in order to place the corresponding task on the holdouts list and to record whether or not it was in an RCU Tasks Trace read-side critical section. Yes, it is possible to avoid adding it to that list if it is not a reader, but that would prevent the system from remembering that this task was in a quiescent state. Which is why the running tasks are unconditionally added to the holdout list. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Avoid rcu_tasks_trace_pertask() duplicate list additionsPaul E. McKenney1-3/+4
This commit adds checks within rcu_tasks_trace_pertask() to avoid duplicate (and destructive) additions to the holdouts list. These checks will be required later due to the possibility of a given task having blocked while in an RCU Tasks Trace read-side critical section, but now running on a CPU. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Move rcu_tasks_trace_pertask() before rcu_tasks_trace_pregp_step()Paul E. McKenney1-14/+14
This is a code-motion-only commit that moves rcu_tasks_trace_pertask() to precede rcu_tasks_trace_pregp_step(), so that the latter will be able to invoke the other without forward references. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Add blocked-task indicator to RCU Tasks Trace stall warningsPaul E. McKenney1-1/+2
This commit adds a "B" indicator to the RCU Tasks Trace CPU stall warning when the task has blocked within its current read-side critical section. This serves as a debugging aid. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Untrack blocked RCU Tasks Trace at reader endPaul E. McKenney1-6/+18
This commit causes rcu_read_unlock_trace() to check for the current task being on a per-CPU list within the rcu_tasks_percpu structure, and removes it from that list if so. This has the effect of curtailing tracking of a task that blocked within an RCU Tasks Trace read-side critical section once it exits that critical section. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Track blocked RCU Tasks Trace readersPaul E. McKenney1-1/+21
This commit places any task that has ever blocked within its current RCU Tasks Trace read-side critical section on a per-CPU list within the rcu_tasks_percpu structure. Tasks are removed from this list when they exit by the exit_tasks_rcu_finish_trace() function. The purpose of this commit is to provide the information needed to eliminate the current scan of the full task list. This commit offsets the INT_MIN value for ->trc_reader_nesting with the new nesting level in order to avoid queueing tasks that are exiting their read-side critical sections. [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Apply feedback from [email protected] ] Signed-off-by: Paul E. McKenney <[email protected]> Tested-by: syzbot <[email protected]> Tested-by: "Zhang, Qiang1" <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Add data structures for lightweight grace periodsPaul E. McKenney2-0/+5
This commit adds fields to task_struct and to rcu_tasks_percpu that will be used to avoid the task-list scan for RCU Tasks Trace grace periods, and also initializes these fields. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Make RCU Tasks Trace stall warning handle idle offline tasksPaul E. McKenney1-1/+1
When a CPU is offline, its idle task can appear to be running, but it cannot be doing anything while CPU-hotplug operations are excluded. This commit takes advantage of that fact by making trc_check_slow_task() check for task_curr(t) && cpu_online(task_cpu(t)), and recording full information in that case. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Make RCU Tasks Trace stall warnings print full .b.need_qs fieldPaul E. McKenney1-2/+3
Currently, the RCU Tasks Trace CPU stall warning simply indicates whether or not the .b.need_qs field is zero. This commit shows the three permitted values and flags other values with either "!" or "?". This is a debugging aid. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Flag offline CPUs in RCU Tasks Trace stall warningsPaul E. McKenney1-2/+2
This commit tags offline CPUs with "(offline)" in RCU Tasks Trace CPU stall warnings. This is a debugging aid. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Add slow-IPI indicator to RCU Tasks Trace stall warningsPaul E. McKenney1-1/+2
This commit adds a "I" indicator to the RCU Tasks Trace CPU stall warning when an IPI directed to a task has thus far failed to arrive. This serves as a debugging aid. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Simplify trc_inspect_reader() QS logicPaul E. McKenney1-6/+7
Currently, trc_inspect_reader() does one check for nesting less than or equal to zero, then sorts out the distinctions within this single "if" statement. This commit simplifies the logic by providing one "if" statement for quiescent states (nesting of zero) and another "if" statement for transitioning from one nesting level to another or the outermost rcu_read_unlock_trace() (negative nesting). Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Make rcu_note_context_switch() unconditionally call rcu_tasks_qs()Paul E. McKenney1-1/+1
This commit makes rcu_note_context_switch() unconditionally invoke the rcu_tasks_qs() function, as opposed to doing so only when RCU (as opposed to RCU Tasks Trace) urgently needs a grace period to end. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: RCU Tasks Trace grace-period kthread has implicit QSPaul E. McKenney1-2/+3
Because the task driving the grace-period kthread is in quiescent state throughout, this commit excludes it from the list of tasks from which a quiescent state is needed. This does mean that attaching a sleepable BPF program to function in kernel/rcu/tasks.h is a bad idea, by the way. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Handle idle tasks for recently offlined CPUsPaul E. McKenney1-8/+7
This commit identifies idle tasks for recently offlined CPUs as residing in a quiescent state. This is safe only because CPU-hotplug operations are excluded during these checks. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Idle tasks on offline CPUs are in quiescent statesPaul E. McKenney1-1/+1
Any idle task corresponding to an offline CPU is in an RCU Tasks Trace quiescent state. This commit causes rcu_tasks_trace_postscan() to ignore idle tasks for offline CPUs, which it can do safely due to CPU-hotplug operations being disabled. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Make trc_read_check_handler() fetch ->trc_reader_nesting only oncePaul E. McKenney1-2/+4
This commit replaces the pair of READ_ONCE(t->trc_reader_nesting) calls with a single such call and a local variable. This makes the code's intent more clear. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Remove rcu_tasks_trace_postgp() wait for counterPaul E. McKenney1-59/+3
Now that tasks are not removed from the list until they have responded to any needed request for a quiescent state, it is no longer necessary to wait for the trc_n_readers_need_end counter to go to zero. This commit therefore removes that waiting code. It is therefore also no longer necessary for rcu_tasks_trace_postgp() to do the final decrement of this counter, so that code is also removed. This in turn means that trc_n_readers_need_end counter itself can be removed, as can the rcu_tasks_trace_iw irq_work structure and the rcu_read_unlock_iw() function. [ paulmck: Apply feedback from Zqiang. ] Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Merge state into .b.need_qs and atomically updatePaul E. McKenney1-32/+71
This commit gets rid of the task_struct structure's ->trc_reader_checked field, making it instead be a bit within the task_struct structure's existing ->trc_reader_special.b.need_qs field. This commit also atomically loads, stores, and checks the resulting combination of the reader-checked and need-quiescent state flags. This will in turn allow significant simplification of the rcu_tasks_trace_postgp() function as well as elimination of the trc_n_readers_need_end counter in later commits. These changes will in turn simplify later elimination of the RCU Tasks Trace scan of the task list, which will make RCU Tasks Trace grace periods less CPU-intensive. [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: KP Singh <[email protected]>
2022-06-20rcu-tasks: Drive synchronous grace periods from calling taskPaul E. McKenney1-8/+16
This commit causes synchronous grace periods to be driven from the task invoking synchronize_rcu_*(), allowing these functions to be invoked from the mid-boot dead zone extending from when the scheduler was initialized to to point that the various RCU tasks grace-period kthreads are spawned. This change will allow the self-tests to run in a consistent manner. Reported-by: Matthew Wilcox <[email protected]> Reported-by: Zhouyi Zhou <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-20rcu-tasks: Move synchronize_rcu_tasks_generic() downPaul E. McKenney1-11/+11
This is strictly a code-motion commit that moves the synchronize_rcu_tasks_generic() down to where it can invoke rcu_tasks_one_gp() without the need for a forward declaration. Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-20rcu-tasks: Split rcu_tasks_one_gp() from rcu_tasks_kthread()Paul E. McKenney1-22/+36
This commit abstracts most of the rcu_tasks_kthread() function's loop body into a new rcu_tasks_one_gp() function. It also introduces a new ->tasks_gp_mutex to synchronize concurrent calls to this new rcu_tasks_one_gp() function. This commit is preparation for allowing RCU tasks grace periods to be driven by the calling task during the mid-boot dead zone. Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-20rcu-tasks: Check for abandoned callbacksPaul E. McKenney1-0/+5
This commit adds a debugging scan for callbacks that got lost during a callback-queueing transition. Signed-off-by: Paul E. McKenney <[email protected]>
2022-06-20rcutorture: Validate get_completed_synchronize_rcu()Paul E. McKenney1-0/+6
This commit verifies that the RCU grace-period state cookie returned from get_completed_synchronize_rcu() causes poll_state_synchronize_rcu() to return true, as required. This commit is in preparation for polled expedited grace periods. Link: https://lore.kernel.org/all/[email protected]/ Link: https://docs.google.com/document/d/1RNKWW9jQyfjxw2E8dsXVTdvZYh0HnYeSHDKog9jhdN8/edit?usp=sharing Cc: Brian Foster <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Al Viro <[email protected]> Cc: Ian Kent <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>