2019-08-13  rcu/nocb: Reduce contention at no-CBs invocation-done time  (Paul E. McKenney; 1 file, -3/+4)
Currently, nocb_cb_wait() unconditionally acquires the leaf rcu_node ->lock to advance callbacks when done invoking the previous batch. It does this while holding ->nocb_lock, which means that contention on the leaf rcu_node ->lock visits itself on the ->nocb_lock. This commit therefore makes this lock acquisition conditional, forgoing callback advancement when the leaf rcu_node ->lock is not immediately available. (In this case, the no-CBs grace-period kthread will eventually do any needed callback advancement.) Signed-off-by: Paul E. McKenney <[email protected]>
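A minimal sketch of the conditional acquisition in nocb_cb_wait(), assuming the kernel's raw_spin_trylock_rcu_node() wrapper and the surrounding locals (rdp, rnp, flags, needwake_gp):

    local_irq_save(flags);
    rcu_nocb_lock(rdp);
    /* Advance callbacks only if the rcu_node ->lock is free; otherwise
     * leave the advancing to the no-CBs grace-period kthread. */
    if (raw_spin_trylock_rcu_node(rnp)) { /* irqs already disabled. */
            needwake_gp = rcu_advance_cbs(rnp, rdp);
            raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
    }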
2019-08-13  rcu/nocb: Reduce contention at no-CBs registry-time CB advancement  (Paul E. McKenney; 2 files, -5/+4)
Currently, __call_rcu_nocb_wake() conditionally acquires the leaf rcu_node structure's ->lock, and only afterwards does rcu_advance_cbs_nowake() check to see if it is possible to advance callbacks without potentially needing to awaken the grace-period kthread. Given that the no-awaken check can be done locklessly, this commit reverses the order, so that rcu_advance_cbs_nowake() is invoked without holding the leaf rcu_node structure's ->lock and rcu_advance_cbs_nowake() checks the grace-period state before conditionally acquiring that lock, thus reducing the number of needless acquisitions of the leaf rcu_node structure's ->lock. Signed-off-by: Paul E. McKenney <[email protected]>
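One plausible shape for the reordered helper; the exact lockless predicate (rcu_seq_state() applied to the current grace-period sequence) is an assumption consistent with the description above:

    static void rcu_advance_cbs_nowake(struct rcu_node *rnp,
                                       struct rcu_data *rdp)
    {
            /* Lockless check first: if no grace period is in progress,
             * advancing could require a wakeup, so do nothing. */
            if (!rcu_seq_state(rcu_seq_current(&rnp->gp_seq)) ||
                !raw_spin_trylock_rcu_node(rnp))
                    return;
            /* A grace period is in progress, so no wakeup is needed. */
            WARN_ON_ONCE(rcu_advance_cbs(rnp, rdp));
            raw_spin_unlock_rcu_node(rnp);
    }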
2019-08-13  rcu/nocb: Round down for number of no-CBs grace-period kthreads  (Paul E. McKenney; 1 file, -1/+1)
Currently, when the square root of the number of CPUs is rounded down by int_sqrt(), this round-down is applied to the number of callback kthreads per grace-period kthread. This makes almost no difference for large systems, but results in oddities such as three no-CBs grace-period kthreads for a five-CPU system, which is a bit excessive. This commit therefore causes the round-down to apply to the number of no-CBs grace-period kthreads, so that systems with four to eight CPUs have only two no-CBs grace-period kthreads. Signed-off-by: Paul E. McKenney <[email protected]>
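A sketch of the adjusted stride computation in rcu_organize_nocb_kthreads(); the DIV_ROUND_UP() formulation is an assumption that matches the four-to-eight-CPU behavior described above:

    int ls = rcu_nocb_gp_stride;

    if (ls == -1) {
            /* Round down the grace-period-kthread count, then derive
             * the group size: 5 CPUs gives int_sqrt(5) = 2 GP kthreads
             * and a stride of DIV_ROUND_UP(5, 2) = 3 CPUs per group. */
            ls = DIV_ROUND_UP(nr_cpu_ids, int_sqrt(nr_cpu_ids));
            rcu_nocb_gp_stride = ls;
    }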
2019-08-13  rcu/nocb: Avoid ->nocb_lock capture by corresponding CPU  (Paul E. McKenney; 2 files, -24/+62)
A given rcu_data structure's ->nocb_lock can be acquired very frequently by the corresponding CPU and occasionally by the corresponding no-CBs grace-period and callbacks kthreads. In particular, these two kthreads will have frequent gaps between ->nocb_lock acquisitions that are roughly a grace period in duration. This means that any excessive ->nocb_lock contention will be due to the CPU's acquisitions, and this in turn enables a very naive contention-avoidance strategy to be quite effective. This commit therefore modifies rcu_nocb_lock() to first attempt a raw_spin_trylock(), and to atomically increment a separate ->nocb_lock_contended across a raw_spin_lock(). This new ->nocb_lock_contended field is checked in __call_rcu_nocb_wake() when interrupts are enabled, with a spin-wait for contending acquisitions to complete, thus allowing the kthreads a chance to acquire the lock. Signed-off-by: Paul E. McKenney <[email protected]>
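A sketch of the resulting acquisition path (field and function names from the commit text; the barrier placement is an assumption):

    static void rcu_nocb_lock(struct rcu_data *rdp)
    {
            /* Fast path: the common, uncontended case. */
            if (raw_spin_trylock(&rdp->nocb_lock))
                    return;
            /* Slow path: advertise the contention so that
             * __call_rcu_nocb_wake() can spin-wait with interrupts
             * enabled, giving the kthreads a chance at the lock. */
            atomic_inc(&rdp->nocb_lock_contended);
            smp_mb__after_atomic(); /* Count visible before spinning. */
            raw_spin_lock(&rdp->nocb_lock);
            atomic_dec(&rdp->nocb_lock_contended);
    }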
2019-08-13  rcu/nocb: Avoid needless wakeups of no-CBs grace-period kthread  (Paul E. McKenney; 2 files, -4/+24)
Currently, the code provides an extra wakeup for the no-CBs grace-period kthread if one of its CPUs is generating excessive numbers of callbacks. But satisfying though it is to wake something up when things are going south, unless the thing being awakened can actually help solve the problem, that extra wakeup does nothing but consume additional CPU time, which is exactly what you don't want during a call_rcu() flood. This commit therefore avoids doing anything if the corresponding no-CBs callback kthread is going full tilt. Otherwise, if advancing callbacks immediately might help and if the leaf rcu_node structure's lock is immediately available, this commit invokes a new variant of rcu_advance_cbs() that advances callbacks only if doing so won't require awakening the grace-period kthread (not to be confused with any of the no-CBs grace-period kthreads). Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Make __call_rcu_nocb_wake() safe for many callbacks  (Paul E. McKenney; 1 file, -1/+1)
It might be hard to imagine having more than two billion callbacks queued on a single CPU's ->cblist, but someone will do it sometime. This commit therefore makes __call_rcu_nocb_wake() handle this situation by upgrading local variable "len" from "int" to "long". Signed-off-by: Paul E. McKenney <[email protected]>
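The fix amounts to widening a single declaration:

    long len = rcu_segcblist_n_cbs(&rdp->cblist); /* Was "int len". */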
2019-08-13  rcu/nocb: Never downgrade ->nocb_defer_wakeup in wake_nocb_gp_defer()  (Paul E. McKenney; 1 file, -1/+2)
Currently, wake_nocb_gp_defer() simply stores whatever waketype was passed in, which can result in an RCU_NOCB_WAKE_FORCE being downgraded to RCU_NOCB_WAKE, which could in turn delay callback processing. This commit therefore adds a check so that wake_nocb_gp_defer() only updates ->nocb_defer_wakeup when the update increases the forcefulness, thus avoiding downgrades. Signed-off-by: Paul E. McKenney <[email protected]>
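The added check is a one-liner, sketched here in wake_nocb_gp_defer():

    /* Only strengthen the deferred-wakeup type, so that a later
     * RCU_NOCB_WAKE cannot overwrite a pending RCU_NOCB_WAKE_FORCE. */
    if (rdp->nocb_defer_wakeup < waketype)
            WRITE_ONCE(rdp->nocb_defer_wakeup, waketype);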
2019-08-13  rcu/nocb: Enable re-awakening under high callback load  (Paul E. McKenney; 1 file, -2/+2)
The __call_rcu_nocb_wake() function and its predecessors set ->qlen_last_fqs_check to zero for the first callback and to LONG_MAX / 2 for forced reawakenings. The former can result in a too-quick reawakening when there are many callbacks ready to invoke and the latter prevents a second reawakening. This commit therefore sets ->qlen_last_fqs_check to the current number of callbacks in both cases. While in the area, this commit also moves both assignments under ->nocb_lock. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nohz: Turn off tick for offloaded CPUs  (Paul E. McKenney; 1 file, -7/+9)
Historically, no-CBs CPUs allowed the scheduler-clock tick to be unconditionally disabled on any transition to idle or nohz_full userspace execution (see the rcu_needs_cpu() implementations). Unfortunately, the checks used by rcu_needs_cpu() are defeated now that no-CBs CPUs use ->cblist, which might make users of battery-powered devices rather unhappy. This commit therefore adds explicit rcu_segcblist_is_offloaded() checks to return to the historical energy-efficient semantics. Signed-off-by: Paul E. McKenney <[email protected]>
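A sketch of the restored semantics, using the CONFIG_RCU_FAST_NO_HZ=n form of rcu_needs_cpu() from that era (treat the body as illustrative):

    int rcu_needs_cpu(u64 basemono, u64 *nextevt)
    {
            *nextevt = KTIME_MAX;
            /* An offloaded CPU never needs the tick on RCU's behalf:
             * its callbacks are handled by the rcuo kthreads. */
            return !rcu_segcblist_empty(&this_cpu_ptr(&rcu_data)->cblist) &&
                   !rcu_segcblist_is_offloaded(&this_cpu_ptr(&rcu_data)->cblist);
    }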
2019-08-13  rcu/nocb: Suppress uninitialized false-positive in nocb_gp_wait()  (Paul E. McKenney; 1 file, -2/+2)
Some compilers complain that wait_gp_seq might be used uninitialized in nocb_gp_wait(). This cannot actually happen because when wait_gp_seq is uninitialized, needwait_gp must be false, which prevents wait_gp_seq from being used. But this analysis is apparently beyond some compilers, so this commit adds a bogus initialization of wait_gp_seq for the sole purpose of suppressing the false-positive warning. Signed-off-by: Paul E. McKenney <[email protected]>
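The entire fix is a dead-store initializer:

    bool needwait_gp = false;
    unsigned long wait_gp_seq = 0; /* Value unused; suppresses bogus
                                    * "used uninitialized" warnings. */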
2019-08-13  rcu/nocb: Use build-time no-CBs check in rcu_pending()  (Paul E. McKenney; 1 file, -1/+2)
Currently, rcu_pending() invokes rcu_segcblist_is_offloaded() even in CONFIG_RCU_NOCB_CPU=n kernels, which cannot possibly be offloaded. Given that rcu_pending() is on a fastpath, it makes sense to check for CONFIG_RCU_NOCB_CPU=y before invoking rcu_segcblist_is_offloaded(). This commit therefore makes this change. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Use build-time no-CBs check in rcu_core()  (Paul E. McKenney; 1 file, -4/+4)
Currently, rcu_core() invokes rcu_segcblist_is_offloaded() each time it needs to know whether the current CPU is a no-CBs CPU. Given that it is not possible to change the no-CBs status of a CPU after boot, and given that it is not possible to even have no-CBs CPUs in CONFIG_RCU_NOCB_CPU=n kernels, this repeated runtime invocation wastes CPU. This commit therefore creates a const on-stack variable to allow this check to be done only once per rcu_core() invocation. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Use build-time no-CBs check in rcu_do_batch()  (Paul E. McKenney; 1 file, -5/+5)
Currently, rcu_do_batch() invokes rcu_segcblist_is_offloaded() each time it needs to know whether the current CPU is a no-CBs CPU. Given that it is not possible to change the no-CBs status of a CPU after boot, and given that it is not possible to even have no-CBs CPUs in CONFIG_RCU_NOCB_CPU=n kernels, this per-callback invocation wastes CPU. This commit therefore creates a const on-stack variable to allow this check to be done only once per rcu_do_batch() invocation. Signed-off-by: Paul E. McKenney <[email protected]>
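The pattern shared by the three commits above, sketched for rcu_do_batch(): the IS_ENABLED() test is a compile-time constant, so CONFIG_RCU_NOCB_CPU=n kernels drop the runtime check entirely:

    const bool offloaded = IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
                           rcu_segcblist_is_offloaded(&rdp->cblist);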
2019-08-13  rcu/nocb: Remove obsolete nocb_gp_head and nocb_gp_tail fields  (Paul E. McKenney; 1 file, -4/+2)
Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Remove obsolete nocb_cb_tail and nocb_cb_head fields  (Paul E. McKenney; 2 files, -3/+0)
Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Remove obsolete nocb_q_count and nocb_q_count_lazy fields  (Paul E. McKenney; 3 files, -20/+3)
This commit removes the obsolete nocb_q_count and nocb_q_count_lazy fields, also removing rcu_get_n_cbs_nocb_cpu(), adjusting rcu_get_n_cbs_cpu(), and making rcutree_migrate_callbacks() once again disable the ->cblist fields of offline CPUs. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Remove obsolete nocb_head and nocb_tail fields  (Paul E. McKenney; 2 files, -4/+0)
Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Use rcu_segcblist for no-CBs CPUs  (Paul E. McKenney; 6 files, -384/+270)
Currently the RCU callbacks for no-CBs CPUs are queued on a series of ad-hoc linked lists, which means that these callbacks cannot benefit from "drive-by" grace periods, thus suffering needless delays prior to invocation. In addition, the no-CBs grace-period kthreads first wait for callbacks to appear and later wait for a new grace period, which means that callbacks appearing during a grace-period wait can be delayed. These delays increase memory footprint, and could even result in an out-of-memory condition. This commit therefore enqueues RCU callbacks from no-CBs CPUs on the rcu_segcblist structure that is already used by non-no-CBs CPUs. It also restructures the no-CBs grace-period kthread to check for incoming callbacks while waiting for grace periods. Also, instead of waiting for a new grace period, it waits for the closest grace period that will cause some of the callbacks to be safe to invoke. All of these changes reduce callback latency and thus the number of outstanding callbacks, in turn reducing the probability of an out-of-memory condition. Signed-off-by: Paul E. McKenney <[email protected]>
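A sketch of the "closest grace period" scan in the restructured grace-period kthread, assuming the rcu_segcblist_nextgp() accessor introduced by this series:

    unsigned long cur_gp_seq;
    unsigned long wait_gp_seq = 0;
    bool needwait_gp = false;

    /* For each rcu_data structure in this kthread's group: track the
     * soonest grace period after which some already-queued callbacks
     * become safe to invoke. */
    if (rcu_segcblist_nextgp(&rdp->cblist, &cur_gp_seq) &&
        (!needwait_gp || ULONG_CMP_LT(cur_gp_seq, wait_gp_seq))) {
            wait_gp_seq = cur_gp_seq;
            needwait_gp = true;
    }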
2019-08-13  rcu/nocb: Leave ->cblist enabled for no-CBs CPUs  (Paul E. McKenney; 5 files, -35/+11)
As a first step towards making no-CBs CPUs use the ->cblist, this commit leaves the ->cblist enabled for these CPUs. The main reason to make no-CBs CPUs use ->cblist is to take advantage of callback numbering, which will reduce the effects of missed grace periods which in turn will reduce forward-progress problems for no-CBs CPUs. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Allow lockless use of rcu_segcblist_empty()  (Paul E. McKenney; 2 files, -3/+3)
Currently, rcu_segcblist_empty() assumes that the callback list is not being changed by other CPUs, but upcoming changes will require it to operate locklessly. This commit therefore adds the needed READ_ONCE() call, along with the WRITE_ONCE() calls when updating the callback list's ->head field. Signed-off-by: Paul E. McKenney <[email protected]>
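The lockless-safe predicate then reads:

    static inline bool rcu_segcblist_empty(struct rcu_segcblist *rsclp)
    {
            return !READ_ONCE(rsclp->head); /* Pairs with WRITE_ONCE() updates. */
    }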
2019-08-13  rcu/nocb: Allow lockless use of rcu_segcblist_restempty()  (Paul E. McKenney; 2 files, -16/+16)
Currently, rcu_segcblist_restempty() assumes that the callback list is not being changed by other CPUs, but upcoming changes will require it to operate locklessly. This commit therefore adds the needed READ_ONCE() calls, along with the WRITE_ONCE() calls when updating the callback list. Signed-off-by: Paul E. McKenney <[email protected]>
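Here both the ->tails[] slot and the pointer it references must be sampled locklessly:

    static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp,
                                               int seg)
    {
            /* Two lockless loads: the tail-pointer slot itself, then
             * the ->next pointer that the slot addresses. */
            return !READ_ONCE(*READ_ONCE(rsclp->tails[seg]));
    }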
2019-08-13  rcu/nocb: Remove deferred wakeup checks for extended quiescent states  (Paul E. McKenney; 1 file, -10/+0)
The idea behind the checks for extended quiescent states at the end of __call_rcu_nocb() is to handle cases where call_rcu() is invoked directly from within an extended quiescent state, for example, from the idle loop. However, this will result in a timer-mediated deferred wakeup, which will cause the needed wakeup to happen within a jiffy or thereabouts. There should be no forward-progress concerns, and if there are, the proper response is to exit the extended quiescent state while executing the endless blast of call_rcu() invocations, for example, using RCU_NONIDLE(). Given the more realistic case of an isolated call_rcu() invocation, there should be no problem. This commit therefore removes the checks for invoking call_rcu() within an extended quiescent state on no-CBs CPUs. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Check for deferred nocb wakeups before nohz_full early exit  (Paul E. McKenney; 1 file, -4/+4)
In theory, a timer is used to defer wakeups of no-CBs grace-period kthreads when the wakeup cannot be done safely directly from the call_rcu(). In practice, the one-jiffy delay is not always consistent with timely callback invocation under heavy call_rcu() loads. Therefore, there are a number of checks for a pending deferred wakeup, including from the scheduling-clock interrupt. Unfortunately, this check follows the rcu_nohz_full_cpu() early exit, which renders it useless on such CPUs. This commit therefore moves the check for the pending deferred no-CB wakeup to precede the rcu_nohz_full_cpu() early exit. Signed-off-by: Paul E. McKenney <[email protected]>
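A sketch of the reordering in rcu_pending(), with the surrounding shape assumed:

    /* The deferred-wakeup check now precedes the nohz_full early
     * exit... */
    if (rcu_nocb_need_deferred_wakeup(rdp))
            return 1;

    /* ...which previously returned before the check could run. */
    if (rcu_nohz_full_cpu())
            return 0;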
2019-08-13  rcu/nocb: Make rcutree_migrate_callbacks() start at leaf rcu_node structure  (Paul E. McKenney; 1 file, -5/+6)
Because rcutree_migrate_callbacks() is invoked infrequently and because an exact snapshot of the grace-period state might save some callbacks a second trip through a grace period, this function has used the root rcu_node structure. However, this safe-second-trip optimization happens only if rcutree_migrate_callbacks() races with grace-period initialization, so it is not worth the added mental load. This commit therefore makes rcutree_migrate_callbacks() start with the leaf rcu_node structures, as is done elsewhere. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Add checks for offloaded callback processing  (Paul E. McKenney; 1 file, -3/+8)
This commit is a preparatory patch for offloaded callbacks using the same ->cblist structure used by non-offloaded callbacks. It therefore adds rcu_segcblist_is_offloaded() calls where they will be needed when !rcu_segcblist_is_enabled() no longer flags the offloaded case. It also adds checks in rcu_do_batch() to ensure that there are no missed checks: Currently, it should not be possible for offloaded execution to reach rcu_do_batch(), though this will change later in this series. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Use separate flag to indicate offloaded ->cblist  (Paul E. McKenney; 5 files, -8/+33)
RCU callback processing currently uses rcu_is_nocb_cpu() to determine whether or not the current CPU's callbacks are to be offloaded. This works, but it is not so good for cache locality. Plus use of ->cblist for offloaded callbacks will greatly increase the frequency of these checks. This commit therefore adds a ->offloaded flag to the rcu_segcblist structure to provide a more flexible and cache-friendly means of checking for callback offloading. Signed-off-by: Paul E. McKenney <[email protected]>
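A sketch of the flag and its accessor; the surrounding field layout is assumed from that era's rcu_segcblist and includes the ->enabled flag from the next (chronologically earlier) commit below:

    struct rcu_segcblist {
            struct rcu_head *head;
            struct rcu_head **tails[RCU_CBLIST_NSEGS];
            unsigned long gp_seq[RCU_CBLIST_NSEGS];
            long len;
            long len_lazy;
            u8 enabled;   /* Replaces the RCU_NEXT_TAIL NULL check. */
            u8 offloaded; /* Callbacks handled by rcuo kthreads? */
    };

    static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
    {
            return rsclp->offloaded;
    }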
2019-08-13  rcu/nocb: Use separate flag to indicate disabled ->cblist  (Paul E. McKenney; 4 files, -3/+8)
NULLing the RCU_NEXT_TAIL pointer was a clever way to save a byte, but forward-progress considerations would require that this pointer be both NULL and non-NULL, which, absent a quantum-computer port of the Linux kernel, simply won't happen. This commit therefore creates a separate ->enabled flag to replace the current NULL checks. [ paulmck: Add include files per 0day test robot and -next. ] Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Print gp/cb kthread hierarchy if dump_tree  (Paul E. McKenney; 1 file, -0/+6)
This commit causes the no-CBs grace-period/callback hierarchy to be printed to the console when the dump_tree kernel boot parameter is set. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename rcu_nocb_leader_stride kernel boot parameter  (Paul E. McKenney; 2 files, -10/+11)
This commit changes the name of the rcu_nocb_leader_stride kernel boot parameter to rcu_nocb_gp_stride in order to account for the new distinction between callback and grace-period no-CBs kthreads. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename and document no-CB CB kthread sleep trace event  (Paul E. McKenney; 2 files, -2/+3)
The nocb_cb_wait() function traces a "FollowerSleep" trace_rcu_nocb_wake() event, which never was documented and is now misleading. This commit therefore changes "FollowerSleep" to "CBSleep", documents this, and updates the documentation for "Sleep" as well. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename rcu_organize_nocb_kthreads() local variable  (Paul E. McKenney; 1 file, -3/+3)
This commit renames rdp_leader to rdp_gp in order to account for the new distinction between callback and grace-period no-CBs kthreads. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename wake_nocb_leader_defer() to wake_nocb_gp_defer()  (Paul E. McKenney; 1 file, -6/+6)
This commit adjusts naming to account for the new distinction between callback and grace-period no-CBs kthreads. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename __wake_nocb_leader() to __wake_nocb_gp()  (Paul E. McKenney; 1 file, -9/+9)
This commit adjusts naming to account for the new distinction between callback and grace-period no-CBs kthreads. While in the area, it also updates local variables. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename wake_nocb_leader() to wake_nocb_gp()  (Paul E. McKenney; 1 file, -3/+3)
This commit adjusts naming to account for the new distinction between callback and grace-period no-CBs kthreads. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename nocb_follower_wait() to nocb_cb_wait()  (Paul E. McKenney; 1 file, -2/+2)
This commit adjusts naming to account for the new distinction between callback and grace-period no-CBs kthreads. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Provide separate no-CBs grace-period kthreads  (Paul E. McKenney; 2 files, -60/+61)
Currently, there is one no-CBs rcuo kthread per CPU, and these kthreads are divided into groups. The first rcuo kthread to come online in a given group is that group's leader, and the leader both waits for grace periods and invokes its CPU's callbacks. The non-leader rcuo kthreads only invoke callbacks. This works well in the real-time/embedded environments for which it was intended because such environments tend not to generate all that many callbacks. However, given huge floods of callbacks, it is possible for the leader kthread to be stuck invoking callbacks while its followers wait helplessly as their callbacks pile up. This is a good recipe for an OOM, and rcutorture's new callback-flood capability does generate such OOMs. One strategy would be to wait until such OOMs start happening in production, but similar OOMs have in fact happened starting in 2018. It would therefore be wise to take a more proactive approach.

This commit therefore features per-CPU rcuo kthreads that do nothing but invoke callbacks. Instead of having one of these kthreads act as leader, each group has a separate rcuog kthread that handles grace periods for its group. Because these rcuog kthreads do not invoke callbacks, callback floods on one CPU no longer block callbacks from reaching the rcuc callback-invocation kthreads on other CPUs. This change does introduce additional kthreads, however:

1. The number of additional kthreads is about the square root of the number of CPUs, so that a 4096-CPU system would have only about 64 additional kthreads. Note that recent changes decreased the number of rcuo kthreads by a factor of two (CONFIG_PREEMPT=n) or even three (CONFIG_PREEMPT=y), so this still represents a significant improvement on most systems.

2. The leading "rcuo" of the rcuog kthreads should allow existing scripting to affinity these additional kthreads as needed, the same as for the rcuop and rcuos kthreads. (There are no longer any rcuob kthreads.)

3. A state-machine approach was considered and rejected. Although this would allow the rcuo kthreads to continue their dual leader/follower roles, it complicates callback invocation and makes it more difficult to consolidate rcuo callback invocation with existing softirq callback invocation.

The introduction of rcuog kthreads should thus be acceptable. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Update comments to prepare for forward-progress work  (Paul E. McKenney; 2 files, -32/+33)
This commit simply rewords comments to prepare for leader nocb kthreads doing only grace-period work and callback shuffling. This will mean the addition of replacement kthreads to invoke callbacks. The "leader" and "follower" thus become less meaningful, so the commit changes no-CB comments with these strings to "GP" and "CB", respectively. (Give or take the usual grammatical transformations.) Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  rcu/nocb: Rename rcu_data fields to prepare for forward-progress work  (Paul E. McKenney; 2 files, -46/+46)
This commit simply renames rcu_data fields to prepare for leader nocb kthreads doing only grace-period work and callback shuffling. This will mean the addition of replacement kthreads to invoke callbacks. The "leader" and "follower" thus become less meaningful, so the commit changes no-CB fields with these strings to "gp" and "cb", respectively. Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  Merge branches 'consolidate.2019.08.01b', 'fixes.2019.08.12a', 'lists.2019.08.13a' and 'torture.2019.08.01b' into HEAD  (Paul E. McKenney; 33 files, -139/+351)
consolidate.2019.08.01b: Further consolidation cleanups
fixes.2019.08.12a: Miscellaneous fixes
lists.2019.08.13a: Optional lockdep arguments for RCU list macros
torture.2019.08.01b: Torture-test updates
2019-08-13  acpi: Use built-in RCU list checking for acpi_ioremaps list  (Joel Fernandes (Google); 1 file, -2/+4)
This commit applies the consolidated list_for_each_entry_rcu() support for lockdep conditions. Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-13  x86/pci: Pass lockdep condition to pcm_mmcfg_list iterator  (Joel Fernandes (Google); 1 file, -2/+3)
The pci_mmcfg_list is traversed by list_for_each_entry_rcu() outside of an RCU read-side critical section, which is safe because the pci_mmcfg_lock is held. This commit therefore adds a lockdep expression to list_for_each_entry_rcu() in order to avoid lockdep warnings. Acked-by: Bjorn Helgaas <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
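A sketch of the resulting traversal; the commit's exact lockdep expression is an assumption, with lockdep_is_held() being the generic form:

    struct pci_mmcfg_region *cfg;

    list_for_each_entry_rcu(cfg, &pci_mmcfg_list, list,
                            lockdep_is_held(&pci_mmcfg_lock)) {
            /* Safe without rcu_read_lock(): pci_mmcfg_lock is held. */
    }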
2019-08-13  driver/core: Convert to use built-in RCU list checking  (Joel Fernandes (Google); 3 files, -5/+23)
This commit applies the consolidated hlist_for_each_entry_rcu() support for lockdep conditions. Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-12  MAINTAINERS: Update e-mail address for Andrea Parri  (Andrea Parri; 1 file, -1/+1)
My @amarulasolutions.com address stopped working this July, so update to my @gmail.com address where you'll still be able to reach me. Signed-off-by: Andrea Parri <[email protected]> Cc: Alan Stern <[email protected]> Cc: Will Deacon <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: David Howells <[email protected]> Cc: Jade Alglave <[email protected]> Cc: Luc Maranget <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Akira Yokosawa <[email protected]> Cc: Daniel Lustig <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-12  rcu: Fix spelling mistake "greate"->"great"  (Mukesh Ojha; 1 file, -1/+1)
This commit fixes a spelling mistake in file tree_exp.h. Signed-off-by: Mukesh Ojha <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-12  arm: Use common outgoing-CPU-notification code  (Paul E. McKenney; 1 file, -4/+2)
This commit replaces the open-coded CPU-offline notification with the new common code. In particular, this change avoids calling scheduler code using RCU from an offline CPU that RCU is ignoring. This is a minimal change. A more intrusive change might invoke the cpu_check_up_prepare() and cpu_set_state_online() functions at CPU-online time, which would allow onlining to throw an error if the CPU did not go offline properly. Signed-off-by: Paul E. McKenney <[email protected]> Cc: [email protected] Cc: Russell King <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Dietmar Eggemann <[email protected]>
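A sketch of the common pattern being adopted (helper names from the kernel's existing common code; the arm call sites are paraphrased):

    /* Dying CPU, at the end of the arch's offline path: */
    (void)cpu_report_death(); /* Tell the surviving CPU we are gone. */

    /* Surviving CPU, in the arch's __cpu_die(): */
    if (!cpu_wait_death(cpu, 5))
            pr_err("CPU%u: didn't die within five seconds\n", cpu);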
2019-08-12  rcu: Remove redundant "if" condition from rcu_gp_is_expedited()  (Paul E. McKenney; 1 file, -2/+1)
Because rcu_expedited_nesting is initialized to 1 and not decremented until just before init is spawned, rcu_expedited_nesting is guaranteed to be non-zero whenever rcu_scheduler_active == RCU_SCHEDULER_INIT. This commit therefore removes this redundant "if" equality test. Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Joel Fernandes (Google) <[email protected]>
2019-08-12  idle: Prevent late-arriving interrupts from disrupting offline  (Peter Zijlstra; 1 file, -2/+3)
Scheduling-clock interrupts can arrive late in the CPU-offline process, after idle entry and the subsequent call to cpuhp_report_idle_dead(). Once execution passes the call to rcu_report_dead(), RCU is ignoring the CPU, which results in lockdep complaints when the interrupt handler uses RCU:

------------------------------------------------------------------------
=============================
WARNING: suspicious RCU usage
5.2.0-rc1+ #681 Not tainted
-----------------------------
kernel/sched/fair.c:9542 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

RCU used illegally from offline CPU!
rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/5/0.

stack backtrace:
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 5.2.0-rc1+ #681
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Bochs 01/01/2011
Call Trace:
<IRQ>
dump_stack+0x5e/0x8b
trigger_load_balance+0xa8/0x390
? tick_sched_do_timer+0x60/0x60
update_process_times+0x3b/0x50
tick_sched_handle+0x2f/0x40
tick_sched_timer+0x32/0x70
__hrtimer_run_queues+0xd3/0x3b0
hrtimer_interrupt+0x11d/0x270
? sched_clock_local+0xc/0x74
smp_apic_timer_interrupt+0x79/0x200
apic_timer_interrupt+0xf/0x20
</IRQ>
RIP: 0010:delay_tsc+0x22/0x50
Code: ff 0f 1f 80 00 00 00 00 65 44 8b 05 18 a7 11 48 0f ae e8 0f 31 48 89 d6 48 c1 e6 20 48 09 c6 eb 0e f3 90 65 8b 05 fe a6 11 48 <41> 39 c0 75 18 0f ae e8 0f 31 48 c1 e2 20 48 09 c2 48 89 d0 48 29
RSP: 0000:ffff8f92c0157ed0 EFLAGS: 00000212 ORIG_RAX: ffffffffffffff13
RAX: 0000000000000005 RBX: ffff8c861f356400 RCX: ffff8f92c0157e64
RDX: 000000321214c8cc RSI: 00000032120daa7f RDI: 0000000000260f15
RBP: 0000000000000005 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: ffff8c861ee18000 R15: ffff8c861ee18000
cpuhp_report_idle_dead+0x31/0x60
do_idle+0x1d5/0x200
? _raw_spin_unlock_irqrestore+0x2d/0x40
cpu_startup_entry+0x14/0x20
start_secondary+0x151/0x170
secondary_startup_64+0xa4/0xb0
------------------------------------------------------------------------

This happens rarely, but can be forced to happen more often by placing delays in cpuhp_report_idle_dead() following the call to rcu_report_dead(). With this in place, the following rcutorture scenario reproduces the problem within a few minutes:

tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 8 --duration 5 --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --configs "TREE04"

This commit uses the crude but effective expedient of moving the disabling of interrupts within the idle loop to precede the cpu_is_offline() check. It also invokes tick_nohz_idle_stop_tick() instead of tick_nohz_idle_stop_tick_protected() to shut off the scheduling-clock interrupt. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> [ paulmck: Revert tick_nohz_idle_stop_tick_protected() removal, new callers. ] Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-09  ipv4: Add lockdep condition to fix for_each_entry()  (Joel Fernandes (Google); 1 file, -1/+2)
This commit applies the consolidated list_for_each_entry_rcu() support for lockdep conditions. Acked-by: David S. Miller <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2019-08-09  rcu/sync: Remove custom check for RCU readers  (Joel Fernandes (Google); 1 file, -3/+1)
The rcu/sync code currently does a special check for being in an RCU read-side critical section. With RCU consolidating flavors and the generic helper added earlier in this series, this check is no longer needed. This commit switches to the generic helper, saving a couple of lines of code. Cc: Oleg Nesterov <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
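A sketch of the switch, assuming the rcu_read_lock_any_held() helper added by the next entry below:

    static inline bool rcu_sync_is_idle(struct rcu_sync *rsp)
    {
            RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
                             "suspicious rcu_sync_is_idle() usage");
            return !READ_ONCE(rsp->gp_state); /* GP_IDLE */
    }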
2019-08-09  rcu: Add support for consolidated-RCU reader checking  (Joel Fernandes (Google); 4 files, -38/+108)
This commit adds RCU-reader checks to list_for_each_entry_rcu() and hlist_for_each_entry_rcu(). These checks are optional, and are indicated by a lockdep expression passed to a new optional argument to these two macros. If this optional lockdep expression is omitted, these two macros act as before, checking for an RCU read-side critical section. Signed-off-by: Joel Fernandes (Google) <[email protected]> [ paulmck: Update to eliminate return within macro and update comment. ] Signed-off-by: Paul E. McKenney <[email protected]>
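Hypothetical usage of the new optional argument (struct foo, mylist, and mylock are illustrative names, not from the commit):

    struct foo {
            struct list_head node;
            int val;
    };

    static LIST_HEAD(mylist);
    static DEFINE_SPINLOCK(mylock);

    /* Update side: protected by mylock rather than rcu_read_lock(),
     * so pass a lockdep expression as the new fourth argument. */
    static void scan_locked(void)
    {
            struct foo *p;

            lockdep_assert_held(&mylock);
            list_for_each_entry_rcu(p, &mylist, node,
                                    lockdep_is_held(&mylock))
                    p->val++;
    }

    /* Reader side: omit the argument and the macro checks for an RCU
     * read-side critical section, exactly as before. */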