aboutsummaryrefslogtreecommitdiff
path: root/kernel/time
AgeCommit message (Collapse)AuthorFilesLines
2013-04-26nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks configFrederic Weisbecker1-1/+3
Turn the full dynticks passive dependency on VIRT_CPU_ACCOUNTING_GEN to an active one. The full dynticks Kconfig is currently hidden behind the full dynticks cputime accounting, which is an awkward and counter-intuitive layout: the user first has to select the dynticks cputime accounting in order to make the full dynticks feature to be visible. We definetly want it the other way around. The usual way to perform this kind of active dependency is use "select" on the depended target. Now we can't use the Kconfig "select" instruction when the target is a "choice". So this patch inspires on how the RCU subsystem Kconfig interact with its dependencies on SMP and PREEMPT: we make sure that cputime accounting can't propose another option than VIRT_CPU_ACCOUNTING_GEN when NO_HZ_FULL is selected by using the right "depends on" instruction for each cputime accounting choices. v2: Keep full dynticks cputime accounting available even without full dynticks, as per Paul McKenney's suggestion. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-26nohz: Reduce overhead under high-freq idling patternsIngo Molnar1-3/+4
One testbox of mine (Intel Nehalem, 16-way) uses MWAIT for its idle routine, which apparently can break out of its idle loop rather frequently, with high frequency. In that case NO_HZ_FULL=y kernels show high ksoftirqd overhead and constant context switching, because tick_nohz_stop_sched_tick() will, if delta_jiffies == 0, mis-identify this as a timer event - activating the TIMER_SOFTIRQ, which wakes up ksoftirqd. Fix this by treating delta_jiffies == 0 the same way we treat other short wakeups, delta_jiffies == 1. Cc: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2013-04-25clockevents: Set dummy handler on CPU_DEAD shutdownThomas Gleixner2-0/+5
Vitaliy reported that a per cpu HPET timer interrupt crashes the system during hibernation. What happens is that the per cpu HPET timer gets shut down when the nonboot cpus are stopped. When the nonboot cpus are onlined again the HPET code sets up the MSI interrupt which fires before the clock event device is registered. The event handler is still set to hrtimer_interrupt, which then crashes the machine due to highres mode not being active. See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700333 There is no real good way to avoid that in the HPET code. The HPET code alrady has a mechanism to detect spurious interrupts when event handler == NULL for a similar reason. We can handle that in the clockevent/tick layer and replace the previous functional handler with a dummy handler like we do in tick_setup_new_device(). The original clockevents code did this in clockevents_exchange_device(), but that got removed by commit 7c1e76897 (clockevents: prevent clockevent event_handler ending up handler_noop) which forgot to fix it up in tick_shutdown(). Same issue with the broadcast device. Reported-by: Vitaliy Fillipov <[email protected]> Cc: Ben Hutchings <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-24Merge branch 'linus' into timers/coreThomas Gleixner1-1/+2
Reason: Get upstream fixes before adding conflicting code. Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-24nohz: Remove full dynticks' superfluous dependency on RCU treeFrederic Weisbecker1-2/+0
Remove the dependency on (TREE_RCU || TREE_PREEMPT_RCU). The full dynticks option already depends on SMP which implies (whatever flavour of) RCU tree config anyway. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-22nohz: Add basic tracingFrederic Weisbecker1-4/+15
It's not obvious to find out why the full dynticks subsystem doesn't always stop the tick: whether this is due to kthreads, posix timers, perf events, etc... These new tracepoints are here to help the user diagnose the failures and test this feature. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-22nohz: Select wide RCU nocb for full dynticksFrederic Weisbecker1-0/+1
It makes testing and implementation much easier as we know in advance that all CPUs are RCU nocbs. Also this prepares to remove the dynamic check for nohz_full= boot mask to be a subset of rcu_nocbs= Eventually this should also help removing the requirement for the boot CPU to be outside the full dynticks range. Suggested-by: Christoph Lameter <[email protected]> Suggested-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-22nohz: Re-evaluate the tick for the new task after a context switchFrederic Weisbecker1-0/+20
When a task is scheduled in, it may have some properties of its own that could make the CPU reconsider the need for the tick: posix cpu timers, perf events, ... So notify the full dynticks subsystem when a task gets scheduled in and re-check the tick dependency at this stage. This is done through a self IPI to avoid messing up with any current lock scenario. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-22nohz: Prepare to stop the tick on irq exitFrederic Weisbecker1-6/+25
Interrupt exit is a natural place to stop the tick: it happens after all events happening before and during the irq which are liable to update the dependency on the tick occured. Also it makes sure that any check on tick dependency is well ordered against dynticks kick IPIs. Bring in the infrastructure that performs the tick dependency checks on irq exit and shut it down if these checks show that we can do it safely. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-22nohz: Implement full dynticks kickFrederic Weisbecker1-4/+38
Implement the full dynticks kick that is performed from IPIs sent by various subsystems (scheduler, posix timers, ...) when they want to notify about a new event that may reconsider the dependency on the tick. Most of the time, such an event end up restarting the tick. (Part of the design with subsystems providing *_can_stop_tick() helpers suggested by Peter Zijlstra a while ago). Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-22timekeeping: Update tk->cycle_last in resumeThomas Gleixner1-1/+1
commit 7ec98e15aa (timekeeping: Delay update of clock->cycle_last) forgot to update tk->cycle_last in the resume path. This results in a stale value versus clock->cycle_last and prevents resume in the worst case. Reported-by: Jiri Slaby <[email protected]> Reported-and-tested-by: Borislav Petkov <[email protected]> Acked-by: John Stultz <[email protected]> Cc: Linux-pm mailing list <[email protected]> Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304211648150.21884@ionos Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-22nohz: Re-evaluate the tick from the scheduler IPIFrederic Weisbecker1-1/+1
The scheduler IPI is used by the scheduler to kick full dynticks CPUs asynchronously when more than one task are running or when a new timer list timer is enqueued. This way the destination CPU can decide to restart the tick to handle this new situation. Now let's call that kick in the scheduler IPI. (Reusing the scheduler IPI rather than implementing a new IPI was suggested by Peter Zijlstra a while ago) Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-21Merge branch 'timers/nohz-posix-timers-v2' of ↵Ingo Molnar2-0/+52
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks into timers/nohz Pull posix cpu timers handling on full dynticks from Frederic Weisbecker. Signed-off-by: Ingo Molnar <[email protected]>
2013-04-19nohz: New option to default all CPUs in full dynticks rangeFrederic Weisbecker2-2/+31
Provide a new kernel config that defaults all CPUs to be part of the full dynticks range, except the boot one for timekeeping. This default setting is overriden by the nohz_full= boot option if passed by the user. This is helpful for those who don't need a finegrained range of full dynticks CPU and also for automated testing. Suggested-by: Ingo Molnar <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-19nohz: Ensure full dynticks CPUs are RCU nocbsFrederic Weisbecker1-6/+16
We need full dynticks CPU to also be RCU nocb so that we don't have to keep the tick to handle RCU callbacks. Make sure the range passed to nohz_full= boot parameter is a subset of rcu_nocbs= The CPUs that fail to meet this requirement will be excluded from the nohz_full range. This is checked early in boot time, before any CPU has the opportunity to stop its tick. Suggested-by: Steven Rostedt <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-19nohz: Force boot CPU outside full dynticks rangeFrederic Weisbecker1-39/+15
The timekeeping job must be able to run early on boot because there may be some pre-SMP (and thus pre-initcalls ) components that rely on it. The IO-APIC is one such users as it tests the timer health by watching jiffies progression. Given that it happens before we know the initial online set, we can't rely on it to select a timekeeper. We need one before SMP time otherwise we simply crash on boot. To fix this and keep things simple for now, force the boot CPU outside of the full dynticks range in any case and do this early on kernel parameter parsing time. We might want a trickier solution later, expecially for aSMP architectures that need to assign housekeeping tasks to arbitrary low power CPUs. But it's still first pass KISS time for now. Reviewed-by: Paul E. McKenney <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-18nohz: New APIs to re-evaluate the tick on full dynticks CPUsFrederic Weisbecker2-0/+52
Provide two new helpers in order to notify the full dynticks CPUs about some internal system changes against which they may reconsider the state of their tick. Some practical examples include: posix cpu timers, perf tick and sched clock tick. For now the notifying handler, implemented through IPIs, is a stub that will be implemented when we get the tick stop/restart infrastructure in. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-17clockevents: Switch into oneshot mode even if broadcast registered lateStephen Boyd1-0/+10
tick_oneshot_notify() is used to notify a particular CPU to try to switch into oneshot mode after a oneshot capable tick device is registered and tick_clock_notify() is used to notify all CPUs to try to switch into oneshot mode after a high res clocksource is registered. There is one caveat; if the tick devices suffer from FEAT_C3_STOP we don't try to switch into oneshot mode unless we have a oneshot capable broadcast device already registered. If the broadcast device is registered after the tick devices that have FEAT_C3_STOP we'll never try to switch into oneshot mode again, causing us to be stuck in periodic mode forever. Avoid this scenario by calling tick_clock_notify() after we register the broadcast device so that we try to switch into oneshot mode on all CPUs one more time. [ tglx: Adopted to timers/core and added a comment ] Signed-off-by: Stephen Boyd <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-17timer_list: Convert timer list to be a proper seq_fileNathan Zimmer1-12/+77
When running with 4096 cores attemping to read /proc/timer_list will fail with an ENOMEM condition. On a sufficantly large systems the total amount of data is more then 4mb, so it won't fit into a single buffer. The failure can also occur on smaller systems when memory fragmentation is high as reported by Dave Jones. Convert /proc/timer_list to a proper seq_file with its own iterator. This is a little more complex given that we have to make two passes with two separate headers. sysrq_timer_list_show also needed to be updated to reflect the fact that now timer_list_show only does one cpu at at time. Signed-off-by: Nathan Zimmer <[email protected]> Reported-by: Dave Jones <[email protected]> Cc: John Stultz <[email protected]> Cc: Stephen Boyd <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-17timer_list: Split timer_list_show_tickdevicesNathan Zimmer1-12/+9
Split timer_list_show_tickdevices() into the header printout and pull the rest up to timer_list_show. This is a preparatory patch for converting timer_list to a proper seqfile with its own iterator Signed-off-by: Nathan Zimmer <[email protected]> Reported-by: Dave Jones <[email protected]> Cc: John Stultz <[email protected]> Cc: Stephen Boyd <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-15nohz: Improve a bit the full dynticks Kconfig documentationFrederic Weisbecker1-3/+5
Remove the "single task" statement from CONFIG_NO_HZ_FULL title. The constraint can be invalidated when tasks from other sched classes than SCHED_FAIR are running. Moreover it's possible that hrtick join the party in the future. Also add a line about the dependency on SMP. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-15nohz: Align periodic tick Kconfig with other choices' naming conventionFrederic Weisbecker1-1/+1
Rename CONFIG_PERIODIC_HZ to CONFIG_HZ_PERIODIC in order to stay consistent with other tick implementation entries: CONFIG_HZ_PERIODIC CONFIG_NO_HZ_IDLE CONFIG_NO_HZ_FULL Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-15nohz: Switch from "extended nohz" to "full nohz" based namingFrederic Weisbecker4-31/+31
"Extended nohz" was used as a naming base for the full dynticks API and Kconfig symbols. It reflects the fact the system tries to stop the tick in more places than just idle. But that "extended" name is a bit opaque and vague. Rename it to "full" makes it clearer what the system tries to do under this config: try to shutdown the tick anytime it can. The various constraints that prevent that to happen shouldn't be considered as fundamental properties of this feature but rather technical issues that may be solved in the future. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-15nohz: Fix old dynticks idle Kconfig backward compatibilityFrederic Weisbecker1-4/+8
In order to enforce backward compatibility with older config files, we want the new dynticks-idle Kconfig entry to default its value to the one of the old CONFIG_NO_HZ symbol if present. Namely we want: config NO_HZ # old obsolete dynticks idle symbol bool config NO_HZ_IDLE # new dynticks idle symbol default NO_HZ However Kconfig prevents this to work if the old symbol is not visible. And this is currently the case because NO_HZ lacks a title in order to show it in make oldconfig and alike. To fix this, bring a minimal title and help text to the obsolete Kconfig entry that explains its purpose. This makes the "defaulting" to work. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-11timekeeping: Make sure to notify hrtimers when TAI offset changesJohn Stultz1-3/+7
Now that we have CLOCK_TAI timers, make sure we notify hrtimer code when TAI offset is changed. Signed-off-by: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-04-04timekeeping: Shorten seq_count regionThomas Gleixner1-3/+2
Shorten the seqcount write hold region to the actual update of the timekeeper and the related data (e.g vsyscall). On a contemporary x86 system this reduces the maximum latencies on Preempt-RT from 8us to 4us on the non-timekeeping cores. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04timekeeping: Implement a shadow timekeeperThomas Gleixner1-12/+29
Use the shadow timekeeper to do the update_wall_time() adjustments and then copy it over to the real timekeeper. Keep the shadow timekeeper in sync when updating stuff outside of update_wall_time(). This allows us to limit the timekeeper_seq hold time to the update of the real timekeeper and the vsyscall data in the next patch. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04timekeeping: Delay update of clock->cycle_lastThomas Gleixner1-1/+3
For calculating the new timekeeper values store the new cycle_last value in the timekeeper and update the clock->cycle_last just when we actually update the new values. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04timekeeping: Store cycle_last value in timekeeper struct as wellThomas Gleixner1-2/+2
For implementing a shadow timekeeper and a split calculation/update region we need to store the cycle_last value in the timekeeper and update the value in the clocksource struct only in the update region. Add the extra storage to the timekeeper. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04ntp: Remove ntp_lock, using the timekeeping locks to protect ntp stateJohn Stultz1-37/+4
In order to properly handle the NTP state in future changes to the timekeeping lock management, this patch moves the management of all of the ntp state under the timekeeping locks. This allows us to remove the ntp_lock. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04timekeeping: Simplify tai updating from do_adjtimexJohn Stultz1-5/+4
Since we are taking the timekeeping locks, just go ahead and update any tai change directly, rather then dropping the lock and calling a function that will just take it again. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04timekeeping: Hold timekeepering locks in do_adjtimex and hardppsJohn Stultz1-2/+17
In moving the NTP state to be protected by the timekeeping locks, be sure to acquire the timekeeping locks prior to calling ntp functions. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04timekeeping: Move ADJ_SETOFFSET to top level do_adjtimex()John Stultz2-11/+11
Since ADJ_SETOFFSET adjusts the timekeeping state, process it as part of the top level do_adjtimex() function in timekeeping.c. This avoids deadlocks that could occur once we change the ntp locking rules. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04ntp: Rework do_adjtimex to take timespec and tai argumentsJohn Stultz3-16/+17
In order to change the locking rules, we need to provide the timespec and tai values rather then having the ntp logic acquire these values itself. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04ntp: Move timex validation to timekeeping do_adjtimex call.John Stultz3-5/+8
Move logic that does not need the ntp state to be done in the timekeeping do_adjtimex() call. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04ntp: Move do_adjtimex() and hardpps() functions to timekeeping.cJohn Stultz3-5/+36
In preparation for changing the ntp locking rules, move do_adjtimex and hardpps accessor functions to timekeeping.c, but keep the code logic in ntp.c. This patch also introduces a ntp_internal.h file so timekeeping specific interfaces of ntp.c can be more limitedly shared with timekeeping.c. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-04ntp: Split out timex validation from do_adjtimexJohn Stultz1-12/+27
Split out the timex validation done in do_adjtimex into a separate function. This will help simplify logic in following patches. Cc: Thomas Gleixner <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-04-03nohz: Print final full dynticks CPUs range on bootFrederic Weisbecker1-0/+10
Given that we apply a few restrictions on the full dynticks CPUs range (keep an online timekeeper oustide the range, then in the future have the range be an RCU nocb CPUs subset), let's print the final resulting range of full dynticks CPUs to the user so that he knows what's really going to run. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-03nohz: Pack nohz Kconfig option in a menu of choicesFrederic Weisbecker1-5/+23
Now the user has the choice between three implementations of the timer tick: * Static periodic tick * Idle dynticks * Full dynticks At least for now, these are mutually exclusive choices, so let's rely on the proper Kconfig feature to display these to the user. A new entry CONFIG_NO_HZ_IDLE is created and the old CONFIG_NO_HZ maps to it for config file backward compatibility. The old name was too general now that we have more granular dynticks implementations. While at it, add some explanation to help the user on his decision between the 3 entries. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-03nohz: Rename CONFIG_NO_HZ to CONFIG_NO_HZ_COMMONFrederic Weisbecker2-10/+15
We are planning to convert the dynticks Kconfig options layout into a choice menu. The user must be able to easily pick any of the following implementations: constant periodic tick, idle dynticks, full dynticks. As this implies a mutual exclusion, the two dynticks implementions need to converge on the selection of a common Kconfig option in order to ease the sharing of a common infrastructure. It would thus seem pretty natural to reuse CONFIG_NO_HZ to that end. It already implements all the idle dynticks code and the full dynticks depends on all that code for now. So ideally the choice menu would propose CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED then both would select CONFIG_NO_HZ. On the other hand we want to stay backward compatible: if CONFIG_NO_HZ is set in an older config file, we want to enable CONFIG_NO_HZ_IDLE by default. But we can't afford both at the same time or we run into a circular dependency: 1) CONFIG_NO_HZ_IDLE and CONFIG_NO_HZ_EXTENDED both select CONFIG_NO_HZ 2) If CONFIG_NO_HZ is set, we default to CONFIG_NO_HZ_IDLE We might be able to support that from Kconfig/Kbuild but it may not be wise to introduce such a confusing behaviour. So to solve this, create a new CONFIG_NO_HZ_COMMON option which gathers the common code between idle and full dynticks (that common code for now is simply the idle dynticks code) and select it from their referring Kconfig. Then we'll later create CONFIG_NO_HZ_IDLE and map CONFIG_NO_HZ to it for backward compatibility. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-04-03Merge branch 'fortglx/3.10/time' of ↵Thomas Gleixner3-74/+228
git://git.linaro.org/people/jstultz/linux into timers/core
2013-04-02nohz: Unhide full dynticks feature from its dependenciesFrederic Weisbecker1-5/+14
The full dynticks feature only shows up when all its Kconfig dependencies are met (RCU nocbs, RCU user mode, ...) This is far from being user friendly as those who want to activate this feature need to look into the Kconfig files and iterate through each dependency then activate these by hand in order to show and select the full dynticks Kconfig option. So process the other way around: show up the Kconfig option if the minimal low level dependencies are met and activate the high level ones when we enable the feature. Note there is one exception in the picture: CONFIG_VIRT_CPU_ACCOUNTING_GEN is part of a Kconfig choice menu and it appears we can't select it from another Kconfig selection when it's under such layout. So for now this particular item stays as a passive dependency. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Geoff Levand <[email protected]> Cc: Gilad Ben Yossef <[email protected]> Cc: Hakan Akkan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kevin Hilman <[email protected]> Cc: Li Zhong <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]>
2013-03-25timekeeping: __timekeeping_set_tai_offset can be staticFengguang Wu1-1/+1
Yet again, the kbuild test robot saves the day, noting I left out defining __timekeeping_set_tai_offset as static. It even sent me this patch. Reported-by: Fengguang Wu <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-03-25tick: Change log level of NOHZ: local_softirq_pending messageRado Vrbovsky1-2/+2
The "NOHZ: local_softirq_pending" message is a largely informational message. This makes extra work for customers that have a policy of investigating all kernel log messages logged at <= KERN_ERR log level. This patch sets the message to a different log level. [ tglx: Use pr_warn() ] Signed-off-by: Rado Vrbovsky <[email protected]> Cc: Don Zickus <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2013-03-22timekeeping: Split timekeeper_lock into lock and seqcountThomas Gleixner1-59/+73
We want to shorten the seqcount write hold time. So split the seqlock into a lock and a seqcount. Open code the seqwrite_lock in the places which matter and drop the sequence counter update where it's pointless. Signed-off-by: Thomas Gleixner <[email protected]> [jstultz: Merge fixups from CLOCK_TAI collisions] Signed-off-by: John Stultz <[email protected]>
2013-03-22timekeeping: Move lock out of timekeeper structThomas Gleixner1-55/+53
Make the lock a separate entity. Preparatory patch for shadow timekeeper structure. Signed-off-by: Thomas Gleixner <[email protected]> [Merged with CLOCK_TAI changes] Signed-off-by: John Stultz <[email protected]>
2013-03-22timekeeping: Make jiffies_lock internalThomas Gleixner2-0/+3
Nothing outside of the timekeeping core needs that lock. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-03-22timekeeping: Calc stuff onceThomas Gleixner1-3/+4
Calculate the cycle interval shifted value once. No functional change, just makes the code more readable. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: John Stultz <[email protected]>
2013-03-22hrtimer: Add hrtimer support for CLOCK_TAIJohn Stultz1-1/+19
Add hrtimer support for CLOCK_TAI, as well as posix timer interfaces. Signed-off-by: John Stultz <[email protected]>
2013-03-22timekeeping: Add CLOCK_TAI clockidJohn Stultz1-0/+30
This add a CLOCK_TAI clockid and the needed accessors. CC: Thomas Gleixner <[email protected]> CC: Eric Dumazet <[email protected]> CC: Richard Cochran <[email protected]> Signed-off-by: John Stultz <[email protected]>