aboutsummaryrefslogtreecommitdiff
path: root/kernel/time
AgeCommit message (Collapse)AuthorFilesLines
2017-10-27sched/isolation: Handle the nohz_full= parameterFrederic Weisbecker1-10/+3
We want to centralize the isolation management, done by the housekeeping subsystem. Therefore we need to handle the nohz_full= parameter from there. Since nohz_full= so far has involved unbound timers, watchdog, RCU and tilegx NAPI isolation, we keep that default behaviour. nohz_full= will be deprecated in the future. We want to control the isolation features from the isolcpus= parameter. Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luiz Capitulino <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Wanpeng Li <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-27sched/isolation: Move housekeeping related code to its own fileFrederic Weisbecker1-18/+0
The housekeeping code is currently tied to the NOHZ code. As we are planning to make housekeeping independent from it, start with moving the relevant code to its own file. Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luiz Capitulino <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Wanpeng Li <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-19clockevents: Retry programming min delta up to 10 timesJames Hogan1-8/+13
When CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=n, the call path hrtimer_reprogram -> clockevents_program_event -> clockevents_program_min_delta will not retry if the clock event driver returns -ETIME. If the driver could not satisfy the program_min_delta for any reason, the lack of a retry means the CPU may not receive a tick interrupt, potentially until the counter does a full period. This leads to rcu_sched timeout messages as the stalled CPU is detected by other CPUs, and other issues if the CPU is holding locks or other resources at the point at which it stalls. There have been a couple of observed mechanisms through which a clock event driver could not satisfy the requested min_delta and return -ETIME. With the MIPS GIC driver, the shared execution resource within MT cores means inconventient latency due to execution of instructions from other hardware threads in the core, within gic_next_event, can result in an event being set in the past. Additionally under virtualisation it is possible to get unexpected latency during a clockevent device's set_next_event() callback which can make it return -ETIME even for a delta based on min_delta_ns. It isn't appropriate to use MIN_ADJUST in the virtualisation case as occasional hypervisor induced high latency will cause min_delta_ns to quickly increase to the maximum. Instead, borrow the retry pattern from the MIN_ADJUST case, but without making adjustments. Retry up to 10 times, each time increasing the attempted delta by min_delta, before giving up. [ Matt: Reworked the loop and made retry increase the delta. ] Signed-off-by: James Hogan <[email protected]> Signed-off-by: Matt Redfearn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Daniel Lezcano <[email protected]> Cc: "Martin Schwidefsky" <[email protected]> Cc: James Hogan <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2017-10-18timer: Convert stub timer to timer_setup()Thomas Gleixner1-3/+3
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new timer_setup() and from_timer() to pass the timer pointer explicitly. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Kees Cook <[email protected]>
2017-10-18timers: Avoid an unnecessary iteration in __run_timers()Zhenzhong Duan1-2/+5
If the base clock is behind jiffies in the soft irq expiry code then the next timer is retrieved by get_next_timer_interrupt() to avoid incrementing base clock one by one. If the next timer interrupt is past current jiffies then the base clock is set to jiffies - 1. At the call site this is incremented and another iteration through the expiry loop is executed which checks empty hash buckets. That's a pointless excercise because it's already known that the next timer is past jiffies. Set the base clock in that case to jiffies directly so it gets incremented to jiffies + 1 at the call site resulting in immediate termination of the expiry loop. [ tglx: Massaged changelog and added comment to the code ] Signed-off-by: Zhenzhong Duan <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Anna-Maria Gleixner <[email protected]> Cc: Joe Jin <[email protected]> Cc: [email protected] Cc: Srinivas Reddy Eeda <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/7086a857-f90c-4616-bbe8-f7696f21626c@default
2017-10-17time: Use do_settimeofday64() internallyArnd Bergmann1-6/+6
do_settimeofday() is a wrapper around do_settimeofday64(), so that function can be called directly. The wrapper can be removed once the last user is gone. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Frederic Weisbecker <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: John Stultz <[email protected]> Cc: Al Viro <[email protected]> Cc: Deepa Dinamani <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2017-10-17posix-stubs: Use get_timespec64() and put_timespec64()Arnd Bergmann1-12/+8
This is a follow-up to commit 5c4994102fb5 ("posix-timers: Use get_timespec64() and put_timespec64()"), which left two system call using copy_from_user()/copy_to_user(). Change them as well for consistency. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Nicolas Pitre <[email protected]> Cc: [email protected] Cc: John Stultz <[email protected]> Cc: Al Viro <[email protected]> Cc: Deepa Dinamani <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2017-10-10sched/idle: Move quiet_vmstate() into the NOHZ codePeter Zijlstra1-0/+2
quiet_vmstat() is an expensive function that only makes sense when we go into NOHZ. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-05timer: Convert schedule_timeout() to use from_timer()Kees Cook1-7/+19
In preparation for unconditionally passing the struct timer_list pointer to all timer callbacks, switch to using the new from_timer() helper and passing the timer pointer explicitly. Since this special timer is on the stack, it needs to have a wrapper structure to carry state once .data is eliminated. Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Petr Mladek <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Sebastian Reichel <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Machek <[email protected]> Cc: [email protected] Cc: Chris Metcalf <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "James E.J. Bottomley" <[email protected]> Cc: Wim Van Sebroeck <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Ursula Braun <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Harish Patil <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Manish Chopra <[email protected]> Cc: Len Brown <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: [email protected] Cc: Heiko Carstens <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Julian Wiedmann <[email protected]> Cc: John Stultz <[email protected]> Cc: Mark Gross <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Martin K. Petersen" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Stefan Richter <[email protected]> Cc: Michael Reed <[email protected]> Cc: [email protected] Cc: Martin Schwidefsky <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: Sudip Mukherjee <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2017-09-25timekeeping: Provide NMI safe access to clock realtimeThomas Gleixner1-0/+35
The configurable printk timestamping wants access to clock realtime. Right now there is no ktime_get_real_fast_ns() accessor because reading the monotonic base and the realtime offset cannot be done atomically. Contrary to boot time this offset can change during runtime and cause half updated readouts. struct tk_read_base was fully packed when the fast timekeeper access was implemented. commit ceea5e3771ed ("time: Fix clock->read(clock) race around clocksource changes") removed the 'read' function pointer from the structure, but of course left the comment stale. So now the structure can fit a new 64bit member w/o violating the cache line constraints. Add real_base to tk_read_base and update it in the fast timekeeper update sequence. Implement an accessor which follows the same scheme as the accessor to clock monotonic, but uses the new real_base to access clock real time. The runtime overhead for updating real_base is minimal as it just adds two cache hot values and stores them into an already dirtied cache line along with the other fast timekeeper updates. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: John Stultz <[email protected]> Cc: Peter Zijlstra <peterz@infradead,org> Link: https://lkml.kernel.org/r/[email protected]
2017-09-25timekeeping: Make fast accessors return 0 before timekeeping is initializedPrarit Bhargava1-14/+21
printk timestamps will be extended to include mono and boot time by using the fast timekeeping accessors ktime_get_mono|boot_fast_ns(). The functions can return garbage before timekeeping is initialized resulting in garbage timestamps. Initialize the fast timekeepers with dummy clocks which guarantee a 0 readout up to timekeeping_init(). Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Prarit Bhargava <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-09-08drivers/pps: aesthetic tweaks to PPS-related contentRobert P. J. Day1-1/+1
Collection of aesthetic adjustments to various PPS-related files, directories and Documentation, some quite minor just for the sake of consistency, including: * Updated example of pps device tree node (courtesy Rodolfo G.) * "PPS-API" -> "PPS API" * "pps_source_info_s" -> "pps_source_info" * "ktimer driver" -> "pps-ktimer driver" * "ppstest /dev/pps0" -> "ppstest /dev/pps1" to match example * Add missing PPS-related entries to MAINTAINERS file * Other trivialities Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Robert P. J. Day <[email protected]> Acked-by: Rodolfo Giometti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-09-05Merge tag 'pm-4.14-rc1' of ↵Linus Torvalds1-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "This time (again) cpufreq gets the majority of changes which mostly are driver updates (including a major consolidation of intel_pstate), some schedutil governor modifications and core cleanups. There also are some changes in the system suspend area, mostly related to diagnostics and debug messages plus some renames of things related to suspend-to-idle. One major change here is that suspend-to-idle is now going to be preferred over S3 on systems where the ACPI tables indicate to do so and provide requsite support (the Low Power Idle S0 _DSM in particular). The system sleep documentation and the tools related to it are updated too. The rest is a few cpuidle changes (nothing major), devfreq updates, generic power domains (genpd) framework updates and a few assorted modifications elsewhere. Specifics: - Drop the P-state selection algorithm based on a PID controller from intel_pstate and make it use the same P-state selection method (based on the CPU load) for all types of systems in the active mode (Rafael Wysocki, Srinivas Pandruvada). - Rework the cpufreq core and governors to make it possible to take cross-CPU utilization updates into account and modify the schedutil governor to actually do so (Viresh Kumar). - Clean up the handling of transition latency information in the cpufreq core and untangle it from the information on which drivers cannot do dynamic frequency switching (Viresh Kumar). - Add support for new SoCs (MT2701/MT7623 and MT7622) to the mediatek cpufreq driver and update its DT bindings (Sean Wang). - Modify the cpufreq dt-platdev driver to autimatically create cpufreq devices for the new (v2) Operating Performance Points (OPP) DT bindings and update its whitelist of supported systems (Viresh Kumar, Shubhrajyoti Datta, Marc Gonzalez, Khiem Nguyen, Finley Xiao). - Add support for Ux500 to the cpufreq-dt driver and drop the obsolete dbx500 cpufreq driver (Linus Walleij, Arnd Bergmann). - Add new SoC (R8A7795) support to the cpufreq rcar driver (Khiem Nguyen). - Fix and clean up assorted issues in the cpufreq drivers and core (Arvind Yadav, Christophe Jaillet, Colin Ian King, Gustavo Silva, Julia Lawall, Leonard Crestez, Rob Herring, Sudeep Holla). - Update the IO-wait boost handling in the schedutil governor to make it less aggressive (Joel Fernandes). - Rework system suspend diagnostics to make it print fewer messages to the kernel log by default, add a sysfs knob to allow more suspend-related messages to be printed and add Low Power S0 Idle constraints checks to the ACPI suspend-to-idle code (Rafael Wysocki, Srinivas Pandruvada). - Prefer suspend-to-idle over S3 on ACPI-based systems with the ACPI_FADT_LOW_POWER_S0 flag set and the Low Power Idle S0 _DSM interface present in the ACPI tables (Rafael Wysocki). - Update documentation related to system sleep and rename a number of items in the code to make it cleare that they are related to suspend-to-idle (Rafael Wysocki). - Export a variable allowing device drivers to check the target system sleep state from the core system suspend code (Florian Fainelli). - Clean up the cpuidle subsystem to handle the polling state on x86 in a more straightforward way and to use %pOF instead of full_name (Rafael Wysocki, Rob Herring). - Update the devfreq framework to fix and clean up a few minor issues (Chanwoo Choi, Rob Herring). - Extend diagnostics in the generic power domains (genpd) framework and clean it up slightly (Thara Gopinath, Rob Herring). - Fix and clean up a couple of issues in the operating performance points (OPP) framework (Viresh Kumar, Waldemar Rymarkiewicz). - Add support for RV1108 to the rockchip-io Adaptive Voltage Scaling (AVS) driver (David Wu). - Fix the usage of notifiers in CPU power management on some platforms (Alex Shi). - Update the pm-graph system suspend/hibernation and boot profiling utility (Todd Brandt). - Make it possible to run the cpupower utility without CPU0 (Prarit Bhargava)" * tag 'pm-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (87 commits) cpuidle: Make drivers initialize polling state cpuidle: Move polling state initialization code to separate file cpuidle: Eliminate the CPUIDLE_DRIVER_STATE_START symbol cpufreq: imx6q: Fix imx6sx low frequency support cpufreq: speedstep-lib: make several arrays static, makes code smaller PM: docs: Delete the obsolete states.txt document PM: docs: Describe high-level PM strategies and sleep states PM / devfreq: Fix memory leak when fail to register device PM / devfreq: Add dependency on PM_OPP PM / devfreq: Move private devfreq_update_stats() into devfreq PM / devfreq: Convert to using %pOF instead of full_name PM / AVS: rockchip-io: add io selectors and supplies for RV1108 cpufreq: ti: Fix 'of_node_put' being called twice in error handling path cpufreq: dt-platdev: Drop few entries from whitelist cpufreq: dt-platdev: Automatically create cpufreq device with OPP v2 ARM: ux500: don't select CPUFREQ_DT cpuidle: Convert to using %pOF instead of full_name cpufreq: Convert to using %pOF instead of full_name PM / Domains: Convert to using %pOF instead of full_name cpufreq: Cap the default transition delay value to 10 ms ...
2017-09-04Merge branch 'timers-core-for-linus' of ↵Linus Torvalds3-11/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A rather small update for the time(r) subsystem: - A new clocksource driver IMX-TPM - Minor fixes to the alarmtimer facility - Device tree cleanups for Renesas drivers - A new kselftest and fixes for the timer related tests - Conversion of the clocksource drivers to use %pOF - Use the proper helpers to access rlimits in the posix-cpu-timer code" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: alarmtimer: Ensure RTC module is not unloaded clocksource: Convert to using %pOF instead of full_name clocksource/drivers/bcm2835: Remove message for a memory allocation failure devicetree: bindings: Remove deprecated properties devicetree: bindings: Remove unused 32-bit CMT bindings devicetree: bindings: Deprecate property, update example devicetree: bindings: r8a73a4 and R-Car Gen2 CMT bindings devicetree: bindings: R-Car Gen2 CMT0 and CMT1 bindings devicetree: bindings: Remove sh7372 CMT binding clocksource/drivers/imx-tpm: Add imx tpm timer support dt-bindings: timer: Add nxp tpm timer binding doc posix-cpu-timers: Use dedicated helper to access rlimit values alarmtimer: Fix unavailable wake-up source in sysfs timekeeping: Use proper timekeeper for debug code kselftests: timers: set-timer-lat: Add one-shot timer test cases kselftests: timers: set-timer-lat: Tweak reporting when timer fires early kselftests: timers: freq-step: Fix build warning kselftests: timers: freq-step: Define ADJ_SETOFFSET if device has older kernel headers
2017-09-04Merge branch 'pm-sleep'Rafael J. Wysocki1-2/+3
* pm-sleep: ACPI / PM: Check low power idle constraints for debug only PM / s2idle: Rename platform operations structure PM / s2idle: Rename ->enter_freeze to ->enter_s2idle PM / s2idle: Rename freeze_state enum and related items PM / s2idle: Rename PM_SUSPEND_FREEZE to PM_SUSPEND_TO_IDLE ACPI / PM: Prefer suspend-to-idle over S3 on some systems platform/x86: intel-hid: Wake up Dell Latitude 7275 from suspend-to-idle PM / suspend: Define pr_fmt() in suspend.c PM / suspend: Use mem_sleep_labels[] strings in messages PM / sleep: Put pm_test under CONFIG_PM_SLEEP_DEBUG PM / sleep: Check pm_wakeup_pending() in __device_suspend_noirq() PM / core: Add error argument to dpm_show_time() PM / core: Split dpm_suspend_noirq() and dpm_resume_noirq() PM / s2idle: Rearrange the main suspend-to-idle loop PM / timekeeping: Print debug messages when requested PM / sleep: Mark suspend/hibernation start and finish PM / sleep: Do not print debug messages by default PM / suspend: Export pm_suspend_target_state
2017-08-31alarmtimer: Ensure RTC module is not unloadedAlexandre Belloni1-0/+6
When registering the rtc device to be used to handle alarm timers, get_device is used to ensure the device doesn't go away but the module can still be unloaded. Call try_module_get to ensure the rtc driver will not go away. Reported-and-tested-by: Michal Simek <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: John Stultz <[email protected]> Cc: Stephen Boyd <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-08-26time: Fix ktime_get_raw() incorrect base accumulationJohn Stultz1-3/+1
In comqit fc6eead7c1e2 ("time: Clean up CLOCK_MONOTONIC_RAW time handling"), the following code got mistakenly added to the update of the raw timekeeper: /* Update the monotonic raw base */ seconds = tk->raw_sec; nsec = (u32)(tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift); tk->tkr_raw.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec); Which adds the raw_sec value and the shifted down raw xtime_nsec to the base value. But the read function adds the shifted down tk->tkr_raw.xtime_nsec value another time, The result of this is that ktime_get_raw() users (which are all internal users) see the raw time move faster then it should (the rate at which can vary with the current size of tkr_raw.xtime_nsec), which has resulted in at least problems with graphics rendering performance. The change tried to match the monotonic base update logic: seconds = (u64)(tk->xtime_sec + tk->wall_to_monotonic.tv_sec); nsec = (u32) tk->wall_to_monotonic.tv_nsec; tk->tkr_mono.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec); Which adds the wall_to_monotonic.tv_nsec value, but not the tk->tkr_mono.xtime_nsec value to the base. To fix this, simplify the tkr_raw.base accumulation to only accumulate the raw_sec portion, and do not include the tkr_raw.xtime_nsec portion, which will be added at read time. Fixes: fc6eead7c1e2 ("time: Clean up CLOCK_MONOTONIC_RAW time handling") Reported-and-tested-by: Chris Wilson <[email protected]> Signed-off-by: John Stultz <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Kevin Brodsky <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Will Deacon <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Daniel Mentz <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-08-24timers: Fix excessive granularity of new timers after a nohz idleNicholas Piggin1-9/+41
When a timer base is idle, it is forwarded when a new timer is added to ensure that granularity does not become excessive. When not idle, the timer tick is expected to increment the base. However there are several problems: - If an existing timer is modified, the base is forwarded only after the index is calculated. - The base is not forwarded by add_timer_on. - There is a window after a timer is restarted from a nohz idle, after it is marked not-idle and before the timer tick on this CPU, where a timer may be added but the ancient base does not get forwarded. These result in excessive granularity (a 1 jiffy timeout can blow out to 100s of jiffies), which cause the rcu lockup detector to trigger, among other things. Fix this by keeping track of whether the timer base has been idle since it was last run or forwarded, and if so then forward it before adding a new timer. There is still a case where mod_timer optimises the case of a pending timer mod with the same expiry time, where the timer can see excessive granularity relative to the new, shorter interval. A comment is added, but it's not changed because it is an important fastpath for networking. This has been tested and found to fix the RCU softlockup messages. Testing was also done with tracing to measure requested versus achieved wakeup latencies for all non-deferrable timers in an idle system (with no lockup watchdogs running). Wakeup latency relative to absolute latency is calculated (note this suffers from round-up skew at low absolute times) and analysed: max avg std upstream 506.0 1.20 4.68 patched 2.0 1.08 0.15 The bug was noticed due to the lockup detector Kconfig changes dropping it out of people's .configs and resulting in larger base clk skew When the lockup detectors are enabled, no CPU can go idle for longer than 4 seconds, which limits the granularity errors. Sub-optimal timer behaviour is observable on a smaller scale in that case: max avg std upstream 9.0 1.05 0.19 patched 2.0 1.04 0.11 Fixes: Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Jonathan Cameron <[email protected]> Tested-by: David Miller <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Stephen Boyd <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Stultz <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected]
2017-08-20Merge branch 'fortglx/4.14/time' of ↵Thomas Gleixner2-3/+10
https://git.linaro.org/people/john.stultz/linux into timers/core Pull timekeepig updates from John Stultz - kselftest improvements - Use the proper timekeeper in the debug code - Prevent accessing an unavailable wakeup source in the alarmtimer sysfs interface.
2017-08-18posix-cpu-timers: Use dedicated helper to access rlimit valuesKrzysztof Opasiak1-8/+6
Use rlimit() and rlimit_max() helper instead of manually writing whole chain from task to rlimit value Signed-off-by: Krzysztof Opasiak <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-08-17alarmtimer: Fix unavailable wake-up source in sysfsGeert Uytterhoeven1-2/+9
Currently the alarmtimer registers a wake-up source unconditionally, regardless of the system having a (wake-up capable) RTC or not. Hence the alarmtimer will always show up in /sys/kernel/debug/wakeup_sources, even if it is not available, and thus cannot be a wake-up source. To fix this, postpone registration until a wake-up capable RTC device is added. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Stephen Boyd <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: John Stultz <[email protected]>
2017-08-17timekeeping: Use proper timekeeper for debug codeStafford Horne1-1/+1
When CONFIG_DEBUG_TIMEKEEPING is enabled the timekeeping_check_update() function will update status like last_warning and underflow_seen on the timekeeper. If there are issues found this state is used to rate limit the warnings that get printed. This rate limiting doesn't really really work if stored in real_tk as the shadow timekeeper is overwritten onto real_tk at the end of every update_wall_time() call, resetting last_warning and other statuses. Fix rate limiting by using the shadow_timekeeper for timekeeping_check_update(). Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Stephen Boyd <[email protected]> Fixes: commit 57d05a93ada7 ("time: Rework debugging variables so they aren't global") Signed-off-by: Stafford Horne <[email protected]> Signed-off-by: John Stultz <[email protected]>
2017-08-01timers: Fix overflow in get_next_timer_interruptMatija Glavinic Pecotic1-1/+1
For e.g. HZ=100, timer being 430 jiffies in the future, and 32 bit unsigned int, there is an overflow on unsigned int right-hand side of the expression which results with wrong values being returned. Type cast the multiplier to 64bit to avoid that issue. Fixes: 46c8f0b077a8 ("timers: Fix get_next_timer_interrupt() computation") Signed-off-by: Matija Glavinic Pecotic <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Alexander Sverdlin <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected]
2017-07-23PM / timekeeping: Print debug messages when requestedRafael J. Wysocki1-2/+3
The messages printed by tk_debug_account_sleep_time() are basically useful for system sleep debugging, so print them only when the other debug messages from the core suspend/hibernate code are enabled. While at it, make it clear that the messages from tk_debug_account_sleep_time() are about timekeeping suspend duration, because in general timekeeping may be suspeded and resumed for multiple times during one system suspend-resume cycle. Signed-off-by: Rafael J. Wysocki <[email protected]>
2017-07-05Merge branch 'timers-compat' of ↵Linus Torvalds6-147/+176
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull timer-related user access updates from Al Viro: "Continuation of timers-related stuff (there had been more, but my parts of that series are already merged via timers/core). This is more of y2038 work by Deepa Dinamani, partially disrupted by the unification of native and compat timers-related syscalls" * 'timers-compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: posix_clocks: Use get_itimerspec64() and put_itimerspec64() timerfd: Use get_itimerspec64() and put_itimerspec64() nanosleep: Use get_timespec64() and put_timespec64() posix-timers: Use get_timespec64() and put_timespec64() posix-stubs: Conditionally include COMPAT_SYS_NI defines time: introduce {get,put}_itimerspec64 time: add get_timespec64 and put_timespec64
2017-07-03Merge branch 'timers-core-for-linus' of ↵Linus Torvalds11-781/+1129
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "A rather large update for timers/timekeeping: - compat syscall consolidation (Al Viro) - Posix timer consolidation (Christoph Helwig / Thomas Gleixner) - Cleanup of the device tree based initialization for clockevents and clocksources (Daniel Lezcano) - Consolidation of the FTTMR010 clocksource/event driver (Linus Walleij) - The usual set of small fixes and updates all over the place" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (93 commits) timers: Make the cpu base lock raw clocksource/drivers/mips-gic-timer: Fix an error code in 'gic_clocksource_of_init()' clocksource/drivers/fsl_ftm_timer: Unmap region obtained by of_iomap clocksource/drivers/tcb_clksrc: Make IO endian agnostic clocksource/drivers/sun4i: Switch to the timer-of common init clocksource/drivers/timer-of: Fix invalid iomap check Revert "ktime: Simplify ktime_compare implementation" clocksource/drivers: Fix uninitialized variable use in timer_of_init kselftests: timers: Add test for frequency step kselftests: timers: Fix inconsistency-check to not ignore first timestamp time: Add warning about imminent deprecation of CONFIG_GENERIC_TIME_VSYSCALL_OLD time: Clean up CLOCK_MONOTONIC_RAW time handling posix-cpu-timers: Make timespec to nsec conversion safe itimer: Make timeval to nsec conversion range limited timers: Fix parameter description of try_to_del_timer_sync() ktime: Simplify ktime_compare implementation clocksource/drivers/fttmr010: Factor out clock read code clocksource/drivers/fttmr010: Implement delay timer clocksource/drivers: Add timer-of common init routine clocksource/drivers/tcb_clksrc: Save timer context on suspend/resume ...
2017-07-03Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds2-23/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull nohz updates from Ingo Molnar: "The main changes in this cycle relate to fixing another bad (but sporadic and hard to detect) interaction between the dynticks scheduler tick and hrtimers, plus related improvements to better detection and handling of similar problems - by Frédéric Weisbecker" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: Fix spurious warning when hrtimer and clockevent get out of sync nohz: Fix buggy tick delay on IRQ storms nohz: Reset next_tick cache even when the timer has no regs nohz: Fix collision between tick and other hrtimers, again nohz: Add hrtimer sanity check
2017-07-03Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2-5/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Add the SYSTEM_SCHEDULING bootup state to move various scheduler debug checks earlier into the bootup. This turns silent and sporadically deadly bugs into nice, deterministic splats. Fix some of the splats that triggered. (Thomas Gleixner) - A round of restructuring and refactoring of the load-balancing and topology code (Peter Zijlstra) - Another round of consolidating ~20 of incremental scheduler code history: this time in terms of wait-queue nomenclature. (I didn't get much feedback on these renaming patches, and we can still easily change any names I might have misplaced, so if anyone hates a new name, please holler and I'll fix it.) (Ingo Molnar) - sched/numa improvements, fixes and updates (Rik van Riel) - Another round of x86/tsc scheduler clock code improvements, in hope of making it more robust (Peter Zijlstra) - Improve NOHZ behavior (Frederic Weisbecker) - Deadline scheduler improvements and fixes (Luca Abeni, Daniel Bristot de Oliveira) - Simplify and optimize the topology setup code (Lauro Ramos Venancio) - Debloat and decouple scheduler code some more (Nicolas Pitre) - Simplify code by making better use of llist primitives (Byungchul Park) - ... plus other fixes and improvements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits) sched/cputime: Refactor the cputime_adjust() code sched/debug: Expose the number of RT/DL tasks that can migrate sched/numa: Hide numa_wake_affine() from UP build sched/fair: Remove effective_load() sched/numa: Implement NUMA node level wake_affine() sched/fair: Simplify wake_affine() for the single socket case sched/numa: Override part of migrate_degrades_locality() when idle balancing sched/rt: Move RT related code from sched/core.c to sched/rt.c sched/deadline: Move DL related code from sched/core.c to sched/deadline.c sched/cpuset: Only offer CONFIG_CPUSETS if SMP is enabled sched/fair: Spare idle load balancing on nohz_full CPUs nohz: Move idle balancer registration to the idle path sched/loadavg: Generalize "_idle" naming to "_nohz" sched/core: Drop the unused try_get_task_struct() helper function sched/fair: WARN() and refuse to set buddy when !se->on_rq sched/debug: Fix SCHED_WARN_ON() to return a value on !CONFIG_SCHED_DEBUG as well sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming sched/wait: Move bit_wait_table[] and related functionality from sched/core.c to sched/wait_bit.c sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into <linux/wait_bit.h> sched/wait: Re-adjust macro line continuation backslashes in <linux/wait.h> ...
2017-07-03Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-50/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The sole purpose of these changes is to shrink and simplify the RCU code base, which has suffered from creeping bloat over the past couple of years. The end result is a net removal of ~2700 lines of code: 79 files changed, 1496 insertions(+), 4211 deletions(-) Plus there's a marked reduction in the Kconfig space complexity as well, here's the number of matches on 'grep RCU' in the .config: before after x86-defconfig 17 15 x86-allmodconfig 33 20" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (86 commits) rcu: Remove RCU CPU stall warnings from Tiny RCU rcu: Remove event tracing from Tiny RCU rcu: Move RCU debug Kconfig options to kernel/rcu rcu: Move RCU non-debug Kconfig options to kernel/rcu rcu: Eliminate NOCBs CPU-state Kconfig options rcu: Remove debugfs tracing srcu: Remove Classic SRCU srcu: Fix rcutorture-statistics typo rcu: Remove SPARSE_RCU_POINTER Kconfig option rcu: Remove the now-obsolete PROVE_RCU_REPEATEDLY Kconfig option rcu: Remove typecheck() from RCU locking wrapper functions rcu: Remove #ifdef moving rcu_end_inkernel_boot from rcupdate.h rcu: Remove nohz_full full-system-idle state machine rcu: Remove the RCU_KTHREAD_PRIO Kconfig option rcu: Remove *_SLOW_* Kconfig options srcu: Use rnp->lock wrappers to replace explicit memory barriers rcu: Move rnp->lock wrappers for SRCU use rcu: Convert rnp->lock wrappers to macros for SRCU use rcu: Refactor #includes from include/linux/rcupdate.h bcm47xx: Fix build regression ...
2017-06-30posix_clocks: Use get_itimerspec64() and put_itimerspec64()Deepa Dinamani1-28/+16
Usage of these apis and their compat versions makes the syscalls: timer_settime and timer_gettime and their compat implementations simpler. This patch also serves as a preparatory patch for changing syscalls to use new time_t data types to support the y2038 effort by isolating the processing of user pointers through these apis. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-30nanosleep: Use get_timespec64() and put_timespec64()Deepa Dinamani4-37/+25
Usage of these apis and their compat versions makes the syscalls: clock_nanosleep and nanosleep and their compat implementations simpler. This is a preparatory patch to isolate data conversions to struct timespec64 at userspace boundaries. This helps contain the changes needed to transition to new y2038 safe types. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-30posix-timers: Use get_timespec64() and put_timespec64()Deepa Dinamani2-76/+70
Usage of these apis and their compat versions makes the syscalls: clock_gettime, clock_settime, clock_getres and their compat implementations simpler. This is a preparatory patch to isolate data conversions to struct timespec64 at userspace boundaries. This helps contain the changes needed to transition to new y2038 safe types. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-29timers: Make the cpu base lock rawSebastian Andrzej Siewior1-24/+24
The timers cpu base lock could not be converted to a raw spinlock becaue the lock held time was non-deterministic due to cascading and long lasting timer wheel traversals. The rework of the timer wheel to the new non-cascading model removed also the wheel traversals and the lock held times are deterministic now. This allows to make the lock raw and thereby unbreaks NOHz* on preempt-RT. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-06-25posix-stubs: Conditionally include COMPAT_SYS_NI definesDeepa Dinamani1-6/+7
These apis only need to be defined if CONFIG_COMPAT is enabled. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-25time: introduce {get,put}_itimerspec64Deepa Dinamani1-0/+30
As we change the user space type for the timerfd and posix timer functions to newer data types, we need some form of conversion helpers to avoid duplicating that logic. Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-25time: add get_timespec64 and put_timespec64Deepa Dinamani1-0/+28
Add helper functions to convert between struct timespec64 and struct timespec at userspace boundaries. This is a preparatory patch to use timespec64 as the basic type internally in the kernel as timespec is not y2038 safe on 32 bit systems. The patch helps the cause by containing all data conversions at the userspace boundaries within these functions. Suggested-by: Arnd Bergmann <[email protected]> Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-06-22nohz: Move idle balancer registration to the idle pathFrederic Weisbecker1-2/+3
The idle load balancing registration path assumes that we only stop the tick when the CPU is idle, ignoring the nohz full case. As a result, a nohz full CPU that is running a task may be chosen to perform idle load balancing. Lets make sure that only CPUs in dynticks idle mode can be picked as idle load balancers. Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-22sched/loadavg: Generalize "_idle" naming to "_nohz"Frederic Weisbecker1-2/+2
The loadavg naming code still assumes that nohz == idle whereas its code is actually handling well both nohz idle and nohz full. So lets fix the naming according to what the code actually does, to unconfuse the reader. Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-21Merge branch 'fortglx/4.13/time' of ↵Thomas Gleixner1-20/+26
https://git.linaro.org/people/john.stultz/linux into timers/core Merge time(keeping) updates from John Stultz: "Just a small set of changes, the biggest changes being the MONOTONIC_RAW handling cleanup, and a new kselftest from Miroslav. Also a a clear warning deprecating CONFIG_GENERIC_TIME_VSYSCALL_OLD, which affects ppc and ia64."
2017-06-21Merge branch 'timers/urgent' into timers/coreThomas Gleixner3-28/+49
Pick up dependent changes.
2017-06-20time: Add warning about imminent deprecation of CONFIG_GENERIC_TIME_VSYSCALL_OLDJohn Stultz1-0/+1
CONFIG_GENERIC_TIME_VSYSCALL_OLD was introduced five years ago to allow a transition from the old vsyscall implementations to the new method (which simplified internal accounting and made timekeeping more precise). However, PPC and IA64 have yet to make the transition, despite in some cases me sending test patches to try to help it along. http://patches.linaro.org/patch/30501/ http://patches.linaro.org/patch/35412/ If its helpful, my last pass at the patches can be found here: https://git.linaro.org/people/john.stultz/linux.git dev/oldvsyscall-cleanup So I think its time to set a deadline and make it clear this is going away. So this patch adds warnings about this functionality being dropped. Likely to be in v4.15. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Tony Luck <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Fenghua Yu <[email protected]> Signed-off-by: John Stultz <[email protected]>
2017-06-20time: Clean up CLOCK_MONOTONIC_RAW time handlingJohn Stultz1-20/+25
Now that we fixed the sub-ns handling for CLOCK_MONOTONIC_RAW, remove the duplicitive tk->raw_time.tv_nsec, which can be stored in tk->tkr_raw.xtime_nsec (similarly to how its handled for monotonic time). Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Kevin Brodsky <[email protected]> Cc: Will Deacon <[email protected]> Cc: Daniel Mentz <[email protected]> Tested-by: Daniel Mentz <[email protected]> Signed-off-by: John Stultz <[email protected]>
2017-06-20posix-cpu-timers: Make timespec to nsec conversion safeThomas Gleixner1-1/+5
The expiry time of a posix cpu timer is supplied through sys_timer_set() via a struct timespec. The timespec is validated for correctness. In the actual set timer implementation the timespec is converted to a scalar nanoseconds value. If the tv_sec part of the time spec is large enough the conversion to nanoseconds (sec * NSEC_PER_SEC) overflows 64bit. Mitigate that by using the timespec_to_ktime() conversion function, which checks the tv_sec part for a potential mult overflow and clamps the result to KTIME_MAX, which is about 292 years. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xishi Qiu <[email protected]> Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-06-20itimer: Make timeval to nsec conversion range limitedThomas Gleixner1-2/+6
The expiry time of a itimer is supplied through sys_setitimer() via a struct timeval. The timeval is validated for correctness. In the actual set timer implementation the timeval is converted to a scalar nanoseconds value. If the tv_sec part of the time spec is large enough the conversion to nanoseconds (sec * NSEC_PER_SEC) overflows 64bit. Mitigate that by using the timeval_to_ktime() conversion function, which checks the tv_sec part for a potential mult overflow and clamps the result to KTIME_MAX, which is about 292 years. Reported-by: Xishi Qiu <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-06-20timers: Fix parameter description of try_to_del_timer_sync()Peter Meerwald-Stadler1-1/+1
Signed-off-by: Peter Meerwald-Stadler <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: John Stultz <[email protected]> Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-06-20Merge branch 'WIP.sched/core' into sched/coreIngo Molnar4-14/+30
Conflicts: kernel/sched/Makefile Pick up the waitqueue related renames - it didn't get much feedback, so it appears to be uncontroversial. Famous last words? ;-) Signed-off-by: Ingo Molnar <[email protected]>
2017-06-20time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accountingJohn Stultz1-9/+10
Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz <[email protected]> Tested-by: Daniel Mentz <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Kevin Brodsky <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Will Deacon <[email protected]> Cc: "stable #4 . 8+" <[email protected]> Cc: Miroslav Lichvar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-06-20time: Fix clock->read(clock) race around clocksource changesJohn Stultz1-16/+36
In tests, which excercise switching of clocksources, a NULL pointer dereference can be observed on AMR64 platforms in the clocksource read() function: u64 clocksource_mmio_readl_down(struct clocksource *c) { return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask; } This is called from the core timekeeping code via: cycle_now = tkr->read(tkr->clock); tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed then tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere including NULL. This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence: now = tk->clock->read(tk->clock); Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issue the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in. Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather then just a dummy read function. Signed-off-by: John Stultz <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: stable <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Daniel Mentz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-06-14posix-timers: Make nanosleep timespec argument constThomas Gleixner5-7/+7
No nanosleep implementation modifies the rqtp argument. Mark is const. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Al Viro <[email protected]> Cc: John Stultz <[email protected]> Cc: Peter Zijlstra <[email protected]>
2017-06-14posix-cpu-timers: Avoid timespec conversion in do_cpu_nanosleep()Thomas Gleixner1-4/+5
No point in converting the expiry time back and forth. No point either to update the value in the caller supplied variable. mark the rqtp argument const. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Al Viro <[email protected]> Cc: John Stultz <[email protected]> Cc: Peter Zijlstra <[email protected]>