aboutsummaryrefslogtreecommitdiff
path: root/kernel/time
AgeCommit message (Collapse)AuthorFilesLines
2015-12-10ntp: Verify offset doesn't overflow in ntp_update_offsetSasha Levin1-3/+5
We need to make sure that the offset is valid before manipulating it, otherwise it might overflow on the multiplication. Cc: Sasha Levin <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Sasha Levin <[email protected]> [jstultz: Reworked one of the checks so it makes more sense] Signed-off-by: John Stultz <[email protected]>
2015-12-08watchdog: introduce touch_softlockup_watchdog_sched()Tejun Heo1-3/+3
touch_softlockup_watchdog() is used to tell watchdog that scheduler stall is expected. One group of usage is from paths where the task may not be able to yield for a long time such as performing slow PIO to finicky device and coming out of suspend. The other is to account for scheduler and timer going idle. For scheduler softlockup detection, there's no reason to distinguish the two cases; however, workqueue lockup detector is planned and it can use the same signals from the former group while the latter would spuriously prevent detection. This patch introduces a new function touch_softlockup_watchdog_sched() and convert the latter group to call it instead. For now, it just calls touch_softlockup_watchdog() and there's no functional difference. Signed-off-by: Tejun Heo <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Andrew Morton <[email protected]>
2015-12-07clocksource: Add CPU info to clocksource watchdog reportingSeiichi Ikarashi1-2/+2
The clocksource watchdog reporting was improved by 0b046b217ad4c6. I want to add the info of CPU where the watchdog detects a deviation because it is necessary to identify the trouble spot if the clocksource is TSC. Signed-off-by: Seiichi Ikarashi <[email protected]> [jstultz: Tweaked commit message] Signed-off-by: John Stultz <[email protected]>
2015-12-07time: Avoid signed overflow in timekeeping_get_ns()David Gibson1-2/+1
1e75fa8 "time: Condense timekeeper.xtime into xtime_sec" replaced a call to clocksource_cyc2ns() from timekeeping_get_ns() with an open-coded version of the same logic to avoid keeping a semi-redundant struct timespec in struct timekeeper. However, the commit also introduced a subtle semantic change - where clocksource_cyc2ns() uses purely unsigned math, the new version introduces a signed temporary, meaning that if (delta * tk->mult) has a 63-bit overflow the following shift will still give a negative result. The choice of 'maxsec' in __clocksource_updatefreq_scale() means this will generally happen if there's a ~10 minute pause in examining the clocksource. This can be triggered on a powerpc KVM guest by stopping it from qemu for a bit over 10 minutes. After resuming time has jumped backwards several minutes causing numerous problems (jiffies does not advance, msleep()s can be extended by minutes..). It doesn't happen on x86 KVM guests, because the guest TSC is effectively frozen while the guest is stopped, which is not the case for the powerpc timebase. Obviously an unsigned (64 bit) overflow will only take twice as long as a signed, 63-bit overflow. I don't know the time code well enough to know if that will still cause incorrect calculations, or if a 64-bit overflow is avoided elsewhere. Still, an incorrect forwards clock adjustment will cause less trouble than time going backwards. So, this patch removes the potential for intermediate signed overflow. Cc: [email protected] (3.7+) Suggested-by: Laurent Vivier <[email protected]> Tested-by: Laurent Vivier <[email protected]> Signed-off-by: David Gibson <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-12-04sched/cputime: Rename vtime_accounting_enabled() to ↵Frederic Weisbecker1-1/+1
vtime_accounting_cpu_enabled() vtime_accounting_enabled() checks if vtime is running on the current CPU and is as such a misnomer. Lets rename it to a function that reflect its locality. We are going to need the current name for a function that tells if vtime runs at all on some CPU. Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Hiroshi Shimamoto <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luiz Capitulino <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul E . McKenney <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-03alarmtimer: Avoid unexpected rtc interrupt when system resume from S3zhuo-hao1-0/+17
Before the system go to suspend (S3), if user create a timer with clockid CLOCK_REALTIME_ALARM/CLOCK_BOOTTIME_ALARM and set a "large" timeout value to this timer. The function alarmtimer_suspend will be called to setup a timeout value to RTC timer to avoid the system sleep over time. However, if the system wakeup early than RTC timeout, the RTC timer will not be cleared. And this will cause the hpet_rtc_interrupt come unexpectedly until the RTC timeout. To fix this problem, just adding alarmtimer_resume to cancel the RTC timer. This was noticed because the HPET RTC emulation fires an interrupt every 16ms(=1/2^DEFAULT_RTC_SHIFT) up to the point where the alarm time is reached. This program always hits this situation (https://lkml.org/lkml/2015/11/8/326), if system wake up earlier than alarm time. Cc: Thomas Gleixner <[email protected]> Cc: John Stultz <[email protected]> Signed-off-by: Zhuo-hao Lee <[email protected]> [jstultz: Tweak commit subject & formatting slightly] Signed-off-by: John Stultz <[email protected]>
2015-11-25nohz: Clarify magic in tick_nohz_stop_sched_tick()Peter Zijlstra1-2/+18
While going through the nohz code I got stumped by some of it. This patch adds a few comments clarifying the code; based on discussion with Thomas. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-23sched/fair: Consider missed ticks in NOHZ_FULL in update_cpu_load_nohz()Byungchul Park1-4/+4
Usually the tick can be stopped for an idle CPU in NOHZ. However in NOHZ_FULL mode, a non-idle CPU's tick can also be stopped. However, update_cpu_load_nohz() does not consider the case a non-idle CPU's tick has been stopped at all. This patch makes the update_cpu_load_nohz() know if the calling path comes from NOHZ_FULL or idle NOHZ. Signed-off-by: Byungchul Park <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-11-15Merge branches 'irq-urgent-for-linus' and 'timers-urgent-for-linus' of ↵Linus Torvalds1-3/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq and timer fixes from Thomas Gleixner: - An irq regression fix to restore the wakeup behaviour of chained interrupts. - A timer fix for a long standing race versus timers scheduled on a target cpu which got exposed by recent changes in the workqueue implementation. * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/PM: Restore system wake up from chained interrupts * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers: Use proper base migration in add_timer_on()
2015-11-09remove abs64()Andrew Morton2-2/+2
Switch everything to the new and more capable implementation of abs(). Mainly to give the new abs() a bit of a workout. Cc: Michal Nazarewicz <[email protected]> Cc: John Stultz <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-11-04timers: Use proper base migration in add_timer_on()Tejun Heo1-3/+19
Regardless of the previous CPU a timer was on, add_timer_on() currently simply sets timer->flags to the new CPU. As the caller must be seeing the timer as idle, this is locally fine, but the timer leaving the old base while unlocked can lead to race conditions as follows. Let's say timer was on cpu 0. cpu 0 cpu 1 ----------------------------------------------------------------------------- del_timer(timer) succeeds del_timer(timer) lock_timer_base(timer) locks cpu_0_base add_timer_on(timer, 1) spin_lock(&cpu_1_base->lock) timer->flags set to cpu_1_base operates on @timer operates on @timer This triggered with mod_delayed_work_on() which contains "if (del_timer()) add_timer_on()" sequence eventually leading to the following oops. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff810ca6e9>] detach_if_pending+0x69/0x1a0 ... Workqueue: wqthrash wqthrash_workfunc [wqthrash] task: ffff8800172ca680 ti: ffff8800172d0000 task.ti: ffff8800172d0000 RIP: 0010:[<ffffffff810ca6e9>] [<ffffffff810ca6e9>] detach_if_pending+0x69/0x1a0 ... Call Trace: [<ffffffff810cb0b4>] del_timer+0x44/0x60 [<ffffffff8106e836>] try_to_grab_pending+0xb6/0x160 [<ffffffff8106e913>] mod_delayed_work_on+0x33/0x80 [<ffffffffa0000081>] wqthrash_workfunc+0x61/0x90 [wqthrash] [<ffffffff8106dba8>] process_one_work+0x1e8/0x650 [<ffffffff8106e05e>] worker_thread+0x4e/0x450 [<ffffffff810746af>] kthread+0xef/0x110 [<ffffffff8185980f>] ret_from_fork+0x3f/0x70 Fix it by updating add_timer_on() to perform proper migration as __mod_timer() does. Reported-and-tested-by: Jeff Layton <[email protected]> Signed-off-by: Tejun Heo <[email protected]> Cc: Chris Worley <[email protected]> Cc: [email protected] Cc: Michael Skralivetsky <[email protected]> Cc: Trond Myklebust <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Jeff Layton <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-03Merge branch 'timers-core-for-linus' of ↵Linus Torvalds8-45/+78
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "The timer departement provides: - More y2038 work in the area of ntp and pps. - Optimization of posix cpu timers - New time related selftests - Some new clocksource drivers - The usual pile of fixes, cleanups and improvements" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) timeconst: Update path in comment timers/x86/hpet: Type adjustments clocksource/drivers/armada-370-xp: Implement ARM delay timer clocksource/drivers/tango_xtal: Add new timer for Tango SoCs clocksource/drivers/imx: Allow timer irq affinity change clocksource/drivers/exynos_mct: Use container_of() instead of this_cpu_ptr() clocksource/drivers/h8300_*: Remove unneeded memset()s clocksource/drivers/sh_cmt: Remove unneeded memset() in sh_cmt_setup() clocksource/drivers/em_sti: Remove unneeded memset()s clocksource/drivers/mediatek: Use GPT as sched clock source clockevents/drivers/mtk: Fix spurious interrupt leading to crash posix_cpu_timer: Reduce unnecessary sighand lock contention posix_cpu_timer: Convert cputimer->running to bool posix_cpu_timer: Check thread timers only when there are active thread timers posix_cpu_timer: Optimize fastpath_timer_check() timers, kselftest: Add 'adjtick' test to validate adjtimex() tick adjustments timers: Use __fls in apply_slack() clocksource: Remove return statement from void functions net: sfc: avoid using timespec ntp/pps: use y2038 safe types in pps_event_time ...
2015-10-26timeconst: Update path in commentJason A. Donenfeld1-1/+1
Signed-off-by: Jason A. Donenfeld <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-20Merge branch 'fortglx/4.4/time' of ↵Thomas Gleixner3-16/+16
https://git.linaro.org/people/john.stultz/linux into timers/core Time updates from John Stultz: - More 2038 work from Arnd Bergmann around ntp and pps
2015-10-16timekeeping: Increment clock_was_set_seq in timekeeping_init()Thomas Gleixner1-1/+1
timekeeping_init() can set the wall time offset, so we need to increment the clock_was_set_seq counter. That way hrtimers will pick up the early offset immediately. Otherwise on a machine which does not set wall time later in the boot process the hrtimer offset is stale at 0 and wall time timers are going to expire with a delay of 45 years. Fixes: 868a3e915f7f "hrtimer: Make offset update smarter" Reported-and-tested-by: Heiko Carstens <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Stefan Liebler <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: John Stultz <[email protected]>
2015-10-15posix_cpu_timer: Reduce unnecessary sighand lock contentionJason Low1-2/+24
It was found while running a database workload on large systems that significant time was spent trying to acquire the sighand lock. The issue was that whenever an itimer expired, many threads ended up simultaneously trying to send the signal. Most of the time, nothing happened after acquiring the sighand lock because another thread had just already sent the signal and updated the "next expire" time. The fastpath_timer_check() didn't help much since the "next expire" time was updated after the threads exit fastpath_timer_check(). This patch addresses this by having the thread_group_cputimer structure maintain a boolean to signify when a thread in the group is already checking for process wide timers, and adds extra logic in the fastpath to check the boolean. Signed-off-by: Jason Low <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Reviewed-by: George Spelvin <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-15posix_cpu_timer: Convert cputimer->running to boolJason Low1-2/+2
In the next patch in this series, a new field 'checking_timer' will be added to 'struct thread_group_cputimer'. Both this and the existing 'running' integer field are just used as boolean values. To save space in the structure, we can make both of these fields booleans. This is a preparatory patch to convert the existing running integer field to a boolean. Suggested-by: George Spelvin <[email protected]> Signed-off-by: Jason Low <[email protected]> Reviewed: George Spelvin <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-15posix_cpu_timer: Check thread timers only when there are active thread timersJason Low1-6/+16
The fastpath_timer_check() contains logic to check for if any timers are set by checking if !task_cputime_zero(). Similarly, we can do this before calling check_thread_timers(). In the case where there are only process-wide timers, this will skip all of the computations for per-thread timers when there are no per-thread timers. As suggested by George, we can put the task_cputime_zero() check in check_thread_timers(), since that is more of an optization to the function. Similarly, we move the existing check of cputimer->running to check_process_timers(). Signed-off-by: Jason Low <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Reviewed-by: George Spelvin <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-15posix_cpu_timer: Optimize fastpath_timer_check()Jason Low1-8/+3
In fastpath_timer_check(), the task_cputime() function is always called to compute the utime and stime values. However, this is not necessary if there are no per-thread timers to check for. This patch modifies the code such that we compute the task_cputime values only when there are per-thread timers set. Signed-off-by: Jason Low <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: George Spelvin <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-12Merge tag 'v4.3-rc5' into timers/core, to pick up fixes before applying new ↵Ingo Molnar1-1/+1
changes Signed-off-by: Ingo Molnar <[email protected]>
2015-10-11timers: Use __fls in apply_slack()Rasmus Villemoes1-1/+1
In apply_slack(), find_last_bit() is applied to a bitmask consisting of precisely BITS_PER_LONG bits. Since mask is non-zero, we might as well eliminate the function call and use __fls() directly. On x86_64, this shaves 23 bytes of the only caller, mod_timer(). This also gets rid of Coverity CID 1192106, but that is a false positive: Coverity is not aware that mask != 0 implies that find_last_bit will not return BITS_PER_LONG. Signed-off-by: Rasmus Villemoes <[email protected]> Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-11clocksource: Remove return statement from void functionsGuillaume Gomez1-3/+2
Signed-off-by: Guillaume Gomez <[email protected]> Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/CAAOQCfSDgmqSWDBsetau%2ByF8x0%[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-03Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "An abs64() fix in the watchdog driver, and two clocksource driver NO_IRQ assumption fixes" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Fix abs() usage w/ 64bit values clocksource/drivers/keystone: Fix bad NO_IRQ usage clocksource/drivers/rockchip: Fix bad NO_IRQ usage
2015-10-02clocksource: Fix abs() usage w/ 64bit valuesJohn Stultz1-1/+1
This patch fixes one cases where abs() was being used with 64-bit nanosecond values, where the result may be capped at 32-bits. This potentially could cause watchdog false negatives on 32-bit systems, so this patch addresses the issue by using abs64(). Signed-off-by: John Stultz <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-10-01ntp: use timespec64 in sync_cmos_clockArnd Bergmann1-2/+2
The sync_cmos_clock has one use of struct timespec, which we want to eventually replace with timespec64 or similar in the kernel. There is no way this one can overflow, but the conversion to timespec64 is trivial and has no other dependencies. Acked-by: Richard Cochran <[email protected]> Acked-by: David S. Miller <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-10-01ntp/pps: replace getnstime_raw_and_real with 64-bit versionArnd Bergmann1-6/+6
There is exactly one caller of getnstime_raw_and_real in the kernel, which is the pps_get_ts function. This changes the caller and the implementation to work on timespec64 types rather than timespec, to avoid the time_t overflow on 32-bit architectures. For consistency with the other new functions (ktime_get_seconds, ktime_get_real_*, ...), I'm renaming the function to ktime_get_raw_and_real_ts64. We still need to convert from the internal 64-bit type to 32 bit types in the caller, but this conversion is now pushed out from getnstime_raw_and_real to pps_get_ts. A follow-up patch changes the remaining pps code to completely avoid the conversion. Acked-by: Richard Cochran <[email protected]> Acked-by: David S. Miller <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-10-01ntp/pps: use timespec64 for hardpps()Arnd Bergmann3-8/+8
There is only one user of the hardpps function in the kernel, so it makes sense to atomically change it over to using 64-bit timestamps for y2038 safety. In the hardpps implementation, we also need to change the pps_normtime structure, which is similar to struct timespec and also requires a 64-bit seconds portion. This introduces two temporary variables in pps_kc_event() to do the conversion, they will be removed again in the next step, which seemed preferable to having a larger patch changing it all at the same time. Acked-by: Richard Cochran <[email protected]> Acked-by: David S. Miller <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-09-22timers: Fix data race in timer_stats_account_timer()Dmitry Vyukov1-2/+9
timer_stats_account_timer() reads timer->start_site, then checks it for NULL and then re-reads it again, while timer_stats_timer_clear_start_info() can concurrently reset timer->start_site to NULL. This should not lead to crashes, but can double number of entries in timer stats as start_site is used during comparison, the doubled entries will have unuseful NULL start_site. Read timer->start_site only once in timer_stats_account_timer(). The data race was found with KernelThreadSanitizer (KTSAN). Signed-off-by: Dmitry Vyukov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-09-22time: Fix spelling in commentsZhen Lei3-4/+4
Signed-off-by: Zhen Lei <[email protected]> Cc: Hanjun Guo <[email protected]> Cc: John Stultz <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tianhong Ding <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Xinwei Hu <[email protected]> Cc: Xunlei Pang <[email protected]> Cc: Zefan Li <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Fixed yet another typo in one of the sentences fixed. ] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-17Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds4-73/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "A fix for an abs()/abs64() bug that caused too slow NTP convergence on 32-bit kernels, plus a removal of an obsolete clockevents driver facility after all users got converted during the merge window" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clockevents: Remove unused set_mode() callback time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64()
2015-09-14clockevents: Remove unused set_mode() callbackViresh Kumar3-72/+25
All users are migrated to the per-state callbacks, get rid of the unused interface and the core support code. Signed-off-by: Viresh Kumar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: John Stultz <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/fd60de14cf6d125489c031207567bb255ad946f6.1441943991.git.viresh.kumar@linaro.org Signed-off-by: Ingo Molnar <[email protected]>
2015-09-13time: Fix timekeeping_freqadjust()'s incorrect use of abs() instead of abs64()John Stultz1-1/+1
The internal clocksteering done for fine-grained error correction uses a logarithmic approximation, so any time adjtimex() adjusts the clock steering, timekeeping_freqadjust() quickly approximates the correct clock frequency over a series of ticks. Unfortunately, the logic in timekeeping_freqadjust(), introduced in commit: dc491596f639 ("timekeeping: Rework frequency adjustments to work better w/ nohz") used the abs() function with a s64 error value to calculate the size of the approximated adjustment to be made. Per include/linux/kernel.h: "abs() should not be used for 64-bit types (s64, u64, long long) - use abs64()". Thus on 32-bit platforms, this resulted in the clocksteering to take a quite dampended random walk trying to converge on the proper frequency, which caused the adjustments to be made much slower then intended (most easily observed when large adjustments are made). This patch fixes the issue by using abs64() instead. Reported-by: Nuno Gonçalves <[email protected]> Tested-by: Nuno Goncalves <[email protected]> Signed-off-by: John Stultz <[email protected]> Cc: <[email protected]> # v3.17+ Cc: Linus Torvalds <[email protected]> Cc: Miroslav Lichvar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-02nohz: Assert existing housekeepers when nohz full enabledFrederic Weisbecker1-4/+11
The code ensures that when nohz full is running, at least the boot CPU serves as a housekeeper and it can't be later offlined. Let's assert this assumption to make sure that we have CPUs to handle unbound jobs like workqueues and timers while nohz full CPUs run undisturbed. Also improve the comments on housekeeper offlining prevention. Signed-off-by: Frederic Weisbecker <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Preeti U Murthy <[email protected]> Cc: Vatika Harlalka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-01Merge branch 'timers-core-for-linus' of ↵Linus Torvalds7-75/+82
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner: "Rather large, but nothing exiting: - new range check for settimeofday() to prevent that boot time becomes negative. - fix for file time rounding - a few simplifications of the hrtimer code - fix for the proc/timerlist code so the output of clock realtime timers is accurate - more y2038 work - tree wide conversion of clockevent drivers to the new callbacks" * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits) hrtimer: Handle failure of tick_init_highres() gracefully hrtimer: Unconfuse switch_hrtimer_base() a bit hrtimer: Simplify get_target_base() by returning current base hrtimer: Drop return code of hrtimer_switch_to_hres() time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64() time: Introduce current_kernel_time64() time: Introduce struct itimerspec64 time: Add the common weak version of update_persistent_clock() time: Always make sure wall_to_monotonic isn't positive time: Fix nanosecond file time rounding in timespec_trunc() timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers cris/time: Migrate to new 'set-state' interface kernel: broadcast-hrtimer: Migrate to new 'set-state' interface xtensa/time: Migrate to new 'set-state' interface unicore/time: Migrate to new 'set-state' interface um/time: Migrate to new 'set-state' interface sparc/time: Migrate to new 'set-state' interface sh/localtimer: Migrate to new 'set-state' interface score/time: Migrate to new 'set-state' interface s390/time: Migrate to new 'set-state' interface ...
2015-08-31Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds2-48/+34
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull NOHZ updates from Ingo Molnar: "The main changes, mostly written by Frederic Weisbecker, include: - Fix some jiffies based cputime assumptions. (No real harm because the concerned code isn't used by full dynticks.) - Simplify jiffies <-> usecs conversions. Remove dead code. - Remove early hacks on nohz full code that avoided messing up idle nohz internals. Now nohz integrates well full and idle and such hack have become needless. - Restart nohz full tick from irq exit. (A simplification and a preparation for future optimization on scheduler kick to nohz full) - Code cleanups. - Tile driver isolation enhancement on top of nohz. (Chris Metcalf)" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: nohz: Remove useless argument on tick_nohz_task_switch() nohz: Move tick_nohz_restart_sched_tick() above its users nohz: Restart nohz full tick from irq exit nohz: Remove idle task special case nohz: Prevent tilegx network driver interrupts alpha: Fix jiffies based cputime assumption apm32: Fix cputime == jiffies assumption jiffies: Remove HZ > USEC_PER_SEC special case
2015-08-31Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU changes in this cycle are: - the combination of tree geometry-initialization simplifications and OS-jitter-reduction changes to expedited grace periods. These two are stacked due to the large number of conflicts that would otherwise result. - privatize smp_mb__after_unlock_lock(). This commit moves the definition of smp_mb__after_unlock_lock() to kernel/rcu/tree.h, in recognition of the fact that RCU is the only thing using this, that nothing else is likely to use it, and that it is likely to go away completely. - documentation updates. - torture-test updates. - misc fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) rcu,locking: Privatize smp_mb__after_unlock_lock() rcu: Silence lockdep false positive for expedited grace periods rcu: Don't disable CPU hotplug during OOM notifiers scripts: Make checkpatch.pl warn on expedited RCU grace periods rcu: Update MAINTAINERS entry rcu: Clarify CONFIG_RCU_EQS_DEBUG help text rcu: Fix backwards RCU_LOCKDEP_WARN() in synchronize_rcu_tasks() rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN() rcu: Make rcu_is_watching() really notrace cpu: Wait for RCU grace periods concurrently rcu: Create a synchronize_rcu_mult() rcu: Fix obsolete priority-boosting comment rcu: Use WRITE_ONCE in RCU_INIT_POINTER rcu: Hide RCU_NOCB_CPU behind RCU_EXPERT rcu: Add RCU-sched flavors of get-state and cond-sync rcu: Add fastpath bypassing funnel locking rcu: Rename RCU_GP_DONE_FQS to RCU_GP_DOING_FQS rcu: Pull out wait_event*() condition into helper function documentation: Describe new expedited stall warnings rcu: Add stall warnings to synchronize_sched_expedited() ...
2015-08-22hrtimer: Handle failure of tick_init_highres() gracefullyGuenter Roeck1-0/+1
Commit 75e3b37d0598 ("hrtimer: Drop return code of hrtimer_switch_to_hres()") drops the return code of hrtimer_switch_to_hres(). While doing so, it also drops the return statement itself on failure. This may cause a system hang. Seen when running arm:multi_v7_defconfig in qemu with devicetree file vexpress-v2p-ca9. Fixes: 75e3b37d0598 ("hrtimer: Drop return code of hrtimer_switch_to_hres()") Cc: Luiz Capitulino <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-20Merge branch 'fortglx/4.3/time' of ↵Thomas Gleixner4-29/+40
https://git.linaro.org/people/john.stultz/linux into timers/core - A handful or y2038 related items - A walltime to monotonic limit - Small fixes for timespec_trunc() and timer_list output
2015-08-18hrtimer: Unconfuse switch_hrtimer_base() a bitFrederic Weisbecker1-8/+17
The variable called "this_base" is confusing because its name suggests it's of "struct hrtimer_clock_base" type, along with "base" and "new_base" which doesn't help understanding this complicated function. Make its name clearer and fix the misleading comment while at it. [ tglx: Fixed the comment for real ] Signed-off-by: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-18hrtimer: Simplify get_target_base() by returning current baseFrederic Weisbecker1-2/+2
Instead of fetching again the current cpu base, just take it from the parameter. Signed-off-by: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-18timer: Write timer->flags atomicallyEric Dumazet1-2/+2
lock_timer_base() cannot prevent the following : CPU1 ( in __mod_timer() timer->flags |= TIMER_MIGRATING; spin_unlock(&base->lock); base = new_base; spin_lock(&base->lock); // The next line clears TIMER_MIGRATING timer->flags &= ~TIMER_BASEMASK; CPU2 (in lock_timer_base()) see timer base is cpu0 base spin_lock_irqsave(&base->lock, *flags); if (timer->flags == tf) return base; // oops, wrong base timer->flags |= base->cpu // too late We must write timer->flags in one go, otherwise we can fool other cpus. Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled") Signed-off-by: Eric Dumazet <[email protected]> Cc: Jon Christopherson <[email protected]> Cc: David Miller <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Sander Eikelenboom <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Thomas Gleixner <[email protected]>
2015-08-17hrtimer: Drop return code of hrtimer_switch_to_hres()Luiz Capitulino1-4/+2
It's not checked by the caller. Signed-off-by: Luiz Capitulino <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-17time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()Baolin Wang1-8/+13
The conversion between struct timespec and jiffies is not year 2038 safe on 32bit systems. Introduce timespec64_to_jiffies() and jiffies_to_timespec64() functions which use struct timespec64 to make it ready for 2038 issue. Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-17time: Introduce current_kernel_time64()Baolin Wang1-3/+3
The current_kernel_time() is not year 2038 safe on 32bit systems since it returns a timespec value. Introduce current_kernel_time64() which returns a timespec64 value. Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-17time: Add the common weak version of update_persistent_clock()Xunlei Pang1-0/+5
The weak update_persistent_clock64() calls update_persistent_clock(), if the architecture defines an update_persistent_clock64() to replace and remove its update_persistent_clock() version, when building the kernel the linker will throw an undefined symbol error, that is, any arch that switches to update_persistent_clock64() will have this issue. To solve the issue, we add the common weak update_persistent_clock(). Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Xunlei Pang <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-17time: Always make sure wall_to_monotonic isn't positiveWang YanQing1-3/+10
Two issues were found on an IMX6 development board without an enabled RTC device(resulting in the boot time and monotonic time being initialized to 0). Issue 1:exportfs -a generate: "exportfs: /opt/nfs/arm does not support NFS export" Issue 2:cat /proc/stat: "btime 4294967236" The same issues can be reproduced on x86 after running the following code: int main(void) { struct timeval val; int ret; val.tv_sec = 0; val.tv_usec = 0; ret = settimeofday(&val, NULL); return 0; } Two issues are different symptoms of same problem: The reason is a positive wall_to_monotonic pushes boot time back to the time before Epoch, and getboottime will return negative value. In symptom 1: negative boot time cause get_expiry() to overflow time_t when input expire time is 2147483647, then cache_flush() always clears entries just added in ip_map_parse. In symptom 2: show_stat() uses "unsigned long" to print negative btime value returned by getboottime. This patch fix the problem by prohibiting time from being set to a value which would cause a negative boot time. As a result one can't set the CLOCK_REALTIME time prior to (1970 + system uptime). Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Wang YanQing <[email protected]> [jstultz: reworded commit message] Signed-off-by: John Stultz <[email protected]>
2015-08-17time: Fix nanosecond file time rounding in timespec_trunc()Karsten Blees1-14/+8
timespec_trunc() avoids rounding if granularity <= nanoseconds-per-jiffie (or TICK_NSEC). This optimization assumes that: 1. current_kernel_time().tv_nsec is already rounded to TICK_NSEC (i.e. with HZ=1000 you'd get 1000000, 2000000, 3000000... but never 1000001). This is no longer true (probably since hrtimers introduced in 2.6.16). 2. TICK_NSEC is evenly divisible by all possible granularities. This may be true for HZ=100, 250, 1000, but obviously not for HZ=300 / TICK_NSEC=3333333 (introduced in 2.6.20). Thus, sub-second portions of in-core file times are not rounded to on-disk granularity. I.e. file times may change when the inode is re-read from disk or when the file system is remounted. This affects all file systems with file time granularities > 1 ns and < 1s, e.g. CEPH (1000 ns), UDF (1000 ns), CIFS (100 ns), NTFS (100 ns) and FUSE (configurable from user mode via struct fuse_init_out.time_gran). Steps to reproduce with e.g. UDF: $ dd if=/dev/zero of=udfdisk count=10000 && mkudffs udfdisk $ mkdir udf && mount udfdisk udf $ touch udf/test && stat -c %y udf/test 2015-06-09 10:22:56.130006767 +0200 $ umount udf && mount udfdisk udf $ stat -c %y udf/test 2015-06-09 10:22:56.130006000 +0200 Remounting truncates the mtime to 1 µs. Fix the rounding in timespec_trunc() and update the documentation. timespec_trunc() is exclusively used to calculate inode's [acm]time (mostly via current_fs_time()), and always with super_block.s_time_gran as second argument. So this can safely be changed without side effects. Note: This does _not_ fix the issue for FAT's 2 second mtime resolution, as super_block.s_time_gran isn't prepared to handle different ctime / mtime / atime resolutions nor resolutions > 1 second. Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Karsten Blees <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-17timer_list: Add the base offset so remaining nsecs are accurate for non ↵John Stultz1-1/+1
monotonic timers I noticed for non-monotonic timers in timer_list, some of the output looked a little confusing. For example: #1: <0000000000000000>, posix_timer_fn, S:01, hrtimer_start_range_ns, leap-a-day/2360 # expires at 1434412800000000000-1434412800000000000 nsecs [in 1434410725062375469 to 1434410725062375469 nsecs] You'll note the relative time till the expiration "[in xxx to yyy nsecs]" is incorrect. This is because its printing the delta between CLOCK_MONOTONIC time to the CLOCK_REALTIME expiration. This patch fixes this issue by adding the clock offset to the "now" time which we use to calculate the delta. Cc: Prarit Bhargava <[email protected]> Cc: Daniel Bristot de Oliveira <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Jan Kara <[email protected]> Cc: Jiri Bohac <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-12Merge branch 'for-mingo' of ↵Ingo Molnar1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - The combination of tree geometry-initialization simplifications and OS-jitter-reduction changes to expedited grace periods. These two are stacked due to the large number of conflicts that would otherwise result. [ With one addition, a temporary commit to silence a lockdep false positive. Additional changes to the expedited grace-period primitives (queued for 4.4) remove the cause of this false positive, and therefore include a revert of this temporary commit. ] - Documentation updates. - Torture-test updates. - Miscellaneous fixes. Signed-off-by: Ingo Molnar <[email protected]>
2015-08-10kernel: broadcast-hrtimer: Migrate to new 'set-state' interfaceViresh Kumar1-29/+20
Migrate broadcast-hrtimer driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. Cc: Thomas Gleixner <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]>