aboutsummaryrefslogtreecommitdiff
path: root/kernel/hrtimer.c
AgeCommit message (Collapse)AuthorFilesLines
2008-05-03hrtimer: remove duplicate helper functionOliver Hartkopp1-9/+0
The helper function hrtimer_callback_running() is used in kernel/hrtimer.c as well as in the updated net/can/bcm.c which now supports hrtimers. Moving the helper function to hrtimer.h removes the duplicate definition in the C-files. Signed-off-by: Oliver Hartkopp <[email protected]> Cc: David Miller <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-04-30add hrtimer specific debugobjects codeThomas Gleixner1-20/+157
hrtimers have now dynamic users in the network code. Put them under debugobjects surveillance as well. Add calls to the generic object debugging infrastructure and provide fixup functions which allow to keep the system alive when recoverable problems have been detected by the object debugging core code. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Greg KH <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Kay Sievers <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-28hrtimer: raise softirq unlocked to avoid circular lock dependencyThomas Gleixner1-2/+17
The scheduler hrtimer bits in 2.6.25 introduced a circular lock dependency in a rare code path: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.25-sched-devel.git-x86-latest.git #19 ------------------------------------------------------- X/2980 is trying to acquire lock: (&rq->rq_lock_key#2){++..}, at: [<ffffffff80230146>] task_rq_lock+0x56/0xa0 but task is already holding lock: (&cpu_base->lock){++..}, at: [<ffffffff80257ae1>] lock_hrtimer_base+0x31/0x60 which lock already depends on the new lock. The scenario which leads to this is: posix-timer signal is delivered -> posix-timer is rearmed timer is already expired in hrtimer_enqueue() -> softirq is raised To prevent this we need to move the raise of the softirq out of the base->lock protected code path. Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Acked-by: Peter Zijlstra <[email protected]>
2008-04-27hrtimer: timeout too long when using HRTIMER_CB_SOFTIRQBodo Stroesser1-2/+13
When using hrtimer with timer->cb_mode == HRTIMER_CB_SOFTIRQ in some cases the clockevent is not programmed. This happens, if: - a timer is rearmed while it's state is HRTIMER_STATE_CALLBACK - hrtimer_reprogram() returns -ETIME, when it is called after CALLBACK is finished. This occurs if the new timer->expires is in the past when CALLBACK is done. In this case, the timer needs to be removed from the tree and put onto the pending list again. The patch is against 2.6.22.5, but AFAICS, it is relevant for 2.6.25 also (in run_hrtimer_pending()). Signed-off-by: Bodo Stroesser <[email protected]> Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2008-04-21hrtimer: optimize the softirq time optimizationThomas Gleixner1-4/+3
The previous optimization did not take the case into account where a clock provides its own softirq_get_time() function. Check for the availablitiy of the clock get time function first and then check if we need to retrieve the time for both clocks via hrtimer_softirq_gettime() to avoid a double evaluation of time in that case as well. Signed-off-by: Thomas Gleixner <[email protected]>
2008-04-21hrtimer: reduce calls to hrtimer_get_softirq_time()Dimitri Sivanich1-32/+32
It seems that hrtimer_run_queues() is calling hrtimer_get_softirq_time() more often than it needs to. This can cause frequent contention on systems with large numbers of processors/cores. With this patch, hrtimer_run_queues only calls hrtimer_get_softirq_time() if there is a pending timer in one of the hrtimer bases, and only once. This also combines hrtimer_run_queues() and the inline run_hrtimer_queue() into one function. [ [email protected]: coding style ] Signed-off-by: Dimitri Sivanich <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-04-17hrtimers: simplify lockdep handlingOleg Nesterov1-5/+4
In order to avoid the false positive from lockdep, each per-cpu base->lock has the separate lock class and migrate_hrtimers() uses double_spin_lock(). This is overcomplicated: except for migrate_hrtimers() we never take 2 locks at once, and migrate_hrtimers() can use spin_lock_nested(). Signed-off-by: Oleg Nesterov <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-04-17hrtimer: use nanosleep specific restart_block fieldsThomas Gleixner1-7/+6
Convert all the nanosleep related users of restart_block to the new nanosleep specific restart_block fields. Signed-off-by: Thomas Gleixner <[email protected]>
2008-02-14hrtimer: catch expired CLOCK_REALTIME timers earlyThomas Gleixner1-0/+11
A CLOCK_REALTIME timer, which has an absolute expiry time less than the clock realtime offset calls with a negative delta into the clock events code and triggers the WARN_ON() there. This is a false positive and needs to be prevented. Check the result of timer->expires - timer->base->offset right away and return -ETIME right away. Thanks to Frans Pop, who reported the problem and tested the fixes. Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Frans Pop <[email protected]>
2008-02-14hrtimer: check relative timeouts for overflowThomas Gleixner1-17/+20
Various user space callers ask for relative timeouts. While we fixed that overflow issue in hrtimer_start(), the sites which convert relative user space values to absolute timeouts themself were uncovered. Instead of putting overflow checks into each place add a function which does the sanity checking and convert all affected callers to use it. Thanks to Frans Pop, who reported the problem and tested the fixes. Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Ingo Molnar <[email protected]> Tested-by: Frans Pop <[email protected]>
2008-02-10hrtimer: don't modify restart_block->fn in restart functionsOleg Nesterov1-4/+0
hrtimer_nanosleep_restart() clears/restores restart_block->fn. This is pointless and complicates its usage. Note that if sys_restart_syscall() doesn't actually happen, we have a bogus "pending" restart->fn anyway, this is harmless. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Pavel Emelyanov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Toyo Abe <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-02-10hrtimer: fix *rmtp handling in hrtimer_nanosleep()Oleg Nesterov1-24/+27
Spotted by Pavel Emelyanov and Alexey Dobriyan. hrtimer_nanosleep() sets restart_block->arg1 = rmtp, but this rmtp points to the local variable which lives in the caller's stack frame. This means that if sys_restart_syscall() actually happens and it is interrupted as well, we don't update the user-space variable, but write into the already dead stack frame. Introduced by commit 04c227140fed77587432667a574b14736a06dd7f hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easier Change the callers to pass "__user *rmtp" to hrtimer_nanosleep(), and change hrtimer_nanosleep() to use copy_to_user() to actually update *rmtp. Small problem remains. man 2 nanosleep states that *rtmp should be written if nanosleep() was interrupted (it says nothing whether it is OK to update *rmtp if nanosleep returns 0), but (with or without this patch) we can dirty *rem even if nanosleep() returns 0. NOTE: this patch doesn't change compat_sys_nanosleep(), because it has other bugs. Fixed by the next patch. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Pavel Emelyanov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Toyo Abe <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> include/linux/hrtimer.h | 2 - kernel/hrtimer.c | 51 +++++++++++++++++++++++++----------------------- kernel/posix-timers.c | 14 +------------ 3 files changed, 30 insertions(+), 37 deletions(-)
2008-02-05timerfd: new timerfd APIDavide Libenzi1-5/+4
This is the new timerfd API as it is implemented by the following patch: int timerfd_create(int clockid, int flags); int timerfd_settime(int ufd, int flags, const struct itimerspec *utmr, struct itimerspec *otmr); int timerfd_gettime(int ufd, struct itimerspec *otmr); The timerfd_create() API creates an un-programmed timerfd fd. The "clockid" parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME. The timerfd_settime() API give new settings by the timerfd fd, by optionally retrieving the previous expiration time (in case the "otmr" parameter is not NULL). The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit is set in the "flags" parameter. Otherwise it's a relative time. The timerfd_gettime() API returns the next expiration time of the timer, or {0, 0} if the timerfd has not been set yet. Like the previous timerfd API implementation, read(2) and poll(2) are supported (with the same interface). Here's a simple test program I used to exercise the new timerfd APIs: http://www.xmailserver.org/timerfd-test2.c [[email protected]: coding-style cleanups] [[email protected]: fix ia64 build] [[email protected]: fix m68k build] [[email protected]: fix mips build] [[email protected]: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds] [[email protected]: fix s390] [[email protected]: fix powerpc build] [[email protected]: fix sparc64 more] Signed-off-by: Davide Libenzi <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Davide Libenzi <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Martin Schwidefsky <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Davide Libenzi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-02-01hrtimer: fix hrtimer_init_sleeper() usersPeter Zijlstra1-0/+2
this patch: commit 37bb6cb4097e29ffee970065b74499cbf10603a3 Author: Peter Zijlstra <[email protected]> Date: Fri Jan 25 21:08:32 2008 +0100 hrtimer: unlock hrtimer_wakeup Broke hrtimer_init_sleeper() users. It forgot to fix up the futex caller of this function to detect the failed queueing and messed up the do_nanosleep() caller in that it could leak a TASK_INTERRUPTIBLE state. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-01-25hrtimer: unlock hrtimer_wakeupPeter Zijlstra1-1/+3
hrtimer_wakeup creates a base->lock rq->lock lock dependancy. Avoid this by switching to HRTIMER_CB_IRQSAFE_NO_SOFTIRQ which doesn't hold base->lock. This fully untangles hrtimer locks from the scheduler locks, and allows hrtimer usage in the scheduler proper. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-01-25hrtimer: fixup the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallbackPeter Zijlstra1-131/+139
Currently all highres=off timers are run from softirq context, but HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers expect to run from irq context. Fix this up by splitting it similar to the highres=on case. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-01-25hrtimer: clean up cpu->base locking tricksPeter Zijlstra1-1/+19
In order to more easily allow for the scheduler to use timers, clean up the locking a bit. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-01-21hrtimer: fix section mismatchRandy Dunlap1-1/+1
Fix section mismatch in hrtimer.c: WARNING: vmlinux.o(.text+0x50c61): Section mismatch: reference to .init.text: (between 'hrtimer_cpu_notify' and 'down_read_trylock') Noticed by Johannes Berg and confirmed by Sam Ravnborg. Signed-off-by: Randy Dunlap <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <torvalds@[email protected]>
2007-12-07hrtimers: avoid overflow for large relative timeoutsThomas Gleixner1-0/+8
Relative hrtimers with a large timeout value might end up as negative timer values, when the current time is added in hrtimer_start(). This in turn is causing the clockevents_set_next() function to set an huge timeout and sleep for quite a long time when we have a clock source which is capable of long sleeps like HPET. With PIT this almost goes unnoticed as the maximum delta is ~27ms. The non-hrt/nohz code sorts this out in the next timer interrupt, so we never noticed that problem which has been there since the first day of hrtimers. This bug became more apparent in 2.6.24 which activates HPET on more hardware. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2007-10-29Quieten hrtimer printk: "Switched to high resolution mode .."Michael Ellerman1-1/+1
Change the hrtimer printk "Switched to high resolution mode .." to be KERN_DEBUG, rather than KERN_INFO. If users need to see this they can pass "loglevel" or "debug" on the command line, or check dmesg. Signed-off-by: Michael Ellerman <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> kernel/hrtimer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
2007-10-20fix comment: unlock_hrtimer_base is the counterpart of lock_hrtimer_baseUwe Kleine-König1-1/+1
Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Adrian Bunk <[email protected]>
2007-10-19Fix misspellings of "system", "controller", "interrupt" and "necessary".Robert P. J. Day1-1/+1
Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <[email protected]> Signed-off-by: Adrian Bunk <[email protected]>
2007-10-18hrtimer: Rework hrtimer_nanosleep to make sys_compat_nanosleep easierAnton Blanchard1-13/+16
Pull the copy_to_user out of hrtimer_nanosleep and into the callers (common_nsleep, sys_nanosleep) in preparation for converting compat_sys_nanosleep to use hrtimers. Signed-off-by: Anton Blanchard <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2007-10-10[KTIME]: Introduce ktime_sub_ns and ktime_sub_usArnaldo Carvalho de Melo1-0/+24
First user will be the DCCP transport networking protocol. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-07-25Cache xtime every call to update_wall_timejohn stultz1-4/+0
This avoids xtime lag seen with dynticks, because while 'xtime' itself is still not updated often, we keep a 'xtime_cache' variable around that contains the approximate real-time that _is_ updated each time we do a 'update_wall_time()', and is thus never off by more than one tick. IOW, this restores the original semantics for 'xtime' users, as long as you use the proper abstraction functions (ie 'current_kernel_time()' or 'get_seconds()' depending on whether you want a timespec or just the seconds field). [ Updated Patch. As penance for my sins I've also yanked another #ifdef that was added to avoid the xtime lag w/ hrtimers. ] Signed-off-by: John Stultz <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-25Cleanup non-arch xtime uses, use get_seconds() or current_kernel_time().john stultz1-1/+1
This avoids use of the kernel-internal "xtime" variable directly outside of the actual time-related functions. Instead, use the helper functions that we already have available to us. This doesn't actually change any behaviour, but this will allow us to fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ (because much of the realtime information is maintained as separate offsets to 'xtime'), which has caused interfaces that use xtime directly to get a time that is out of sync with the real-time clock by up to a third of a second or so. Signed-off-by: John Stultz <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-21hrtimer: speedup hrtimer_enqueueIngo Molnar1-4/+6
Speedup hrtimer_enqueue by evaluating the rbtree insertion result. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-21highres: improve debug outputIngo Molnar1-1/+4
Add some more debug information to the hrtimer and clock events code. Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-16[HRTIMER] Fix cpu pointer arg to clockevents_notify()David Miller1-1/+1
All of the clockevent notifiers expect a pointer to an "unsigned int" cpu argument, but hrtimer_cpu_notify() passes in a pointer to a long. [ Discussed with and ok by Thomas Gleixner ] Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-09Add suspend-related notifications for CPU hotplugRafael J. Wysocki1-0/+2
Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [[email protected]: cleanups] Signed-off-by: Rafael J. Wysocki <[email protected]> Cc: Gautham R Shenoy <[email protected]> Cc: Pavel Machek <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-08export hrtimer_forwardStas Sergeev1-0/+1
Other symbols of the hrtimers API are already exported. Signed-off-by: Stas Sergeev <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-04-27[NET]: Fix networking compilation errorsDavid Howells1-0/+2
Fix miscellaneous networking compilation errors. (*) Export ktime_add_ns() for modules. (*) wext_proc_init() should have an ANSI declaration. Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-04-25[NET_SCHED]: Use ktime as clocksourcePatrick McHardy1-0/+1
Get rid of the manual clock source selection mess and use ktime. Also use a scalar representation, which allows to clean up pkt_sched.h a bit more and results in less ktime_to_ns() calls in most cases. The PSCHED_US2JIFFIE/PSCHED_JIFFIE2US macros are implemented quite inefficient by this patch, following patches will convert all qdiscs to hrtimers and get rid of them entirely. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-04-07[PATCH] high-res timers: resume fixIngo Molnar1-0/+12
Soeren Sonnenburg reported that upon resume he is getting this backtrace: [<c0119637>] smp_apic_timer_interrupt+0x57/0x90 [<c0142d30>] retrigger_next_event+0x0/0xb0 [<c0104d30>] apic_timer_interrupt+0x28/0x30 [<c0142d30>] retrigger_next_event+0x0/0xb0 [<c0140068>] __kfifo_put+0x8/0x90 [<c0130fe5>] on_each_cpu+0x35/0x60 [<c0143538>] clock_was_set+0x18/0x20 [<c0135cdc>] timekeeping_resume+0x7c/0xa0 [<c02aabe1>] __sysdev_resume+0x11/0x80 [<c02ab0c7>] sysdev_resume+0x47/0x80 [<c02b0b05>] device_power_up+0x5/0x10 it turns out that on resume we mistakenly re-enable interrupts too early. Do the timer retrigger only on the current CPU. Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Soeren Sonnenburg <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-28[PATCH] hrtimers: fix reprogramming SMP raceIngo Molnar1-1/+6
hrtimer_start() incorrectly set the 'reprogram' flag to enqueue_hrtimer(), which should only be 1 if the hrtimer is queued to the current CPU. Doing otherwise could result in a reprogramming of the current CPU's clockevents device, with a timer that is not queued to it - resulting in a bogus next expiry value. Signed-off-by: Ingo Molnar <[email protected]> Cc: Michal Piotrowski <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-16[PATCH] hrtimer: fix up unlocked access to wall_to_monotonicThomas Gleixner1-2/+3
commit f4304ab21513b834c8fe3403927c60c2b81a72d7 (HZ free NTP) moved the access to wall_to_monotonic in hrtimer_get_softirq_time() out of the xtime_lock protection. Move it back into the seq_lock section. Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: John Stultz <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-16[PATCH] hrtimer: prevent overrun DoS in hrtimer_forward()Thomas Gleixner1-0/+6
hrtimer_forward() does not check for the possible overflow of timer->expires. This can happen on 64 bit machines with large interval values and results currently in an endless loop in the softirq because the expiry value becomes negative and therefor the timer is expired all the time. Check for this condition and set the expiry value to the max. expiry time in the future. The fix should be applied to stable kernel series as well. Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-06[PATCH] highres: do not run the TIMER_SOFTIRQ after switching to highres modeThomas Gleixner1-5/+10
The TIMER_SOFTIRQ runs the hrtimers during bootup until a usable clocksource and clock event sources are registered. The switch to high resolution mode happens inside of the TIMER_SOFTIRQ, but runs the softirq afterwards. That way the tick emulation timer, which was set up in the switch to highres might be executed in the softirq context, which is a BUG. The rbtree has not to be touched by the softirq after the highres switch. This BUG was observed by Andres Salomon, who provided the information to debug it. Return early from the softirq, when the switch was sucessful. [[email protected]: add debug warning] [[email protected]: make debug warning compile] Signed-off-by: Thomas Gleixner <[email protected]> Cc: Andres Salomon <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Andres Salomon <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-05[PATCH] timer/hrtimer: take per cpu locks in sane orderHeiko Carstens1-5/+4
Doing something like this on a two cpu system # echo 0 > /sys/devices/system/cpu/cpu0/online # echo 1 > /sys/devices/system/cpu/cpu0/online # echo 0 > /sys/devices/system/cpu/cpu1/online will give me this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.21-rc2-g562aa1d4-dirty #7 ------------------------------------------------------- bash/1282 is trying to acquire lock: (&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240 but task is already holding lock: (&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240 which lock already depends on the new lock. This happens because we have the following code in kernel/hrtimer.c: migrate_hrtimers(int cpu) [...] old_base = &per_cpu(hrtimer_bases, cpu); new_base = &get_cpu_var(hrtimer_bases); [...] spin_lock(&new_base->lock); spin_lock(&old_base->lock); Which means the spinlocks are taken in an order which depends on which cpu gets shut down from which other cpu. Therefore lockdep complains that there might be an ABBA deadlock. Since migrate_hrtimers() gets only called on cpu hotplug it's safe to assume that it isn't executed concurrently on a The same problem exists in kernel/timer.c: migrate_timers(). As pointed out by Christian Borntraeger one possible solution to avoid the locking order complaints would be to make sure that the locks are always taken in the same order. E.g. by taking the lock of the cpu with the lower number first. To achieve this we introduce two new spinlock functions double_spin_lock and double_spin_unlock which lock or unlock two locks in a given order. Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Roman Zippel <[email protected]> Cc: John Stultz <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Martin Schwidefsky <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] Add debugging feature /proc/timer_statIngo Molnar1-0/+26
Add /proc/timer_stats support: debugging feature to profile timer expiration. Both the starting site, process/PID and the expiration function is captured. This allows the quick identification of timer event sources in a system. Sample output: # echo 1 > /proc/timer_stats # cat /proc/timer_stats Timer Stats Version: v0.1 Sample period: 4.010 s 24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 11, 0 swapper sk_reset_timer (tcp_delack_timer) 6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 4, 2050 pcscd do_nanosleep (hrtimer_wakeup) 5, 4179 sshd sk_reset_timer (tcp_write_timer) 4, 2248 yum-updatesd schedule_timeout (process_timeout) 18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 3, 0 swapper sk_reset_timer (tcp_delack_timer) 1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer) 2, 1 swapper e1000_up (e1000_watchdog) 1, 1 init schedule_timeout (process_timeout) 100 total events, 25.24 events/sec [ cleanups and hrtimers support from Thomas Gleixner <[email protected]> ] [[email protected]: nr_entries can become static] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] hrtimers: add high resolution timer supportThomas Gleixner1-48/+520
Implement high resolution timers on top of the hrtimers infrastructure and the clockevents / tick-management framework. This provides accurate timers for all hrtimer subsystem users. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] tick-management: dyntick / highres functionalityThomas Gleixner1-6/+14
With Ingo Molnar <[email protected]> Add functions to provide dynamic ticks and high resolution timers. The code which keeps track of jiffies and handles the long idle periods is shared between tick based and high resolution timer based dynticks. The dyntick functionality can be disabled on the kernel commandline. Provide also the infrastructure to support high resolution timers. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] clockevents: add core functionalityThomas Gleixner1-10/+3
Architectures register their clock event devices, in the clock events core. Users of the clockevents core can get clock event devices for their use. The clockevents core code provides notification mechanisms for various clock related management events. This allows to control the clock event devices without the architectures having to worry about the details of function assignment. This is also a preliminary for high resolution timers and dynamic ticks to allow the core code to control the clock functionality without intrusive changes to the architecture code. [Fixes-by: Ingo Molnar <[email protected]>] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Roman Zippel <[email protected]> Cc: john stultz <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] hrtimers: clean up callback trackingThomas Gleixner1-9/+1
Reintroduce ktimers feature "optimized away" by the ktimers review process: remove the curr_timer pointer from the cpu-base and use the hrtimer state. No functional changes. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Roman Zippel <[email protected]> Cc: john stultz <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] hrtimers; add state trackingThomas Gleixner1-8/+32
Reintroduce ktimers feature "optimized away" by the ktimers review process: multiple hrtimer states to enable the running of hrtimers without holding the cpu-base-lock. (The "optimized" rbtree hack carried only 2 states worth of information and we need 4 for high resolution timers and dynamic ticks.) No functional changes. Build-fixes-from: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Roman Zippel <[email protected]> Cc: john stultz <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] hrtimers: cleanup lockingThomas Gleixner1-86/+98
Improve kernel/hrtimers.c locking: use a per-CPU base with a lock to control locking of all clocks belonging to a CPU. This simplifies code that needs to lock all clocks at once. This makes life easier for high-res timers and dyntick. No functional changes. [ optimization change from Andrew Morton <[email protected]> ] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] hrtimers: namespace and enum cleanupThomas Gleixner1-9/+9
- hrtimers did not use the hrtimer_restart enum and relied on the implict int representation. Fix the prototypes and the functions using the enums. - Use seperate name spaces for the enumerations - Convert hrtimer_restart macro to inline function - Add comments No functional changes. [[email protected]: fix input driver] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Cc: Dmitry Torokhov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] Extend next_timer_interrupt() to use a reference jiffieThomas Gleixner1-1/+1
For CONFIG_NO_HZ we need to calculate the next timer wheel event based on a given jiffie value. Extend the existing code to allow the extra 'now' argument. Provide a compability function for the existing implementations to call the function with now == jiffies. (This also solves the racyness of the original code vs. jiffies changing during the iteration.) No functional changes to existing users of this infrastructure. [ remove WARN_ON() that triggered on s390, by Carsten Otte <[email protected]> ] [ made new helper static, Adrian Bunk <[email protected]> ] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] GTOD: persistent clock supportJohn Stultz1-0/+8
Persistent clock support: do proper timekeeping across suspend/resume. [[email protected]: cleanup] Signed-off-by: John Stultz <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Roman Zippel <[email protected]> Cc: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16[PATCH] HZ free ntpjohn stultz1-3/+8
Distangle the NTP update from HZ. This is necessary for dynamic tick enabled kernels. Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>