path: root/kernel/timer.c
Age | Commit message | Author | Files | Lines
2008-04-17 | timers: simplify lockdep handling | Oleg Nesterov | 1 | -12/+4
In order to avoid the false positive from lockdep, each per-cpu base->lock has the separate lock class and migrate_timers() uses double_spin_lock(). This all is overcomplicated: except for migrate_timers() we never take 2 locks at once, and migrate_timers() can use spin_lock_nested(). Signed-off-by: Oleg Nesterov <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-03-26 | NOHZ: reevaluate idle sleep length after add_timer_on() | Thomas Gleixner | 1 | -1/+9
add_timer_on() can add a timer on a CPU which is currently in a long idle sleep, but the timer wheel is not reevaluated by the nohz code on that CPU. So a timer can be delayed for quite a long time. This triggered a false positive in the clocksource watchdog code. To avoid this we need to wake up the idle CPU and enforce the reevaluation of the timer wheel for the next timer event. Add a function, which checks a given CPU for idle state, marks the idle task with NEED_RESCHED and sends a reschedule IPI to notify the other CPU of the change in the timer wheel. Call this function from add_timer_on(). Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: [email protected] -- include/linux/sched.h | 6 ++++++ kernel/sched.c | 43 +++++++++++++++++++++++++++++++++++++++++++ kernel/timer.c | 10 +++++++++- 3 files changed, 58 insertions(+), 1 deletion(-)
2008-02-08 | kernel: remove fastcall in kernel/* | Harvey Harrison | 1 | -3/+3
[[email protected]: coding-style fixes] Signed-off-by: Harvey Harrison <[email protected]> Acked-by: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-02-08 | Pidns: make full use of xxx_vnr() calls | Pavel Emelyanov | 1 | -1/+1
Some time ago the xxx_vnr() calls (e.g. pid_vnr or find_task_by_vpid) were _all_ converted to operate on the current pid namespace. After this each call like xxx_nr_ns(foo, current->nsproxy->pid_ns) is nothing but a xxx_vnr(foo) one. Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where appropriate. Signed-off-by: Pavel Emelyanov <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Balbir Singh <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-02-06 | taskstats scaled time cleanup | Michael Neuling | 1 | -4/+6
This moves the ability to scale cputime into generic code. This allows us to fix the issue in kernel/timer.c (noticed by Balbir) where we could only add an unscaled value to the scaled utime/stime. This adds a cputime_to_scaled function. As before, the POWERPC version does the scaling based on the last SPURR/PURR ratio calculated. The generic and s390 (only other arch to implement asm/cputime.h) versions are both NOPs. Also moves the SPURR and PURR snapshots closer. Signed-off-by: Michael Neuling <[email protected]> Cc: Jay Lan <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Martin Schwidefsky <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-02-01 | Merge branch 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc | Linus Torvalds | 1 | -0/+7
* 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
  Remove commented-out code copied from NFS
  NFS: Switch from intr mount option to TASK_KILLABLE
  Add wait_for_completion_killable
  Add wait_event_killable
  Add schedule_timeout_killable
  Use mutex_lock_killable in vfs_readdir
  Add mutex_lock_killable
  Use lock_page_killable
  Add lock_page_killable
  Add fatal_signal_pending
  Add TASK_WAKEKILL
  exit: Use task_is_*
  signal: Use task_is_*
  sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
  ptrace: Use task_is_*
  power: Use task_is_*
  wait: Use TASK_NORMAL
  proc/base.c: Use task_is_*
  proc/array.c: Use TASK_REPORT
  perfmon: Use task_is_*
  ...
Fixed up conflicts in NFS/sunrpc manually..
2008-01-30 | time: timer cleanups | Pavel Machek | 1 | -1/+1
Small cleanups to tick-related code. Wrong preempt count is followed by BUG(), so it is hardly KERN_WARNING. Signed-off-by: Pavel Machek <[email protected]> Cc: john stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-01-30 | time: clean hungarian notation from timers | Pavel Machek | 1 | -41/+39
Clean up hungarian notation from timer code. Signed-off-by: Pavel Machek <[email protected]> Cc: john stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2008-01-25 | hrtimer: fixup the HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallback | Peter Zijlstra | 1 | -1/+2
Currently all highres=off timers are run from softirq context, but HRTIMER_CB_IRQSAFE_NO_SOFTIRQ timers expect to run from irq context. Fix this up by splitting it similar to the highres=on case. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2008-01-21 | timer: fix section mismatch | Randy Dunlap | 1 | -1/+1
The caller is __cpuinit. Also, this code block and its caller are inside #ifdef CONFIG_HOTPLUG_CPU blocks, so this code should reflect that config symbol's usage. WARNING: vmlinux.o(.text+0x4252f): Section mismatch: reference to .init.text: (between 'timer_cpu_notify' and 'msleep') Signed-off-by: Randy Dunlap <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-01-13 | remove task_ppid_nr_ns | Roland McGrath | 1 | -1/+1
task_ppid_nr_ns is called in three places. One of these should never have called it. In the other two, using it broke the existing semantics. This was presumably accidental. If the function had not been there, it would have been much more obvious to the eye that those patches were changing the behavior. We don't need this function. In task_state, the pid of the ptracer is not the ppid of the ptracer. In do_task_stat, ppid is the tgid of the real_parent, not its pid. I also moved the call outside of lock_task_sighand, since it doesn't need it. In sys_getppid, ppid is the tgid of the real_parent, not its pid. Signed-off-by: Roland McGrath <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-12-18 | timer: kernel/timer.c section fixes | Adrian Bunk | 1 | -2/+2
This patch fixes the following section mismatches with CONFIG_HOTPLUG=n, CONFIG_HOTPLUG_CPU=y: ... WARNING: vmlinux.o(.text+0x41cd3): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq') WARNING: vmlinux.o(.text+0x41d67): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq') ... Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2007-12-06 | Add schedule_timeout_killable | Matthew Wilcox | 1 | -0/+7
Signed-off-by: Matthew Wilcox <[email protected]>
2007-11-09 | sched: restore deterministic CPU accounting on powerpc | Paul Mackerras | 1 | -7/+14
Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been broken on powerpc, because we end up counting user time twice: once in timer_interrupt() and once in update_process_times(). This fixes the problem by pulling the code in update_process_times that updates utime and stime into a separate function called account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined, there is a version of account_process_tick in kernel/timer.c that simply accounts a whole tick to either utime or stime as before. If CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to implement account_process_tick. This also lets us simplify the s390 code a bit; it means that the s390 timer interrupt can now call update_process_times even when CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a suitable account_process_tick(). account_process_tick() now takes the task_struct * as an argument. Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING. Signed-off-by: Paul Mackerras <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
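The split described in this message can be pictured with a small sketch of the generic fallback, the variant used when CONFIG_VIRT_CPU_ACCOUNTING is not defined. The `struct task_times` type and its fields are hypothetical stand-ins for the task accounting state, and the signature is simplified; this is an illustration, not the code from kernel/timer.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task accounting fields. */
struct task_times {
	unsigned long utime;	/* ticks charged to user mode */
	unsigned long stime;	/* ticks charged to kernel mode */
};

/*
 * Generic fallback sketch: charge one whole tick to either user or
 * system time, depending on where the tick interrupted the task.
 */
static void account_process_tick(struct task_times *t, bool user_tick)
{
	if (user_tick)
		t->utime += 1;
	else
		t->stime += 1;
}
```

An architecture with precise accounting would supply its own version of this function instead, which is what avoids charging the same user time twice.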
2007-11-05 | time: fix inconsistent function names in comments | Li Zefan | 1 | -1/+1
Signed-off-by: Li Zefan <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-19 | pid namespaces: changes to show virtual ids to user | Pavel Emelyanov | 1 | -3/+4
This is the largest patch in the set. Make all (I hope) the places where the pid is shown to or get from user operate on the virtual pids. The idea is: - all in-kernel data structures must store either struct pid itself or the pid's global nr, obtained with pid_nr() call; - when seeking the task from kernel code with the stored id one should use find_task_by_pid() call that works with global pids; - when showing pid's numerical value to the user the virtual one should be used, but however when one shows task's pid outside this task's namespace the global one is to be used; - when getting the pid from userspace one need to consider this as the virtual one and use appropriate task/pid-searching functions. [[email protected]: build fix] [[email protected]: nuther build fix] [[email protected]: yet nuther build fix] [[email protected]: remove unneeded casts] Signed-off-by: Pavel Emelyanov <[email protected]> Signed-off-by: Alexey Dobriyan <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul Menage <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18 | Add scaled time to taskstats based process accounting | Michael Neuling | 1 | -2/+5
This adds items to the taskstats struct to account for user and system time based on scaling the CPU frequency and instruction issue rates. Adds account_(user|system)_time_scaled callbacks which architectures can use to account for time using this mechanism. Signed-off-by: Michael Neuling <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Jay Lan <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-18 | whitespace fixes: system timers | Daniel Walker | 1 | -1/+1
Signed-off-by: Daniel Walker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-20 | Pull ia64-clocksource into release branch | Tony Luck | 1 | -188/+0
2007-07-20 | [IA64] remove time interpolator | Bob Picco | 1 | -188/+0
Remove time_interpolator code (This is generic code, but only user was ia64. It has been superseded by the CONFIG_GENERIC_TIME code). Signed-off-by: Bob Picco <[email protected]> Signed-off-by: John Stultz <[email protected]> Signed-off-by: Peter Keilty <[email protected]> Signed-off-by: Tony Luck <[email protected]>
2007-07-19 | timer.c: cleanup recently introduced whitespace damage | Thomas Gleixner | 1 | -12/+12
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-17 | Slab allocators: Replace explicit zeroing with __GFP_ZERO | Christoph Lameter | 1 | -2/+2
kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing variant in the past. But with __GFP_ZERO it is possible now to do zeroing while allocating. Use __GFP_ZERO to remove the explicit clearing of memory via memset wherever we can. Signed-off-by: Christoph Lameter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-16 | Add a flag to indicate deferrable timers in /proc/timer_stats | Venki Pallipadi | 1 | -0/+14
Add a flag in /proc/timer_stats to indicate deferrable timers. This lets developers and users differentiate between types of timers in /proc/timer_stats. A deferrable timer and a normal timer will appear in /proc/timer_stats as below. 10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) The timer_stats version also changes from v0.1 to v0.2. Signed-off-by: Venkatesh Pallipadi <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-07-16 | Use boot based time for uptime in /proc | Tomas Janousek | 1 | -0/+1
Commit 411187fb05cd11676b0979d9fbf3291db69dbce2 caused uptime not to increase during suspend. This may cause confusion so I restore the old behaviour by using the boot based time instead of monotonic for uptime. Signed-off-by: Tomas Janousek <[email protected]> Acked-by: John Stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-29 | NOHZ: prevent multiplication overflow - stop timer for huge timeouts | Thomas Gleixner | 1 | -1/+9
get_next_timer_interrupt() returns a delta of (LONG_MAX >> 1) in case there is no timer pending. On 64 bit machines this results in a multiplication overflow in tick_nohz_stop_sched_tick(). Reported by: Dave Miller <[email protected]> Make the return value a constant and limit the return value to a 32 bit value. When the max timeout value is returned, we can safely stop the tick timer device. The max jiffies delta results in a 12 days timeout for HZ=1000. In the long term the get_next_timer_interrupt() code needs to be reworked to return ktime instead of jiffies, but we have to wait until the last users of the original NO_IDLE_HZ code are converted. Signed-off-by: Thomas Gleixner <[email protected]> Acked-off-by: David S. Miller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
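The "12 days" figure and the overflow can both be checked with a little arithmetic. The sketch below assumes the capped delta is (1 << 30) - 1 jiffies and that the overflowing step is a multiplication by nanoseconds per jiffy into a signed 64-bit value; the message only says that the cap fits in 32 bits, so both points are assumptions, and the helper names are invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed cap on the returned delta, in jiffies (not stated in the message). */
#define MAX_DELTA_JIFFIES ((UINT64_C(1) << 30) - 1)

/* Convert a jiffies delta to whole days for a given tick rate (HZ). */
static uint64_t jiffies_to_days(uint64_t delta, uint64_t hz)
{
	return delta / hz / (24 * 60 * 60);
}

/* True if delta * ns_per_jiffy would not fit in a signed 64-bit nanosecond count. */
static bool mul_overflows(uint64_t delta, uint64_t ns_per_jiffy)
{
	return delta > (uint64_t)INT64_MAX / ns_per_jiffy;
}
```

With HZ=1000 a jiffy is 1,000,000 ns. The old 64-bit delta of LONG_MAX >> 1 overflows that multiplication, while the assumed cap stays well inside the range and works out to a little over 12 days.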
2007-05-14 | timekeeping fix patch got mis-applied | Thomas Gleixner | 1 | -2/+0
The time keeping code move to kernel/time/timekeeping.c broke the clocksource resume logic patch, which got applied to the old file by a fuzzy application. Fix it up and move the clocksource_resume() call to the appropriate place. Signed-off-by: Thomas Gleixner <[email protected]> [ tssk, tssk, everybody should use --fuzz=0 ] Signed-off-by: Linus Torvalds <[email protected]>
2007-05-10 | timer: revert parenthesis fix in tbase_get_deferrable() etc | [email protected] | 1 | -5/+5
On 09-05-2007 21:10, Pallipadi, Venkatesh wrote: ... > On a 64 bit system, converting pointer to int causes unnecessary > compiler warning, and intermediate long conversion was to avoid that. > I will have to rephrase my comment to remove 32 bit value and use int, > as that is what the function returns. So, this patch reverts all changes done by my previous patch. I apologize for my wrong comment about "logical error" here. Cc: "Pallipadi, Venkatesh" <[email protected]> Cc: Satyam Sharma <[email protected]> Cc: Oleg Nesterov <[email protected]> Signed-off-by: Jarek Poplawski <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-09 | clocksource: fix resume logic | Thomas Gleixner | 1 | -0/+2
We need to make sure that the clocksources are resumed, when timekeeping is resumed. The current resume logic does not guarantee this. Add a resume function pointer to the clocksource struct, so clocksource drivers which need to reinitialize the clocksource can provide a resume function. Add a resume function, which calls the maybe available clocksource resume functions and resets the watchdog function, so a stable TSC can be used across suspend/resume. Signed-off-by: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-09 | Add suspend-related notifications for CPU hotplug | Rafael J. Wysocki | 1 | -0/+2
Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [[email protected]: cleanups] Signed-off-by: Rafael J. Wysocki <[email protected]> Cc: Gautham R Shenoy <[email protected]> Cc: Pavel Machek <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-09 | timer: parenthesis fix in tbase_get_deferrable() etc | Jarek Poplawski | 1 | -5/+5
Signed-off-by: Jarek Poplawski <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-08 | Introduce a handy list_first_entry macro | Pavel Emelianov | 1 | -2/+2
There are many places in the kernel where the construction like foo = list_entry(head->next, struct foo_struct, list); are used. The code might look more descriptive and neat if using the macro list_first_entry(head, type, member) \ list_entry((head)->next, type, member) Here is the macro itself and the examples of its usage in the generic code. If it will turn out to be useful, I can prepare the set of patches to inject in into arch-specific code, drivers, networking, etc. Signed-off-by: Pavel Emelianov <[email protected]> Signed-off-by: Kirill Korotaev <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Zach Brown <[email protected]> Cc: Davide Libenzi <[email protected]> Cc: John McCutchan <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Ram Pai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
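The macro from this message can be tried outside the kernel with a small userspace sketch. The `list_entry` definition below is a simplified offsetof-based version, and `list_init` and `list_add_tail` are minimal helpers written for this example rather than the kernel's own; `struct foo_struct` follows the naming used in the message.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The macro from the commit message: the entry at head->next. */
#define list_first_entry(head, type, member) \
	list_entry((head)->next, type, member)

/* Example payload type, named as in the message. */
struct foo_struct {
	int value;
	struct list_head list;
};

/* Minimal helpers so the example is self-contained. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}
```

After adding two entries to a list, `list_first_entry(&head, struct foo_struct, list)` yields the one added first. As with the open-coded `list_entry(head->next, ...)` form, the result is only meaningful when the list is not empty.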
2007-05-08 | Move timekeeping code to timekeeping.c | john stultz | 1 | -458/+1
Move the timekeeping code out of kernel/timer.c and into kernel/time/timekeeping.c. I made no cleanups or other changes in transit. [[email protected]: build fix] Signed-off-by: John Stultz <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-05-08 | Add support for deferrable timers | Venki Pallipadi | 1 | -8/+57
Introduce a new flag for timers - deferrable: Timers that work normally when system is busy. But, will not cause CPU to come out of idle (just to service this timer), when CPU is idle. Instead, this timer will be serviced when CPU eventually wakes up with a subsequent non-deferrable timer. The main advantage of this is to avoid unnecessary timer interrupts when CPU is idle. If the routine currently called by a timer can wait until next event without any issues, this new timer can be used to setup timer event for that routine. This, with dynticks, allows CPUs to be lazy, allowing them to stay in idle for extended period of time by reducing unnecessary wakeups and thereby reducing the power consumption. This patch: Builds this new timer on top of existing timer infrastructure. It uses last bit in 'base' pointer of timer_list structure to store this deferrable timer flag. __next_timer_interrupt() function skips over these deferrable timers when CPU looks for next timer event for which it has to wake up. This is exported by a new interface init_timer_deferrable() that can be called in place of regular init_timer(). [[email protected]: Privatise a #define] Signed-off-by: Venkatesh Pallipadi <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Dave Jones <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
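Storing a flag in the last bit of the 'base' pointer works because the pointed-to structure is aligned, so bit 0 of its address is always zero. A minimal userspace sketch of the idea follows. Only `tbase_get_deferrable` is a name that appears in this log; the struct layout, the flag macro, and the other two helpers are illustrative assumptions, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the per-CPU timer base; its alignment keeps bit 0 of its address clear. */
struct tvec_base {
	long dummy;
};

/* Illustrative flag value: the lowest bit of the pointer. */
#define TBASE_DEFERRABLE_FLAG ((uintptr_t)0x1)

/* Is the deferrable bit set in this (possibly tagged) base pointer? */
static bool tbase_get_deferrable(const struct tvec_base *base)
{
	return ((uintptr_t)base & TBASE_DEFERRABLE_FLAG) != 0;
}

/* Strip the flag bit to recover the real, dereferenceable pointer. */
static struct tvec_base *tbase_get_base(const struct tvec_base *base)
{
	return (struct tvec_base *)((uintptr_t)base & ~TBASE_DEFERRABLE_FLAG);
}

/* Return the same pointer with the deferrable bit set. */
static struct tvec_base *tbase_set_deferrable(struct tvec_base *base)
{
	return (struct tvec_base *)((uintptr_t)base | TBASE_DEFERRABLE_FLAG);
}
```

A tagged pointer must always pass through `tbase_get_base()` before it is dereferenced. The intermediate integer conversion is also the subject of the parenthesis fix and its revert that appear later in this log.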
2007-04-26 | [AF_RXRPC]: Make it possible to merely try to cancel timers from a module | David Howells | 1 | -0/+2
Export try_to_del_timer_sync() for use by the AF_RXRPC module. Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2007-04-07 | [PATCH] high-res timers: resume fix | Ingo Molnar | 1 | -1/+1
Soeren Sonnenburg reported that upon resume he is getting this backtrace: [<c0119637>] smp_apic_timer_interrupt+0x57/0x90 [<c0142d30>] retrigger_next_event+0x0/0xb0 [<c0104d30>] apic_timer_interrupt+0x28/0x30 [<c0142d30>] retrigger_next_event+0x0/0xb0 [<c0140068>] __kfifo_put+0x8/0x90 [<c0130fe5>] on_each_cpu+0x35/0x60 [<c0143538>] clock_was_set+0x18/0x20 [<c0135cdc>] timekeeping_resume+0x7c/0xa0 [<c02aabe1>] __sysdev_resume+0x11/0x80 [<c02ab0c7>] sysdev_resume+0x47/0x80 [<c02b0b05>] device_power_up+0x5/0x10 it turns out that on resume we mistakenly re-enable interrupts too early. Do the timer retrigger only on the current CPU. Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Soeren Sonnenburg <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-25 | [PATCH] dynticks: fix hrtimer rounding error in next_timer_interrupt | Thomas Gleixner | 1 | -3/+16
The rework of next_timer_interrupt() fixed the timer wheel bugs, but invented a rounding error versus the next hrtimer event. This is caused by the conversion of the hrtimer internal representation to relative jiffies. This causes bug #8100: http://bugzilla.kernel.org/show_bug.cgi?id=8100 next_timer_interrupt() returns "now" in such a case and causes the code in tick_nohz_stop_sched_tick() to trigger the timer softirq, which is bogus as no timer is due for expiry. This results in an endless context switching between idle and ksoftirqd until a timer is due for expiry. Modify the hrtimer evaluation so that, it returns now + 1, when the conversion results in a delta < 1 jiffie. It's confirmed to resolve bug #8100 Reported-by: Emil Karlson <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
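The rounding problem comes from integer division: an hrtimer event that is less than one jiffy away converts to a delta of 0, which reads as "now". A minimal sketch of the clamp described above follows; the function name and parameters are hypothetical, and the real code works on the hrtimer internal time representation rather than plain nanosecond integers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the current jiffies value and the time until
 * the next hrtimer event in nanoseconds, return the jiffy at which to wake.
 * The result is never "now", so the caller does not raise the timer
 * softirq for an event that is not yet due.
 */
static unsigned long next_hrtimer_jiffy(unsigned long now, int64_t delta_ns,
					int64_t ns_per_jiffy)
{
	unsigned long delta;

	if (delta_ns <= 0)
		return now + 1;

	delta = (unsigned long)(delta_ns / ns_per_jiffy);
	if (delta < 1)
		delta = 1;	/* the conversion rounded down to zero */

	return now + delta;
}
```

For example, with 1,000,000 ns per jiffy, an event 500,000 ns away yields now + 1 rather than now.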
2007-03-06 | [PATCH] fix vsyscall settimeofday | Daniel Walker | 1 | -0/+2
I've only seen this on x86_64. The vsyscall state only gets updated when a timer interrupts comes in. So if the time is set long before the next timer, there will be a period when a gettimeofday() won't reflect the correct time. I added an explicit update_vsyscall() during the settimeofday(), that way the vsyscall state doesn't get stale. Signed-off-by: Daniel Walker <[email protected]> Cc: Thomas Gleixner <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: John Stultz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-06 | [PATCH] Save/restore periodic tick information over suspend/resume | Thomas Gleixner | 1 | -0/+6
The programming of periodic tick devices needs to be saved/restored across suspend/resume - otherwise we might end up with a system coming up that relies on getting a PIT (or HPET) interrupt, while those devices default to 'no interrupts' after powerup. (To confuse things it worked to a certain degree on some systems because the lapic gets initialized as a side-effect of SMP bootup.) This suspend / resume thing was dropped unintentionally during the last-minute -mm code reshuffling. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-05 | [PATCH] timer/hrtimer: take per cpu locks in sane order | Heiko Carstens | 1 | -4/+4
Doing something like this on a two cpu system # echo 0 > /sys/devices/system/cpu/cpu0/online # echo 1 > /sys/devices/system/cpu/cpu0/online # echo 0 > /sys/devices/system/cpu/cpu1/online will give me this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.21-rc2-g562aa1d4-dirty #7 ------------------------------------------------------- bash/1282 is trying to acquire lock: (&cpu_base->lock_key){.+..}, at: [<000000000005f17e>] hrtimer_cpu_notify+0xc6/0x240 but task is already holding lock: (&cpu_base->lock_key#2){.+..}, at: [<000000000005f174>] hrtimer_cpu_notify+0xbc/0x240 which lock already depends on the new lock. This happens because we have the following code in kernel/hrtimer.c: migrate_hrtimers(int cpu) [...] old_base = &per_cpu(hrtimer_bases, cpu); new_base = &get_cpu_var(hrtimer_bases); [...] spin_lock(&new_base->lock); spin_lock(&old_base->lock); Which means the spinlocks are taken in an order which depends on which cpu gets shut down from which other cpu. Therefore lockdep complains that there might be an ABBA deadlock. Since migrate_hrtimers() gets only called on cpu hotplug it's safe to assume that it isn't executed concurrently on a The same problem exists in kernel/timer.c: migrate_timers(). As pointed out by Christian Borntraeger one possible solution to avoid the locking order complaints would be to make sure that the locks are always taken in the same order. E.g. by taking the lock of the cpu with the lower number first. To achieve this we introduce two new spinlock functions double_spin_lock and double_spin_unlock which lock or unlock two locks in a given order. 
Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Roman Zippel <[email protected]> Cc: John Stultz <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Martin Schwidefsky <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
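The ordering rule from this message (take the lock of the lower-numbered CPU first) can be sketched with a toy lock that only records the order in which it was taken. Everything here is illustrative: `struct toy_lock` stands in for a spinlock, and `double_lock`, `double_unlock`, and `lock_bases` are hypothetical names loosely modelled on the double_spin_lock and double_spin_unlock helpers the message describes, whose exact signatures are not given in this log.

```c
#include <stdbool.h>

/* Toy lock for illustration only: records acquisition order instead of really locking. */
struct toy_lock {
	bool held;
	int order;	/* 1 = taken first, 2 = taken second, 0 = not held */
};

static void toy_lock_take(struct toy_lock *l, int order)
{
	l->held = true;
	l->order = order;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
	l->order = 0;
}

/* Take two locks in a caller-specified order. */
static void double_lock(struct toy_lock *l1, struct toy_lock *l2, bool l1_first)
{
	if (l1_first) {
		toy_lock_take(l1, 1);
		toy_lock_take(l2, 2);
	} else {
		toy_lock_take(l2, 1);
		toy_lock_take(l1, 2);
	}
}

/* Release two locks in the reverse of the order they were taken. */
static void double_unlock(struct toy_lock *l1, struct toy_lock *l2, bool l1_taken_first)
{
	if (l1_taken_first) {
		toy_lock_release(l2);
		toy_lock_release(l1);
	} else {
		toy_lock_release(l1);
		toy_lock_release(l2);
	}
}

/* Migration sketch: always lock the base of the lower-numbered CPU first. */
static void lock_bases(struct toy_lock *new_base, int new_cpu,
		       struct toy_lock *old_base, int old_cpu)
{
	double_lock(new_base, old_base, new_cpu < old_cpu);
}
```

Whichever CPU is being shut down, the base of CPU 0 is always taken before the base of CPU 1, so the acquisition order no longer depends on the direction of the migration.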
2007-03-01 | [PATCH] kernel-doc fixes for 2.6.20-git15 (non-drivers) | Randy Dunlap | 1 | -0/+1
Fix kernel-doc warnings in 2.6.20-git15 (lib/, mm/, kernel/, include/). Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-03-01 | [PATCH] update timekeeping_is_continuous comment | Daniel Walker | 1 | -1/+1
Signed-off-by: Daniel Walker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] generic: vsyscall-gtod support for GENERIC_TIME | john stultz | 1 | -0/+1
Provides generic infrastructure for vsyscall-gtod. [[email protected]: cleanup] Signed-off-by: John Stultz <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] Add debugging feature /proc/timer_stat | Ingo Molnar | 1 | -2/+29
Add /proc/timer_stats support: debugging feature to profile timer expiration. Both the starting site, process/PID and the expiration function is captured. This allows the quick identification of timer event sources in a system. Sample output: # echo 1 > /proc/timer_stats # cat /proc/timer_stats Timer Stats Version: v0.1 Sample period: 4.010 s 24, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 11, 0 swapper sk_reset_timer (tcp_delack_timer) 6, 0 swapper hrtimer_stop_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 17, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 2, 1 swapper queue_delayed_work_on (delayed_work_timer_fn) 4, 2050 pcscd do_nanosleep (hrtimer_wakeup) 5, 4179 sshd sk_reset_timer (tcp_write_timer) 4, 2248 yum-updatesd schedule_timeout (process_timeout) 18, 0 swapper hrtimer_restart_sched_tick (hrtimer_sched_tick) 3, 0 swapper sk_reset_timer (tcp_delack_timer) 1, 1 swapper neigh_table_init_no_netlink (neigh_periodic_timer) 2, 1 swapper e1000_up (e1000_watchdog) 1, 1 init schedule_timeout (process_timeout) 100 total events, 25.24 events/sec [ cleanups and hrtimers support from Thomas Gleixner <[email protected]> ] [[email protected]: nr_entries can become static] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] tick-management: dyntick / highres functionality | Thomas Gleixner | 1 | -2/+3
With Ingo Molnar <[email protected]> Add functions to provide dynamic ticks and high resolution timers. The code which keeps track of jiffies and handles the long idle periods is shared between tick based and high resolution timer based dynticks. The dyntick functionality can be disabled on the kernel commandline. Provide also the infrastructure to support high resolution timers. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] clockevents: add core functionality | Thomas Gleixner | 1 | -1/+3
Architectures register their clock event devices, in the clock events core. Users of the clockevents core can get clock event devices for their use. The clockevents core code provides notification mechanisms for various clock related management events. This allows to control the clock event devices without the architectures having to worry about the details of function assignment. This is also a preliminary for high resolution timers and dynamic ticks to allow the core code to control the clock functionality without intrusive changes to the architecture code. [Fixes-by: Ingo Molnar <[email protected]>] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Roman Zippel <[email protected]> Cc: john stultz <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] Extend next_timer_interrupt() to use a reference jiffie | Thomas Gleixner | 1 | -3/+11
For CONFIG_NO_HZ we need to calculate the next timer wheel event based on a given jiffie value. Extend the existing code to allow the extra 'now' argument. Provide a compatibility function for the existing implementations to call the function with now == jiffies. (This also solves the raciness of the original code vs. jiffies changing during the iteration.) No functional changes to existing users of this infrastructure. [ remove WARN_ON() that triggered on s390, by Carsten Otte <[email protected]> ] [ made new helper static, Adrian Bunk <[email protected]> ] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] Fix cascade lookup of next_timer_interrupt | Thomas Gleixner | 1 | -70/+81
When searching for the next pending timer in the timer wheel we need to take the cascade into account. The current code has several problems: 1. it looks into the previous cascade 2. it ignores a pending cascade 3. it ignores multiple cascades Change the cascade lookup, so it calculates the array index from the point of the next cascade and always look at the cascade buckets, when the cascade is pending, i.e. gets executed in the next timer softirq. When multiple cascades are pending, then lookup the next buckets too. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] clocksource: Add verification (watchdog) helper | Thomas Gleixner | 1 | -23/+22
The TSC needs to be verified against another clocksource. Instead of using hardwired assumptions of available hardware, provide a generic verification mechanism. The verification uses the best available clocksource and handles the usability for high resolution timers / dynticks of the clocksource which needs to be verified. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] clocksource: Remove the update callback | Thomas Gleixner | 1 | -2/+0
The clocksource code allows direct updates of the rating of a given clocksource now. Change TSC unstable tracking to use this interface and remove the update callback. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-02-16 | [PATCH] clocksource: replace is_continuous by a flag field | Thomas Gleixner | 1 | -1/+1
Using a flag field allows encoding more than one piece of information in a variable. Preparatory patch for the generic clocksource verification. [[email protected]: convert vmitime.c to the new clocksource flag] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: john stultz <[email protected]> Cc: Roman Zippel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>