Age | Commit message (Collapse) | Author | Files | Lines |
|
Correct an issue with /proc/timer_list reported by Holger.
When reading from the proc file with a sufficiently small buffer, 2k so
not really that small, there was one could get hung trying to read the
file a chunk at a time.
The timer_list_start function failed to account for the possibility that
the offset was adjusted outside the timer_list_next.
Signed-off-by: Nathan Zimmer <[email protected]>
Reported-by: Holger Hans Peter Freyther <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Berke Durak <[email protected]>
Cc: Jeff Layton <[email protected]>
Tested-by: Al Viro <[email protected]>
Cc: <[email protected]> # 3.10.x
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When running with 4096 cores attemping to read /proc/timer_list will fail
with an ENOMEM condition. On a sufficantly large systems the total amount
of data is more then 4mb, so it won't fit into a single buffer. The
failure can also occur on smaller systems when memory fragmentation is
high as reported by Dave Jones.
Convert /proc/timer_list to a proper seq_file with its own iterator. This
is a little more complex given that we have to make two passes with two
separate headers.
sysrq_timer_list_show also needed to be updated to reflect the fact that
now timer_list_show only does one cpu at at time.
Signed-off-by: Nathan Zimmer <[email protected]>
Reported-by: Dave Jones <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Stephen Boyd <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Split timer_list_show_tickdevices() into the header printout and pull
the rest up to timer_list_show. This is a preparatory patch for
converting timer_list to a proper seqfile with its own iterator
Signed-off-by: Nathan Zimmer <[email protected]>
Reported-by: Dave Jones <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Stephen Boyd <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Now that idle and nohz logics are going to be independant each others,
ts->idle_tick becomes too much a biased name to describe the field that
saves the last scheduled tick on top of which we re-calculate the next
tick to schedule when the timer is restarted.
We want to reuse this even to stop the tick outside idle cases. So let's
rename it to some more generic name: ts->last_tick.
This changes a bit the timer list stat export so we need to increase its
version.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Alessio Igor Bogani <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Avi Kivity <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: Geoff Levand <[email protected]>
Cc: Gilad Ben Yossef <[email protected]>
Cc: Hakan Akkan <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: Max Krasnyansky <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Sven-Thorsten Dietrich <[email protected]>
Cc: Thomas Gleixner <[email protected]>
|
|
In the continuing effort to avoid kernel addresses leaking to
unprivileged users, this patch switches to %pK for
/proc/timer_list reporting.
Signed-off-by: Kees Cook <[email protected]>
Cc: John Stultz <[email protected]>
Cc: Dan Rosenberg <[email protected]>
Cc: Eugene Teo <[email protected]>
Cc: Linus Torvalds <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Converts the hrtimer code to use the new timerlist infrastructure
Signed-off-by: John Stultz <[email protected]>
LKML Reference: <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
CC: Alessandro Zummo <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Richard Cochran <[email protected]>
|
|
For the ondemand cpufreq governor, it is desired that the iowait
time is microaccounted in a similar way as idle time is.
This patch introduces the infrastructure to account and expose
this information via the get_cpu_iowait_time_us() function.
[[email protected]: fix CONFIG_NO_HZ=n build]
Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Rik van Riel <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The current logic which handles clock events programming failures can
increase min_delta_ns unlimited and even can cause overflows.
Sanitize it by:
- prevent zero increase when min_delta_ns == 1
- limiting min_delta_ns to a jiffie
- bail out if the jiffie limit is hit
- add retries stats for /proc/timer_list so we can gather data
Reported-by: Uwe Kleine-Koenig <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
struct cpumask will be undefined soon with CONFIG_CPUMASK_OFFSTACK=y,
to avoid them being declared on the stack.
cpumask_bits() does what we want here (of course, this code is crap).
Signed-off-by: Rusty Russell <[email protected]>
To: Thomas Gleixner <[email protected]>
|
|
Convert locks which cannot be sleeping locks in preempt-rt to
raw_spinlocks.
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
|
|
The hrtimer_interrupt hang logic adjusts min_delta_ns based on the
execution time of the hrtimer callbacks.
This is error-prone for virtual machines, where a guest vcpu can be
scheduled out during the execution of the callbacks (and the callbacks
themselves can do operations that translate to blocking operations in
the hypervisor), which in can lead to large min_delta_ns rendering the
system unusable.
Replace the current heuristics with something more reliable. Allow the
interrupt code to try 3 times to catch up with the lost time. If that
fails use the total time spent in the interrupt handler to defer the
next timer interrupt so the system can catch up with other things
which got delayed. Limit that deferment to 100ms.
The retry events and the maximum time spent in the interrupt handler
are recorded and exposed via /proc/timer_list
Inspired by a patch from Marcelo.
Reported-by: Michael Tokarev <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Marcelo Tosatti <[email protected]>
Cc: [email protected]
|
|
In the dynamic tick code, "max_delta_ns" (member of the
"clock_event_device" structure) represents the maximum sleep time
that can occur between timer events in nanoseconds.
The variable, "max_delta_ns", is defined as an unsigned long
which is a 32-bit integer for 32-bit machines and a 64-bit
integer for 64-bit machines (if -m64 option is used for gcc).
The value of max_delta_ns is set by calling the function
"clockevent_delta2ns()" which returns a maximum value of LONG_MAX.
For a 32-bit machine LONG_MAX is equal to 0x7fffffff and in
nanoseconds this equates to ~2.15 seconds. Hence, the maximum
sleep time for a 32-bit machine is ~2.15 seconds, where as for
a 64-bit machine it will be many years.
This patch changes the type of max_delta_ns to be "u64" instead of
"unsigned long" so that this variable is a 64-bit type for both 32-bit
and 64-bit machines. It also changes the maximum value returned by
clockevent_delta2ns() to KTIME_MAX. Hence this allows a 32-bit
machine to sleep for longer than ~2.15 seconds. Please note that this
patch also changes "min_delta_ns" to be "u64" too and although this is
unnecessary, it makes the patch simpler as it avoids to fixup all
callers of clockevent_delta2ns().
[ tglx: changed "unsigned long long" to u64 as we use this data type
through out the time code ]
Signed-off-by: Jon Hunter <[email protected]>
Cc: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
The mult and shift factors of clock events differ in their data type
from those of clock sources for no reason. u32 is sufficient for
both. shift is always <= 32 and mult is limited to 2^32-1 to avoid
64bit multiplication overflows in the conversion.
Preparatory patch for a generic mult/shift factor calculation
function.
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Mikael Pettersson <[email protected]>
Acked-by: Ralf Baechle <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Cc: John Stultz <[email protected]>
LKML-Reference: <[email protected]>
|
|
[[email protected]: fix KVM]
Signed-off-by: Alexey Dobriyan <[email protected]>
Acked-by: Mike Frysinger <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
/proc/timer_list and /proc/slabinfo are not supposed to be
written, so there should be no write permissions on it.
Signed-off-by: WANG Cong <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Eduard - Gabriel Munteanu <[email protected]>
Cc: [email protected]
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Amerigo Wang <[email protected]>
Cc: Matt Mackall <[email protected]>
Cc: Arjan van de Ven <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Conflicts:
kernel/time/tick-sched.c
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
The base address of a (per cpu) clock base is a useful debug info.
Add it and bump the version number of timer_lists.
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
The per cpu clock events device output of timer_list lacks an
association of the device to the cpu which is annoying when looking at
the output of /proc/timer_list from a 128 way system.
Add the CPU number info and mark the broadcast device in the device
list printout.
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
The current timer_list output prints the address of the on stack copy
of the active hrtimer instead of the hrtimer itself.
Print the address of the real timer instead.
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
to help debugging and visibility of timer ranges, show them
in the existing timer list in /proc/timer_list
Signed-off-by: Arjan van de Ven <[email protected]>
|
|
In order to be able to do range hrtimers we need to use accessor functions
to the "expire" member of the hrtimer struct.
This patch converts kernel/* to these accessors.
Signed-off-by: Arjan van de Ven <[email protected]>
|
|
Use proc_create()/proc_create_data() to make sure that ->proc_fops and ->data
be setup before gluing PDE to main tree.
Signed-off-by: Denis V. Lunev <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Relative expiry time can get negative, so it should be signed.
Signed-off-by: Pavel Machek <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
To allow better diagnosis of tick-sched related, especially NOHZ
related problems, we need to know when the last wakeup via an irq
happened and when the CPU left the idle state.
Add two fields (idle_waketime, idle_exittime) to the tick_sched
structure and add them to the timer_list output.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This makes sure printk format strings contain no more than a single
line.
Signed-off-by: Vegard Nossum <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
On every open/close one struct seq_operations leaks.
Kudos to /proc/slab_allocators.
Signed-off-by: Alexey Dobriyan <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
KSYM_NAME_LEN is peculiar in that it does not include the space for the
trailing '\0', forcing all users to use KSYM_NAME_LEN + 1 when allocating
buffer. This is nonsense and error-prone. Moreover, when the caller
forgets that it's very likely to subtly bite back by corrupting the stack
because the last position of the buffer is always cleared to zero.
This patch increments KSYM_NAME_LEN by one and updates code accordingly.
* off-by-one bug in asm-powerpc/kprobes.h::kprobe_lookup_name() macro
is fixed.
* Where MODULE_NAME_LEN and KSYM_NAME_LEN were used together,
MODULE_NAME_LEN was treated as if it didn't include space for the
trailing '\0'. Fix it.
Signed-off-by: Tejun Heo <[email protected]>
Acked-by: Paulo Marques <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
u64 and s64 are not necessarily 'long long' on some 64-bit
platforms, so explicit the type to kill the compiler warnings.
Also consistently use '%Lu' which is unsigned.
Signed-off-by: David S. Miller <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
kallsyms_lookup() can go iterating over modules list unprotected which is OK
for emergency situations (oops), but not OK for regular stuff like
/proc/*/wchan.
Introduce lookup_symbol_name()/lookup_module_symbol_name() which copy symbol
name into caller-supplied buffer or return -ERANGE. All copying is done with
module_mutex held, so...
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Rusty Russell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Several kallsyms_lookup() pass dummy arguments but only need, say, module's
name. Make kallsyms_lookup() accept NULLs where possible.
Also, makes picture clearer about what interfaces are needed for all symbol
resolving business.
Signed-off-by: Alexey Dobriyan <[email protected]>
Cc: Rusty Russell <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix the print formatting of three unsigned long fields in /proc/timer_list,
which are currently being formatted as signed long.
Signed-off-by: James Morris <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
add /proc/timer_list, which prints all currently pending (high-res) timers,
all clock-event sources and their parameters in a human-readable form.
Sample output:
Timer List Version: v0.1
HRTIMER_MAX_CLOCK_BASES: 2
now at 4246046273872 nsecs
cpu: 0
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_stop_sched_tick, swapper/0
# expires at 4246432689566 nsecs [in 386415694 nsecs]
#1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, pcscd/2050
# expires at 4247018194689 nsecs [in 971920817 nsecs]
#2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, irqbalance/1909
# expires at 4247351358392 nsecs [in 1305084520 nsecs]
#3: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, crond/2157
# expires at 4249097614968 nsecs [in 3051341096 nsecs]
#4: <f5a90ec8>, it_real_fn, do_setitimer, syslogd/1888
# expires at 4251329900926 nsecs [in 5283627054 nsecs]
.expires_next : 4246432689566 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 31306
.idle_tick : 4246020791890 nsecs
.tick_stopped : 1
.idle_jiffies : 986504
.idle_calls : 40700
.idle_sleeps : 36014
.idle_entrytime : 4246019418883 nsecs
.idle_sleeptime : 4178181972709 nsecs
cpu: 1
clock 0:
.index: 0
.resolution: 1 nsecs
.get_time: ktime_get_real
.offset: 1273998312645738432 nsecs
active timers:
clock 1:
.index: 1
.resolution: 1 nsecs
.get_time: ktime_get
.offset: 0 nsecs
active timers:
#0: <f5a90ec8>, hrtimer_sched_tick, hrtimer_restart_sched_tick, swapper/0
# expires at 4246050084568 nsecs [in 3810696 nsecs]
#1: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, atd/2227
# expires at 4261010635003 nsecs [in 14964361131 nsecs]
#2: <f5a90ec8>, hrtimer_wakeup, do_nanosleep, smartd/2332
# expires at 5469485798970 nsecs [in 1223439525098 nsecs]
.expires_next : 4246050084568 nsecs
.hres_active : 1
.check_clocks : 0
.nr_events : 24043
.idle_tick : 4246046084568 nsecs
.tick_stopped : 0
.idle_jiffies : 986510
.idle_calls : 26360
.idle_sleeps : 22551
.idle_entrytime : 4246043874339 nsecs
.idle_sleeptime : 4170763761184 nsecs
tick_broadcast_mask: 00000003
event_broadcast_mask: 00000001
CPU#0's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246432689566 nsecs
CPU#1's local event device:
Clock Event Device: lapic
capabilities: 0000000e
max_delta_ns: 807385544
min_delta_ns: 1443
mult: 44624025
shift: 32
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt
.installed: 1
.expires: 4246050084568 nsecs
Clock Event Device: hpet
capabilities: 00000007
max_delta_ns: 2147483647
min_delta_ns: 3352
mult: 61496110
shift: 32
set_next_event: hpet_next_event
set_mode: hpet_set_mode
event_handler: handle_nextevt_broadcast
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: john stultz <[email protected]>
Cc: Roman Zippel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|