aboutsummaryrefslogtreecommitdiff
path: root/kernel/trace
AgeCommit message (Collapse)AuthorFilesLines
2011-12-05tracing: fix event_subsystem ref countingIlya Dryomov1-1/+0
Fix a bug introduced by e9dbfae5, which prevents event_subsystem from ever being released. Ref_count was added to keep track of subsystem users, not for counting events. Subsystem is created with ref_count = 1, so there is no need to increment it for every event, we have nr_events for that. Fix this by touching ref_count only when we actually have a new user - subsystem_open(). Cc: [email protected] Signed-off-by: Ilya Dryomov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-12-01trace_events_filter: Use rcu_assign_pointer() when setting ↵Tejun Heo1-3/+3
ftrace_event_call->filter ftrace_event_call->filter is sched RCU protected but didn't use rcu_assign_pointer(). Use it. TODO: Add proper __rcu annotation to call->filter and all its users. -v2: Use RCU_INIT_POINTER() for %NULL clearing as suggested by Eric. Link: http://lkml.kernel.org/r/[email protected] Cc: Eric Dumazet <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] # (2.6.39+) Signed-off-by: Tejun Heo <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-11-17tracing: Add entries in buffer and total entries to default output headerSteven Rostedt1-25/+47
Knowing the number of event entries in the ring buffer compared to the total number that were written is useful information. The latency format gives this information and there's no reason that the default format does not. This information is now added to the default header, along with the number of online CPUs: # tracer: nop # # entries-in-buffer/entries-written: 159836/64690869 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] ...2 49.442971: local_touch_nmi <-cpu_idle <idle>-0 [000] d..2 49.442973: enter_idle <-cpu_idle <idle>-0 [000] d..2 49.442974: atomic_notifier_call_chain <-enter_idle <idle>-0 [000] d..2 49.442976: __atomic_notifier_call_chain <-atomic_notifier The above shows that the trace contains 159836 entries, but 64690869 were written. One could figure out that there were 64531033 entries that were dropped. Signed-off-by: Steven Rostedt <[email protected]>
2011-11-17tracing: Add irq, preempt-count and need resched info to default trace outputSteven Rostedt3-6/+35
People keep asking how to get the preempt count, irq, and need resched info and we keep telling them to enable the latency format. Some developers think that traces without this info is completely useless, and for a lot of tasks it is useless. The first option was to enable the latency trace as the default format, but the header for the latency format is pretty useless for most tracers and it also does the timestamp in straight microseconds from the time the trace started. This is sometimes more difficult to read as the default trace is seconds from the start of boot up. Latency format: # tracer: nop # # nop latency trace v1.1.5 on 3.2.0-rc1-test+ # -------------------------------------------------------------------- # latency: 0 us, #159771/64234230, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:4) # ----------------- # | task: -0 (uid:0 nice:0 policy:0 rt_prio:0) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / migratio-6 0...2 41778231us+: rcu_note_context_switch <-__schedule migratio-6 0...2 41778233us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778235us+: rcu_sched_qs <-rcu_note_context_switch migratio-6 0d..2 41778236us+: rcu_preempt_qs <-rcu_note_context_switch migratio-6 0...2 41778238us : trace_rcu_utilization <-rcu_note_context_switch migratio-6 0...2 41778239us+: debug_lockdep_rcu_enabled <-__schedule default format: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | migration/0-6 [000] 50.025810: rcu_note_context_switch <-__schedule migration/0-6 [000] 50.025812: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025813: rcu_sched_qs <-rcu_note_context_switch migration/0-6 [000] 50.025815: rcu_preempt_qs <-rcu_note_context_switch migration/0-6 [000] 50.025817: trace_rcu_utilization <-rcu_note_context_switch migration/0-6 [000] 50.025818: debug_lockdep_rcu_enabled <-__schedule migration/0-6 [000] 50.025820: debug_lockdep_rcu_enabled <-__schedule The latency format header has latency information that is pretty meaningless for most tracers. Although some of the header is useful, and we can add that later to the default format as well. What is really useful with the latency format is the irqs-off, need-resched hard/softirq context and the preempt count. This commit adds the option irq-info which is on by default that adds this information: # tracer: nop # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | <idle>-0 [000] d..2 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] d..2 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] d..2 49.309309: need_resched <-mwait_idle <idle>-0 [000] d..2 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] d..2 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] d..2 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] d..2 49.309315: need_resched <-mwait_idle If a user wants the old format, they can disable the 'irq-info' option: # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # | | | | | <idle>-0 [000] 49.309305: cpuidle_get_driver <-cpuidle_idle_call <idle>-0 [000] 49.309307: mwait_idle <-cpu_idle <idle>-0 [000] 49.309309: need_resched <-mwait_idle <idle>-0 [000] 49.309310: test_ti_thread_flag <-need_resched <idle>-0 [000] 49.309312: trace_power_start.constprop.13 <-mwait_idle <idle>-0 [000] 49.309313: trace_cpu_idle <-mwait_idle <idle>-0 [000] 49.309315: need_resched <-mwait_idle Requested-by: Thomas Gleixner <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-11-11Merge branch 'tip/perf/core' of ↵Ingo Molnar7-13/+68
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
2011-11-07tracing/latency: Fix header output for latency tracersJiri Olsa4-2/+40
In case the the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the function tracer (CONFIG_FUNCTION_TRACER) are not set, the latency tracers do not display proper latency header. The involved/fixed latency tracers are: wakeup_rt wakeup preemptirqsoff preemptoff irqsoff The patch adds proper handling of tracer configuration options for latency tracers, and displaying correct header info accordingly. * The current output (for wakeup tracer) with both graph and function tracers disabled is: # tracer: wakeup # <idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0 <idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up ... * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0 cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up * The current output (for wakeup tracer) with only function tracer enabled is: # tracer: wakeup # cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0 cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up * The fixed output is: # tracer: wakeup # # wakeup latency trace v1.1.5 on 3.1.0-tip+ # -------------------------------------------------------------------- # latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2) # ----------------- # | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99) # ----------------- # # _------=> CPU# # / _-----=> irqs-off # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / delay # cmd pid ||||| time | caller # \ / ||||| \ | / <idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1 <idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up Link: http://lkml.kernel.org/r/[email protected] Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-11-07ftrace: Fix hash record accounting bugSteven Rostedt1-1/+3
If the set_ftrace_filter is cleared by writing just whitespace to it, then the filter hash refcounts will be decremented but not updated. This causes two bugs: 1) No functions will be enabled for tracing when they all should be 2) If the users clears the set_ftrace_filter twice, it will crash ftrace: ------------[ cut here ]------------ WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7() Modules linked in: Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32 Call Trace: [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7 [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f [<ffffffff8111bdfe>] ? kfree+0xe5/0x115 [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151 [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f [<ffffffff8112e49a>] fput+0xfd/0x1c2 [<ffffffff8112b54c>] filp_close+0x6d/0x78 [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1 [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54 [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b ---[ end trace 77a3a7ee73794a02 ]--- Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian Reported-by: Rabin Vincent <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-11-07ftrace: Remove force undef config value left for testingSteven Rostedt1-1/+0
A forced undef of a config value was used for testing and was accidently left in during the final commit. This causes x86 to run slower than needed while running function tracing as well as causes the function graph selftest to fail when DYNMAIC_FTRACE is not set. This is because the code in MCOUNT expects the ftrace code to be processed with the config value set that happened to be forced not set. The forced config option was left in by: commit 6331c28c962561aee59e5a493b7556a4bb585957 ftrace: Fix dynamic selftest failure on some archs Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian Cc: [email protected] Reported-by: Rabin Vincent <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-11-04tracing: Add boiler plate for subsystem filterSteven Rostedt1-7/+19
The system filter can be used to set multiple event filters that exist within the system. But currently it displays the last filter written that does not necessarily correspond to the filters within the system. The system filter itself is not used to filter any events. The system filter is just a means to set filters of the events within it. Because this causes an ambiguous state when the system filter reads a filter string but the events within the system have different strings it is best to just show a boiler plate: ### global filter ### # Use this to set filters for multiple events. # Only events with the given fields will be affected. # If no events are modified, an error message will be displayed here. If an error occurs while writing to the system filter, the system filter will replace the boiler plate with the error message as it currently does. Signed-off-by: Steven Rostedt <[email protected]>
2011-11-02tracing: Restore system filter behaviorLi Zefan1-1/+6
Though not all events have field 'prev_pid', it was allowed to do this: # echo 'prev_pid == 100' > events/sched/filter but commit 75b8e98263fdb0bfbdeba60d4db463259f1fe8a2 (tracing/filter: Swap entire filter of events) broke it without any reason. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Li Zefan <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-10-31kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructurePaul Gortmaker1-0/+1
These files were getting <linux/module.h> via an implicit non-obvious path, but we want to crush those out of existence since they cost time during compiles of processing thousands of lines of headers for no reason. Give them the lightweight header that just contains the EXPORT_SYMBOL infrastructure. Signed-off-by: Paul Gortmaker <[email protected]>
2011-10-31tracing: fix event_subsystem ref countingIlya Dryomov1-1/+0
Fix a bug introduced by e9dbfae5, which prevents event_subsystem from ever being released. Ref_count was added to keep track of subsystem users, not for counting events. Subsystem is created with ref_count = 1, so there is no need to increment it for every event, we have nr_events for that. Fix this by touching ref_count only when we actually have a new user - subsystem_open(). Cc: [email protected] Signed-off-by: Ilya Dryomov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-10-31kernel: Add <linux/module.h> to files using it implicitlyPaul Gortmaker2-0/+2
These files are doing things like module_put and try_module_get so they need to call out the module.h for explicit inclusion, rather than getting it via <linux/device.h> which we ideally want to remove the module.h inclusion from. Signed-off-by: Paul Gortmaker <[email protected]>
2011-10-26Merge branch 'perf-core-for-linus' of ↵Linus Torvalds11-396/+819
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits) perf symbols: Increase symbol KSYM_NAME_LEN size perf hists browser: Refuse 'a' hotkey on non symbolic views perf ui browser: Use libslang to read keys perf tools: Fix tracing info recording perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads perf hists: Don't consider filtered entries when calculating column widths perf hists: Don't decay total_period for filtered entries perf hists browser: Honour symbol_conf.show_{nr_samples,total_period} perf hists browser: Do not exit on tab key with single event perf annotate browser: Don't change selection line when returning from callq perf tools: handle endianness of feature bitmap perf tools: Add prelink suggestion to dso update message perf script: Fix unknown feature comment perf hists browser: Apply the dso and thread filters when merging new batches perf hists: Move the dso and thread filters from hist_browser perf ui browser: Honour the xterm colors perf top tui: Give color hints just on the percentage, like on --stdio perf ui browser: Make the colors configurable and change the defaults perf tui: Remove unneeded call to newtCls on startup perf hists: Don't format the percentage on hist_entry__snprintf ... Fix up conflicts in arch/x86/kernel/kprobes.c manually. Ingo's tree did the insane "add volatile to const array", which just doesn't make sense ("volatile const"?). But we could remove the const *and* make the array volatile to make doubly sure that gcc doesn't optimize it away.. Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the reader_lock has been turned into a raw lock by the core locking merge, and there was a new user of it introduced in this perf core merge. Make sure that new use also uses the raw accessor functions.
2011-10-26Merge branch 'core-locking-for-linus' of ↵Linus Torvalds3-34/+34
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock() lockdep: Comment all warnings lib: atomic64: Change the type of local lock to raw_spinlock_t locking, lib/atomic64: Annotate atomic64_lock::lock as raw locking, x86, iommu: Annotate qi->q_lock as raw locking, x86, iommu: Annotate irq_2_ir_lock as raw locking, x86, iommu: Annotate iommu->register_lock as raw locking, dma, ipu: Annotate bank_lock as raw locking, ARM: Annotate low level hw locks as raw locking, drivers/dca: Annotate dca_lock as raw locking, powerpc: Annotate uic->lock as raw locking, x86: mce: Annotate cmci_discover_lock as raw locking, ACPI: Annotate c3_lock as raw locking, oprofile: Annotate oprofilefs lock as raw locking, video: Annotate vga console lock as raw locking, latencytop: Annotate latency_lock as raw locking, timer_stats: Annotate table_lock as raw locking, rwsem: Annotate inner lock as raw locking, semaphores: Annotate inner lock as raw locking, sched: Annotate thread_group_cputimer as raw ... Fix up conflicts in kernel/posix-cpu-timers.c manually: making cputimer->cputime a raw lock conflicted with the ABBA fix in commit bcd5cff7216f ("cputimer: Cure lock inversion").
2011-10-14tracing: Fix returning of duplicate data after EOF in trace_pipe_rawSteven Rostedt1-2/+2
The trace_pipe_raw handler holds a cached page from the time the file is opened to the time it is closed. The cached page is used to handle the case of the user space buffer being smaller than what was read from the ring buffer. The left over buffer is held in the cache so that the next read will continue where the data left off. After EOF is returned (no more data in the buffer), the index of the cached page is set to zero. If a user app reads the page again after EOF, the check in the buffer will see that the cached page is less than page size and will return the cached page again. This will cause reading the trace_pipe_raw again after EOF to return duplicate data, making the output look like the time went backwards but instead data is just repeated. The fix is to not reset the index right after all data is read from the cache, but to reset it after all data is read and more data exists in the ring buffer. Cc: stable <[email protected]> Reported-by: Jeremy Eder <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-10-14ftrace: Fix README to state tracing_on to start/stop tracingGeunsik Lim1-2/+2
tracing_enabled option is deprecated. To start/stop tracing, write to /sys/kernel/debug/tracing/tracing_on without tracing_enabled. This patch is based on Linux 3.1.0-rc1 Signed-off-by: Geunsik Lim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-10-12Merge branch 'tip/perf/core' of git://github.com/rostedt/linux into perf/coreIngo Molnar10-383/+766
2011-10-11tracing: Do not allocate buffer for trace_markerSteven Rostedt1-28/+83
When doing intense tracing, the kmalloc inside trace_marker can introduce side effects to what is being traced. As trace_marker() is used by userspace to inject data into the kernel ring buffer, it needs to do so with the least amount of intrusion to the operations of the kernel or the user space application. As the ring buffer is designed to write directly into the buffer without the need to make a temporary buffer, and userspace already went through the hassle of knowing how big the write will be, we can simply pin the userspace pages and write the data directly into the buffer. This improves the impact of tracing via trace_marker tremendously! Thanks to Peter Zijlstra and Thomas Gleixner for pointing out the use of get_user_pages_fast() and kmap_atomic(). Suggested-by: Thomas Gleixner <[email protected]> Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-10-11tracing: Warn on output if the function tracer was found corruptedSteven Rostedt3-0/+25
As the function tracer is very intrusive, lots of self checks are performed on the tracer and if something is found to be strange it will shut itself down keeping it from corrupting the rest of the kernel. This shutdown may still allow functions to be traced, as the tracing only stops new modifications from happening. Trying to stop the function tracer itself can cause more harm as it requires code modification. Although a WARN_ON() is executed, a user may not notice it. To help the user see that something isn't right with the tracing of the system a big warning is added to the output of the tracer that lets the user know that their data may be incomplete. Reported-by: Thomas Gleixner <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-10-10ftrace/kprobes: Fix not to delete probes if in useMasami Hiramatsu1-9/+49
Fix kprobe-tracer not to delete a probe if the probe is in use. In that case, delete operation will return -EBUSY. This bug can cause a kernel panic if enabled probes are deleted during perf record. (Add some probes on functions) sh-4.2# perf probe --del probe:\* sh-4.2# exit (kernel panic) This is originally reported on the fedora bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=742383 I've also checked that this problem doesn't happen on tracepoints when module removing because perf event locks target module. $ sudo ./perf record -e xfs:\* -aR sh sh-4.2# rmmod xfs ERROR: Module xfs is in use sh-4.2# exit [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.203 MB perf.data (~8862 samples) ] Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/20111004104438.14591.6553.stgit@fedora15 Signed-off-by: Steven Rostedt <[email protected]>
2011-10-07Merge branch 'pm-runtime' into pm-for-linusRafael J. Wysocki2-0/+23
* pm-runtime: PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is set PM / Runtime: Replace dev_dbg() with trace_rpm_*() PM / Runtime: Introduce trace points for tracing rpm_* functions PM / Runtime: Don't run callbacks under lock for power.irq_safe set USB: Add wakeup info to debugging messages PM / Runtime: pm_runtime_idle() can be called in atomic context PM / Runtime: Add macro to test for runtime PM events PM / Runtime: Add might_sleep() to runtime PM functions
2011-09-29PM / Tracing: build rpm-traces.c only if CONFIG_PM_RUNTIME is setMing Lei1-0/+2
Do not build kernel/trace/rpm-traces.c if CONFIG_PM_RUNTIME is not set, which avoids a build failure. [rjw: Added the changelog and modified the subject slightly.] Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2011-09-27PM / Runtime: Introduce trace points for tracing rpm_* functionsMing Lei2-0/+21
This patch introduces 3 trace points to prepare for tracing rpm_idle/rpm_suspend/rpm_resume functions, so we can use these trace points to replace the current dev_dbg(). Signed-off-by: Ming Lei <[email protected]> Acked-by: Steven Rostedt <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2011-09-22tracing: Fix preemptirqsoff tracer to not stop at preempt offSteven Rostedt1-2/+2
If irqs are disabled when preemption count reaches zero, the preemptirqsoff tracer should not flag that as the end. When interrupts are enabled and preemption count is not zero the preemptirqsoff correctly continues its tracing. Signed-off-by: Steven Rostedt <[email protected]>
2011-09-19tracing: Add a counter clock for those that do not trust clocksSteven Rostedt2-0/+13
When debugging tight race conditions, it can be helpful to have a synchronized tracing method. Although in most cases the global clock provides this functionality, if timings is not the issue, it is more comforting to know that the order of events really happened in a precise order. Instead of using a clock, add a "counter" that is simply an incrementing atomic 64bit counter that orders the events as they are perceived to happen. The trace_clock_counter() is added from the attempt by Peter Zijlstra trying to convert the trace_clock_global() to it. I took Peter's counter code and made trace_clock_counter() instead, and added it to the choice of clocks. Just echo counter > /debug/tracing/trace_clock to activate it. Requested-by: Thomas Gleixner <[email protected]> Requested-by: Peter Zijlstra <[email protected]> Reviewed-By: Valdis Kletnieks <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-09-13locking, tracing: Annotate tracing locks as rawThomas Gleixner3-34/+34
The tracing locks can be taken in atomic context and therefore cannot be preempted on -rt - annotate it. In mainline this change documents the low level nature of the lock - otherwise there's no functional difference. Lockdep and Sparse checking will work as usual. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-08-30trace: Add ring buffer stats to measure rate of eventsVaibhav Nagarnaik2-1/+82
The stats file under per_cpu folder provides the number of entries, overruns and other statistics about the CPU ring buffer. However, the numbers do not provide any indication of how full the ring buffer is in bytes compared to the overall size in bytes. Also, it is helpful to know the rate at which the cpu buffer is filling up. This patch adds an entry "bytes: " in printed stats for per_cpu ring buffer which provides the actual bytes consumed in the ring buffer. This field includes the number of bytes used by recorded events and the padding bytes added when moving the tail pointer to next page. It also adds the following time stamps: "oldest event ts:" - the oldest timestamp in the ring buffer "now ts:" - the timestamp at the time of reading The field "now ts" provides a consistent time snapshot to the userspace when being read. This is read from the same trace clock used by tracing event timestamps. Together, these values provide the rate at which the buffer is filling up, from the formula: bytes / (now_ts - oldest_event_ts) Signed-off-by: Vaibhav Nagarnaik <[email protected]> Cc: Michael Rubin <[email protected]> Cc: David Sharp <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-30trace: Add a new readonly entry to report total buffer sizeVaibhav Nagarnaik1-0/+33
The current file "buffer_size_kb" reports the size of per-cpu buffer and not the overall memory allocated which could be misleading. A new file "buffer_total_size_kb" adds up all the enabled CPU buffer sizes and reports it. This is only a readonly entry. Signed-off-by: Vaibhav Nagarnaik <[email protected]> Cc: Michael Rubin <[email protected]> Cc: David Sharp <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-30tracing: Add preempt disable for filter self testSteven Rostedt1-0/+6
The self testing for event filters does not really need preemption disabled as there are no races at the time of testing, but the functions it calls uses rcu_dereference_sched() which will complain if preemption is not disabled. Cc: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Add startup tests for events filterJiri Olsa4-0/+264
Adding automated tests running as late_initcall. Tests are compiled in with CONFIG_FTRACE_STARTUP_TEST option. Adding test event "ftrace_test_filter" used to simulate filter processing during event occurance. String filters are compiled and tested against several test events with different values. Also testing that evaluation of explicit predicates is ommited due to the lazy filter evaluation. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Change filter_match_preds function to use walk_pred_treeJiri Olsa1-66/+58
Changing filter_match_preds function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Change fold_pred function to use walk_pred_treeJiri Olsa1-35/+33
Changing fold_pred_tree function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Change fold_pred_tree function to use walk_pred_treeJiri Olsa1-45/+20
Changing fold_pred_tree function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Change count_leafs function to use walk_pred_treeJiri Olsa1-33/+14
Changing count_leafs function to use unified predicates tree processing. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Unify predicate tree walking, change check_pred_tree ↵Jiri Olsa1-51/+86
function to use it Adding walk_pred_tree function to be used for walking throught the filter predicates. For each predicate the callback function is called, allowing users to add their own functionality or customize their way through the filter predicates. Changing check_pred_tree function to use walk_pred_tree. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Simplify tracepoint event lookupJiri Olsa1-6/+3
We dont need to perform lookup through the ftrace_events list, instead we can use the 'tp_event' field. Each perf_event contains tracepoint event field 'tp_event', which got initialized during the tracepoint event initialization. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Remove field_name from filter_pred structJiri Olsa2-51/+13
The field_name was used just for finding event's fields. This way we don't need to care about field_name allocation/free. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Separate predicate init and filter additionJiri Olsa1-33/+23
Making the code cleaner by having one function to fully prepare the predicate (create_pred), and another to add the predicate to the filter (filter_add_pred). As a benefit, this way the dry_run flag stays only inside the replace_preds function and is not passed deeper. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19tracing/filter: Use static allocation for filter predicatesJiri Olsa1-41/+16
Don't dynamically allocate filter_pred struct, use static memory. This way we can get rid of the code managing the dynamic filter_pred struct object. The create_pred function integrates create_logical_pred function. This way the static predicate memory is returned only from one place. Signed-off-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]>
2011-08-19Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds1-5/+16
* 'for-linus' of git://git.kernel.dk/linux-block: (23 commits) Revert "cfq: Remove special treatment for metadata rqs." block: fix flush machinery for stacking drivers with differring flush flags block: improve rq_affinity placement blktrace: add FLUSH/FUA support Move some REQ flags to the common bio/request area allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH xen/blkback: Make description more obvious. cfq-iosched: Add documentation about idling block: Make rq_affinity = 1 work as expected block: swim3: fix unterminated of_device_id table block/genhd.c: remove useless cast in diskstats_show() drivers/cdrom/cdrom.c: relax check on dvd manufacturer value drivers/block/drbd/drbd_nl.c: use bitmap_parse instead of __bitmap_parse bsg-lib: add module.h include cfq-iosched: Reduce linked group count upon group destruction blk-throttle: correctly determine sync bio loop: fix deadlock when sysfs and LOOP_CLR_FD race against each other loop: add BLK_DEV_LOOP_MIN_COUNT=%i to allow distros 0 pre-allocated loop devices loop: add management interface for on-demand device allocation loop: replace linked list of allocated devices with an idr index ...
2011-08-11blktrace: add FLUSH/FUA supportNamhyung Kim1-5/+16
Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or FUA follows WRITE, use the same 'F' flag for both cases and distinguish them by their (relative) position. The end results look like (other flags might be shown also): - WRITE: W - WRITE_FLUSH: FW - WRITE_FUA: WF - WRITE_FLUSH_FUA: FWF Note that we reuse TC_BARRIER due to lack of bit space of act_mask so that the older versions of blktrace tools will report flush requests as barriers from now on. Cc: Steven Rostedt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: Jeff Moyer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2011-08-10tracing: Clean up tb_fmt to not give faulty compile warningSteven Rostedt1-9/+10
gcc incorrectly states that the variable "fmt" is uninitialized when CC_OPITMIZE_FOR_SIZE is set. Instead of just blindly setting fmt to NULL, the code is cleaned up a little to be a bit easier for humans to follow, as well as gcc to know the variables are initialized. Signed-off-by: Steven Rostedt <[email protected]>
2011-08-05Merge branch 'linus' into perf/urgentIngo Molnar16-488/+860
Merge reason: Include most of the merge window trees, to do fixes on top. Signed-off-by: Ingo Molnar <[email protected]>
2011-07-26atomic: use <linux/atomic.h>Arun Sharma2-2/+2
This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: David Miller <[email protected]> Cc: Eric Dumazet <[email protected]> Acked-by: Mike Frysinger <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-07-25trace events: Update version number reference to new 3.x scheme for ↵Jesper Juhl1-1/+1
EVENT_POWER_TRACING_DEPRECATED What was scheduled to be 2.6.41 is now going to be 3.1 . Signed-off-by: Jesper Juhl <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-07-21Merge branch 'tip/perf/core' of ↵Ingo Molnar7-152/+419
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2011-07-21Merge branch 'perf/urgent' into perf/coreIngo Molnar5-30/+135
Merge reason: pick up the latest fixes - they won't make v3.0. Signed-off-by: Ingo Molnar <[email protected]>
2011-07-15tracing/kprobe: Update symbol reference when loading moduleMasami Hiramatsu1-1/+36
Since the address of a module-local variable can only be solved after the target module is loaded, the symbol fetch-argument should be updated when loading target module. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/20110627072703.6528.75042.stgit@fedora15 Signed-off-by: Steven Rostedt <[email protected]>
2011-07-15tracing/kprobes: Support module init function probingMasami Hiramatsu1-26/+138
To support probing module init functions, kprobe-tracer allows user to define a probe on non-existed function when it is given with a module name. This also enables user to set a probe on a function on a specific module, even if a same name (but different) function is locally defined in another module. The module name must be in the front of function name and separated by a ':'. e.g. btrfs:btrfs_init_sysfs Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/20110627072656.6528.89970.stgit@fedora15 Signed-off-by: Steven Rostedt <[email protected]>