|
Since commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops") was applied to the kernel, splice() and
sendfile() calls on the trace file (/sys/kernel/debug/tracing
/trace) fail with EINVAL.
This patch restores these system calls by initializing the splice_read
callback in the file_operations of the trace file. Only the read side is
enabled.
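A minimal sketch of the shape of the fix (the exact callbacks shown here
are assumptions, not taken from this patch):
static const struct file_operations tracing_fops = {
	.open		= tracing_open,
	.read		= seq_read,
	.read_iter	= seq_read_iter,
	.splice_read	= generic_file_splice_read,
	/* remaining callbacks unchanged */
};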
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Smatch reports this warning:
kernel/trace/ftrace.c:2594:19: warning:
symbol 'direct_ops' was not declared. Should it be static?
The variable direct_ops is only used in ftrace.c, so it should be static
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Tom Rix <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The hwlat tracer will end up starting multiple per-cpu hwlatd threads
when run with the following script:
#!/bin/sh
cd /sys/kernel/debug/tracing
echo 0 > tracing_on
echo hwlat > current_tracer
echo per-cpu > hwlat_detector/mode
echo 100000 > hwlat_detector/width
echo 200000 > hwlat_detector/window
echo 1 > tracing_on
To fix the issue, check if the hwlatd thread for the cpu is already
running, before starting a new one. Along with the previous patch, this
avoids running multiple instances of the same CPU thread on the system.
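A sketch of the check (the per-cpu variable and field names are
assumptions):
	/* Do not start a new hwlatd thread if one is already running */
	if (per_cpu(hwlat_per_cpu_data, cpu).kthread)
		return 0;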
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: f46b16520a087 ("trace/hwlat: Implement the per-cpu mode")
Signed-off-by: Tero Kristo <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Do not wipe the contents of the per-cpu kthread data when starting the
tracer, as doing so loses track of already running instances and can lead
to additional per-cpu threads being started later.
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: f46b16520a087 ("trace/hwlat: Implement the per-cpu mode")
Signed-off-by: Tero Kristo <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Change the storage-class-specifier of several symbols to static.
Smatch reports several similar warnings:
kernel/trace/trace_osnoise.c:220:1: warning:
symbol '__pcpu_scope_per_cpu_osnoise_var' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:243:1: warning:
symbol '__pcpu_scope_per_cpu_timerlat_var' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:335:14: warning:
symbol 'interface_lock' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:2242:5: warning:
symbol 'timerlat_min_period' was not declared. Should it be static?
kernel/trace/trace_osnoise.c:2243:5: warning:
symbol 'timerlat_max_period' was not declared. Should it be static?
These variables are only used in trace_osnoise.c, so they should be static.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Tom Rix <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Overwriting the error code with the deletion result may cause the
function to return 0 despite encountering an error. Commit b111545d26c0
("tracing: Remove the useless value assignment in
test_create_synth_event()") solves a similar issue by
returning the original error code, so this patch does the same.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
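A sketch of the pattern (the function names here are hypothetical):
	ret = run_test();
	if (ret) {
		/* clean up, but return the original error code, not the
		 * result of the deletion */
		delete_test_event();
		return ret;
	}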
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Anton Gusev <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
A while ago, the trace events had the following:
rcu_read_lock_sched_notrace();
rcu_dereference_sched(...);
rcu_read_unlock_sched_notrace();
If the tracepoint is enabled, it could trigger RCU issues if called in
the wrong place. And this warning was only triggered if lockdep was
enabled. If the tracepoint was never enabled with lockdep, the bug would
not be caught. To handle this, the above sequence was run when lockdep was
enabled regardless of whether the tracepoint was enabled or not (although
the always-enabled code really didn't do anything useful, it would still
trigger the warning).
But a lot has changed since that lockdep code was added. For one, that
sequence no longer triggers any warning. For another, the tracepoint, when
enabled, doesn't even perform that sequence anymore.
The main check we care about today is whether RCU is "watching" or not.
So if lockdep is enabled, always call rcu_is_watching(), and trigger a
warning if it is not watching (tracepoints require RCU to be watching).
Note, that old sequence did add a bit of overhead when lockdep was enabled,
and with the latest kernel updates, would cause the system to slow down
enough to trigger kernel "stalled" warnings.
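A sketch of the idea, placed in the tracepoint definition macro (the
surrounding macro context and the cond parameter are assumptions):
	if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {
		WARN_ON_ONCE(!rcu_is_watching());
	}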
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Joel Fernandes <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Paul E. McKenney <[email protected]>
Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When CONFIG_FUNCTION_GRAPH_TRACER is disabled, __kcfi_typeid_ftrace_stub_graph
is missing, causing a link failure:
ld.lld: error: undefined symbol: __kcfi_typeid_ftrace_stub_graph
referenced by arch/x86/kernel/ftrace_64.o:(__cfi_ftrace_stub_graph) in archive vmlinux.a
Mark the reference to it as conditional on the same symbol, as
is done on arm64.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Fixes: 883bbbffa5a4 ("ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()")
See-also: 2598ac6ec493 ("arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
KASAN reported the following problem:
BUG: KASAN: use-after-free in lookup_rec
Read of size 8 at addr ffff000199270ff0 by task modprobe
CPU: 2 Comm: modprobe
Call trace:
kasan_report
__asan_load8
lookup_rec
ftrace_location
arch_check_ftrace_location
check_kprobe_address_safe
register_kprobe
When checking pg->records[pg->index - 1].ip in lookup_rec(), it can get a
pg which is newly added to ftrace_pages_start in ftrace_process_locs().
Before the first pg->index++, index is 0 and accessing pg->records[-1].ip
will cause this problem.
Don't check the ip when pg->index is 0.
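A sketch of the guard in the lookup_rec() loop (loop shape assumed from
the surrounding code):
	/* Skip pages that do not have any records yet, so that
	 * pg->records[pg->index - 1] is never read with index == 0 */
	if (!pg->index)
		continue;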
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Fixes: 9644302e3315 ("ftrace: Speed up search by skipping pages by address")
Suggested-by: Steven Rostedt (Google) <[email protected]>
Signed-off-by: Chen Zhongjin <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The function hist_field_name() cannot handle being passed a NULL field
parameter. It should never be NULL, but due to a previous bug, NULL was
passed to the function and the kernel crashed due to a NULL dereference.
Mark Rutland reported this to me on IRC.
The bug was fixed, but to prevent future bugs from crashing the kernel,
check the field and add a WARN_ON() if it is NULL.
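A sketch of the guard (the empty-string fallback is an assumption):
	/* at the top of hist_field_name(), with field_name = "" */
	if (WARN_ON_ONCE(!field))
		return field_name;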
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Reported-by: Mark Rutland <[email protected]>
Fixes: c6afad49d127f ("tracing: Add hist trigger 'sym' and 'sym-offset' modifiers")
Tested-by: Mark Rutland <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Histogram values cannot be strings, stacktraces, graphs, symbols,
syscalls, or grouped in buckets or log2. Return an error if a value is set
up with one of these modifiers.
Note, the histogram code was not prepared to handle these modifiers for
values, and this caused a bug.
Mark Rutland reported:
# echo 'p:copy_to_user __arch_copy_to_user n=$arg2' >> /sys/kernel/tracing/kprobe_events
# echo 'hist:keys=n:vals=hitcount.buckets=8:sort=hitcount' > /sys/kernel/tracing/events/kprobes/copy_to_user/trigger
# cat /sys/kernel/tracing/events/kprobes/copy_to_user/hist
[ 143.694628] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 143.695190] Mem abort info:
[ 143.695362] ESR = 0x0000000096000004
[ 143.695604] EC = 0x25: DABT (current EL), IL = 32 bits
[ 143.695889] SET = 0, FnV = 0
[ 143.696077] EA = 0, S1PTW = 0
[ 143.696302] FSC = 0x04: level 0 translation fault
[ 143.702381] Data abort info:
[ 143.702614] ISV = 0, ISS = 0x00000004
[ 143.702832] CM = 0, WnR = 0
[ 143.703087] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000448f9000
[ 143.703407] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
[ 143.704137] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 143.704714] Modules linked in:
[ 143.705273] CPU: 0 PID: 133 Comm: cat Not tainted 6.2.0-00003-g6fc512c10a7c #3
[ 143.706138] Hardware name: linux,dummy-virt (DT)
[ 143.706723] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 143.707120] pc : hist_field_name.part.0+0x14/0x140
[ 143.707504] lr : hist_field_name.part.0+0x104/0x140
[ 143.707774] sp : ffff800008333a30
[ 143.707952] x29: ffff800008333a30 x28: 0000000000000001 x27: 0000000000400cc0
[ 143.708429] x26: ffffd7a653b20260 x25: 0000000000000000 x24: ffff10d303ee5800
[ 143.708776] x23: ffffd7a6539b27b0 x22: ffff10d303fb8c00 x21: 0000000000000001
[ 143.709127] x20: ffff10d303ec2000 x19: 0000000000000000 x18: 0000000000000000
[ 143.709478] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 143.709824] x14: 0000000000000000 x13: 203a6f666e692072 x12: 6567676972742023
[ 143.710179] x11: 0a230a6d6172676f x10: 000000000000002c x9 : ffffd7a6521e018c
[ 143.710584] x8 : 000000000000002c x7 : 7f7f7f7f7f7f7f7f x6 : 000000000000002c
[ 143.710915] x5 : ffff10d303b0103e x4 : ffffd7a653b20261 x3 : 000000000000003d
[ 143.711239] x2 : 0000000000020001 x1 : 0000000000000001 x0 : 0000000000000000
[ 143.711746] Call trace:
[ 143.712115] hist_field_name.part.0+0x14/0x140
[ 143.712642] hist_field_name.part.0+0x104/0x140
[ 143.712925] hist_field_print+0x28/0x140
[ 143.713125] event_hist_trigger_print+0x174/0x4d0
[ 143.713348] hist_show+0xf8/0x980
[ 143.713521] seq_read_iter+0x1bc/0x4b0
[ 143.713711] seq_read+0x8c/0xc4
[ 143.713876] vfs_read+0xc8/0x2a4
[ 143.714043] ksys_read+0x70/0xfc
[ 143.714218] __arm64_sys_read+0x24/0x30
[ 143.714400] invoke_syscall+0x50/0x120
[ 143.714587] el0_svc_common.constprop.0+0x4c/0x100
[ 143.714807] do_el0_svc+0x44/0xd0
[ 143.714970] el0_svc+0x2c/0x84
[ 143.715134] el0t_64_sync_handler+0xbc/0x140
[ 143.715334] el0t_64_sync+0x190/0x194
[ 143.715742] Code: a9bd7bfd 910003fd a90153f3 aa0003f3 (f9400000)
[ 143.716510] ---[ end trace 0000000000000000 ]---
Segmentation fault
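A sketch of the kind of check that is added (the exact flag mask is an
assumption based on the modifiers listed above):
	/* Reject modifiers that make no sense for a value field */
	if (hist_field->flags & (HIST_FIELD_FL_STRING |
				 HIST_FIELD_FL_STACKTRACE |
				 HIST_FIELD_FL_GRAPH |
				 HIST_FIELD_FL_SYM |
				 HIST_FIELD_FL_SYM_OFFSET |
				 HIST_FIELD_FL_SYSCALL |
				 HIST_FIELD_FL_BUCKET |
				 HIST_FIELD_FL_LOG2))
		return -EINVAL;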
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Fixes: c6afad49d127f ("tracing: Add hist trigger 'sym' and 'sym-offset' modifiers")
Reported-by: Mark Rutland <[email protected]>
Tested-by: Mark Rutland <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Remove unnecessary NULL assignment in create_new_subsystem().
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wang ShaoBo <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When the system has to keep running, the preferred method for tracing the
kernel is dynamic tracing (kprobes), but the drawback of this method is
that events can be lost, especially when tracing packets in the network
stack.
Livepatching provides a potential solution: reimplement the function you
want to replace and insert a static tracepoint into it.
In such a way, custom stable static tracepoints can be added without
rebooting the system.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Jianlin Lv <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
But, from Documentation/trace/ftrace.rst:
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
Many comments and Kconfig help messages in the tracing code still refer
to this older debugfs path, so let's update them to avoid confusion.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Ross Zwisler <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Fix a small problem with the histogram specification in the
Documentation, and change the example to show output using a
stacktrace field rather than the global stacktrace.
Link: https://lkml.kernel.org/r/f75f807dd4998249e513515f703a2ff7407605f4.1676063532.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The current code will always use the current stacktrace as a key even
if a stacktrace contained in a specific event field was specified.
For example, we expect to use the 'unsigned long[] stack' field in the
below event in the histogram:
# echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > /sys/kernel/debug/tracing/dynamic_events
# echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > /sys/kernel/debug/tracing/events/synthetic/block_lat/trigger
But in fact, when we type out the trigger, we see that it's using the
plain old global 'stacktrace' as the key, which is just the stacktrace
when the event was hit and not the stacktrace contained in the event,
which is what we want:
# cat /sys/kernel/debug/tracing/events/synthetic/block_lat/trigger
hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
And in fact, there's no code to actually retrieve the stacktrace from the
event, so we need to add HIST_FIELD_FN_STACK and hist_field_stack() to get
it and hook it into the trigger code. For now, since the stack is stored
just like a dynamic string, this could simply reuse the dynamic string
function, but it seems cleaner to have a dedicated function that can be
tweaked independently as necessary.
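A sketch of what such a getter can look like, mirroring the existing
dynamic-string getter (the signature and the __data_loc-style encoding are
assumptions based on the other hist_field helpers):
static u64 hist_field_stack(struct hist_field *hist_field,
			    struct tracing_map_elt *elt,
			    struct trace_buffer *buffer,
			    struct ring_buffer_event *rbe,
			    void *event)
{
	u32 item = *(u32 *)(event + hist_field->field->offset);
	int offset = item & 0xffff;	/* low 16 bits: data offset */

	return (u64)(unsigned long)(event + offset);
}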
Link: https://lkml.kernel.org/r/11aa614c82976adbfa4ea763dbe885b5fb01d59c.1676063532.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <[email protected]>
[ Fixed 32bit build warning reported by kernel test robot <[email protected]> ]
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Currently, there are a few problems when printing hist triggers and
trace output when using stacktrace variables. This fixes the problems
seen below:
# echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > /sys/kernel/debug/tracing/events/synthetic/block_lat/trigger
# cat /sys/kernel/debug/tracing/events/synthetic/block_lat/trigger
hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
# echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
# cat /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
hist:keys=next_pid:vals=hitcount:ts=common_timestamp.usecs,st=stacktrace.stacktrace:sort=hitcount:size=2048:clock=global if prev_state == 2 [active]
and also in the trace output (should be stack.stacktrace):
{ delta: ~ 100-199, stacktrace __schedule+0xa19/0x1520
Link: https://lkml.kernel.org/r/60bebd4e546728e012a7a2bcbf58716d48ba6edb.1676063532.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The max string length for a histogram variable is 256 bytes. The max depth
of a stacktrace is 16. With 8-byte words, that's 16 * 8 = 128 bytes, which
easily fits in the string variable. The histogram stacktrace is stored in
the string value (with the given max length) on the assumption that it
fits. To make sure this is always the case (should the stack trace depth
ever increase), add a BUILD_BUG_ON() to test it.
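A sketch of the assertion (constant names, and the extra slot for the
entry count, are assumptions):
	BUILD_BUG_ON((HIST_STACKTRACE_DEPTH + 1) * sizeof(long) >= STR_VAR_LEN_MAX);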
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Because stacktraces are saved in dynamic strings,
trace_event_raw_event_synth() uses strlen to determine the length of
the stack. Stacktraces may contain 0-bytes, though, in the saved
addresses, so the length found and passed to reserve() will be too
small.
Fix this by using the first unsigned long in the stack variables to
store the actual number of elements in the stack and have
trace_event_raw_event_synth() use that to determine the length of the
stack.
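A sketch of the layout and the resulting length calculation (variable
names are assumptions):
	/* producer side: element 0 holds the number of stack entries */
	str_val[0] = nr_entries;
	memcpy(&str_val[1], entries, nr_entries * sizeof(long));

	/* trace_event_raw_event_synth(): derive the length from element 0 */
	len = (*(unsigned long *)str_val + 1) * sizeof(unsigned long);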
Link: https://lkml.kernel.org/r/1ed6906cd9d6477ef2bd8e63c61de20a9ffe64d7.1676063532.git.zanussi@kernel.org
Signed-off-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add an "=<instance>" option to ftrace_boot_snapshot, where the named
instance will get a snapshot buffer and a snapshot will be taken at the
end of boot (which will save the boot traces).
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add a generic trace_array_puts() that can be used to "trace_puts()" into
an allocated trace_array instance. This is just another variant of
trace_array_printk().
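A sketch of how it can be used (the instance lookup and its name are
assumptions):
	struct trace_array *tr = trace_array_get_by_name("my_instance");

	if (tr)
		trace_array_puts(tr, "hello from my driver\n");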
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add the format of:
trace_instance=foo,sched:sched_switch,irq_handler_entry,initcall
That will create the "foo" instance and enable the sched_switch event
(where the "sched" system is explicitly specified), the irq_handler_entry
event, and all events under the initcall system.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add a kernel command line parameter to create tracing instances. This only
creates the instances at boot and does not yet enable any events for them.
Later changes will extend this command line to add enabling of events,
filters, and triggers. As well as possibly redirecting trace_printk()!
Link: https://lkml.kernel.org/r/[email protected]
Cc: Randy Dunlap <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The check for whether the field is a stack should only be done if the
field is not a string. But the code had:
} if (event->fields[i]->is_stack) {
and not
} else if (event->fields[i]->is_stack) {
which would cause the stack check to always be done. Worse yet, this also
included an "else" statement that was only to be run if the field was
neither a string nor a stack, but this code allowed it to be run if the
field was a string (and not a stack).
Also fixed some whitespace issues.
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Tom Zanussi <[email protected]>
Fixes: 00cf3d672a9d ("tracing: Allow synthetic events to pass around stacktraces")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
|
|
Smatch reports this representative issue:
samples/ftrace/ftrace-ops.c:15:14: warning: symbol 'nr_function_calls' was not declared. Should it be static?
nr_function_calls and several other global variables are only used in
ftrace-ops.c, so they should be static.
Also remove the explicit zero initialization of the static ints.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Tom Rix <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Calculating the average period requires a 64-bit division that leads
to a link failure on 32-bit architectures:
x86_64-linux-ld: samples/ftrace/ftrace-ops.o: in function `ftrace_ops_sample_init':
ftrace-ops.c:(.init.text+0x23b): undefined reference to `__udivdi3'
Use the div_u64() helper to do this instead. Since this is an init function that
is not called frequently, the runtime overhead is going to be acceptable.
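A sketch of the substitution (variable names are assumptions):
	/* was: avg = period / nr_function_calls; needs __udivdi3 on 32-bit */
	avg = div_u64(period, nr_function_calls);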
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Fixes: b56c68f705ca ("ftrace: Add sample with custom ops")
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When other architectures, which do not have the nospec functionality,
build their own direct-call functions for samples/ftrace/*.c, the
unconditional include of asm/nospec-branch.h has to be dealt with, or the
build fails with a "no such header file" error.
Commit ee3e2469b346 ("x86/ftrace: Make it call depth tracking aware")
includes asm/nospec-branch.h file-globally to provide CALL_DEPTH_ACCOUNT,
which is only needed by the x86 direct-call functions.
It seems better to move the include under #ifdef CONFIG_X86_64.
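Which gives, roughly:
#ifdef CONFIG_X86_64
#include <asm/nospec-branch.h>
#endif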
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Song Shuai <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
There is one dwc3 trace event declared as below:
DECLARE_EVENT_CLASS(dwc3_log_event,
TP_PROTO(u32 event, struct dwc3 *dwc),
TP_ARGS(event, dwc),
TP_STRUCT__entry(
__field(u32, event)
__field(u32, ep0state)
__dynamic_array(char, str, DWC3_MSG_MAX)
),
TP_fast_assign(
__entry->event = event;
__entry->ep0state = dwc->ep0state;
),
TP_printk("event (%08x): %s", __entry->event,
dwc3_decode_event(__get_str(str), DWC3_MSG_MAX,
__entry->event, __entry->ep0state))
);
The problem is that when the trace function is called, it allocates up to
DWC3_MSG_MAX bytes from the trace event buffer but never fills the buffer
during the fast assignment; the buffer is only filled when the output
function is called. This means that if the output function is never
called, the buffer is never used.
Add __get_buf(len), which acquires a buffer from iter->tmp_seq when the
trace output function is called and allows the user to write a string into
the acquired buffer.
The mentioned dwc3 trace event then changes to the following:
DECLARE_EVENT_CLASS(dwc3_log_event,
TP_PROTO(u32 event, struct dwc3 *dwc),
TP_ARGS(event, dwc),
TP_STRUCT__entry(
__field(u32, event)
__field(u32, ep0state)
),
TP_fast_assign(
__entry->event = event;
__entry->ep0state = dwc->ep0state;
),
TP_printk("event (%08x): %s", __entry->event,
dwc3_decode_event(__get_buf(DWC3_MSG_MAX), DWC3_MSG_MAX,
__entry->event, __entry->ep0state))
);
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Signed-off-by: Linyu Yuan <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Most shell command snippets (echo/cat) and their output are already in
literal code blocks. However, a few still aren't wrapped, which makes the
htmldocs output ugly.
Wrap the remaining unwrapped snippets, while also fixing the recent kernel
test robot warnings.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Link: https://lore.kernel.org/linux-doc/[email protected]/
Fixes: 88238513bb2671 ("tracing/histogram: Document variable stacktrace")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Bagas Sanjaya <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
No slack time is being passed, just use schedule_hrtimeout().
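A sketch of the substitution (the timer expiry variable and mode are
assumptions):
	/* was: schedule_hrtimeout_range(&to, 0, HRTIMER_MODE_ABS); */
	schedule_hrtimeout(&to, HRTIMER_MODE_ABS);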
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Signed-off-by: Davidlohr Bueso <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The bpf events are created by the same macro magic as tracefs trace
events are. But to hook into bpf, it has its own code. It duplicates many
of the same macros as the tracefs macros and this is an issue because it
misses bug fixes as well as any new enhancements that come with the other
trace macros.
As the trace macros have been put into their own staging files, have bpf
take advantage of this and use the tracefs stage 6 macros that the "fast
assign" portion of the trace event macro uses.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Cc: [email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Reported-by: Linyu Yuan <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The perf events are created by the same macro magic as tracefs trace
events are. But to hook into perf, it has its own code. It duplicates many
of the same macros as the tracefs macros and this is an issue because it
misses bug fixes as well as any new enhancements that come with the other
trace macros.
As the trace macros have been put into their own staging files, have perf
take advantage of this and use the tracefs stage 6 macros that the "fast
assign" portion of the trace event macro uses.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Cc: Frederic Weisbecker <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Reported-by: Linyu Yuan <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Update the selftests to include a test of passing a stacktrace between the
events of a synthetic event.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Add a little documentation (and a useful example) of how a stacktrace can
be used within a histogram variable and synthetic event.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Now that stacktraces can be part of synthetic events, allow a key to be
typed as a stacktrace.
# cd /sys/kernel/tracing
# echo 's:block_lat u64 delta; unsigned long stack[];' >> dynamic_events
# echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger
# echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,st2=$st:onmatch(sched.sched_switch).trace(block_lat,$delta,$st2)' >> events/sched/sched_switch/trigger
# echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger
# cat events/synthetic/block_lat/hist
# event histogram
#
# trigger info: hist:keys=delta.buckets=100,stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
#
{ delta: ~ 0-99, stacktrace:
event_hist_trigger+0x464/0x480
event_triggers_call+0x52/0xe0
trace_event_buffer_commit+0x193/0x250
trace_event_raw_event_sched_switch+0xfc/0x150
__traceiter_sched_switch+0x41/0x60
__schedule+0x448/0x7b0
schedule_idle+0x26/0x40
cpu_startup_entry+0x19/0x20
start_secondary+0xed/0xf0
secondary_startup_64_no_verify+0xe0/0xeb
} hitcount: 6
{ delta: ~ 0-99, stacktrace:
event_hist_trigger+0x464/0x480
event_triggers_call+0x52/0xe0
trace_event_buffer_commit+0x193/0x250
trace_event_raw_event_sched_switch+0xfc/0x150
__traceiter_sched_switch+0x41/0x60
__schedule+0x448/0x7b0
schedule_idle+0x26/0x40
cpu_startup_entry+0x19/0x20
__pfx_kernel_init+0x0/0x10
arch_call_rest_init+0xa/0x24
start_kernel+0x964/0x98d
secondary_startup_64_no_verify+0xe0/0xeb
} hitcount: 3
{ delta: ~ 0-99, stacktrace:
event_hist_trigger+0x464/0x480
event_triggers_call+0x52/0xe0
trace_event_buffer_commit+0x193/0x250
trace_event_raw_event_sched_switch+0xfc/0x150
__traceiter_sched_switch+0x41/0x60
__schedule+0x448/0x7b0
schedule+0x5a/0xb0
worker_thread+0xaf/0x380
kthread+0xe9/0x110
ret_from_fork+0x2c/0x50
} hitcount: 1
{ delta: ~ 100-199, stacktrace:
event_hist_trigger+0x464/0x480
event_triggers_call+0x52/0xe0
trace_event_buffer_commit+0x193/0x250
trace_event_raw_event_sched_switch+0xfc/0x150
__traceiter_sched_switch+0x41/0x60
__schedule+0x448/0x7b0
schedule_idle+0x26/0x40
cpu_startup_entry+0x19/0x20
start_secondary+0xed/0xf0
secondary_startup_64_no_verify+0xe0/0xeb
} hitcount: 15
[..]
{ delta: ~ 8500-8599, stacktrace:
event_hist_trigger+0x464/0x480
event_triggers_call+0x52/0xe0
trace_event_buffer_commit+0x193/0x250
trace_event_raw_event_sched_switch+0xfc/0x150
__traceiter_sched_switch+0x41/0x60
__schedule+0x448/0x7b0
schedule_idle+0x26/0x40
cpu_startup_entry+0x19/0x20
start_secondary+0xed/0xf0
secondary_startup_64_no_verify+0xe0/0xeb
} hitcount: 1
Totals:
Hits: 89
Entries: 11
Dropped: 0
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Allow a stacktrace from one event to be displayed by the end event of a
synthetic event. This is very useful when looking for the longest latency
of a sleep or something blocked on I/O.
# cd /sys/kernel/tracing/
# echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
# echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 1||prev_state == 2' > events/sched/sched_switch/trigger
# echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
The above creates a "block_lat" synthetic event that takes the stacktrace
of when a task schedules out in either the interruptible or uninterruptible
state, and on a new per-process max $delta (the time it was scheduled
out), will print the process id and the stacktrace.
# echo 1 > events/synthetic/block_lat/enable
# cat trace
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |
kworker/u16:0-767 [006] d..4. 560.645045: block_lat: pid=767 delta=66 stack=STACK:
=> __schedule
=> schedule
=> pipe_read
=> vfs_read
=> ksys_read
=> do_syscall_64
=> 0x966000aa
<idle>-0 [003] d..4. 561.132117: block_lat: pid=0 delta=413787 stack=STACK:
=> __schedule
=> schedule
=> schedule_hrtimeout_range_clock
=> do_sys_poll
=> __x64_sys_poll
=> do_syscall_64
=> 0x966000aa
<...>-153 [006] d..4. 562.068407: block_lat: pid=153 delta=54 stack=STACK:
=> __schedule
=> schedule
=> io_schedule
=> rq_qos_wait
=> wbt_wait
=> __rq_qos_throttle
=> blk_mq_submit_bio
=> submit_bio_noacct_nocheck
=> ext4_bio_write_page
=> mpage_submit_page
=> mpage_process_page_bufs
=> mpage_prepare_extent_to_map
=> ext4_do_writepages
=> ext4_writepages
=> do_writepages
=> __writeback_single_inode
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Allow saving stacktraces into a histogram variable. This will be used by
synthetic events to allow a stacktrace from one event to be passed to and
displayed by another event.
The special keyword "stacktrace" is used to trigger a stack trace for the
event that the histogram trigger is attached to:
echo 'hist:keys=pid:st=stacktrace' > events/sched/sched_waking/trigger
Currently nothing can get access to the "$st" variable above that contains
the stack trace, but that will soon change.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When tracing a dynamic string field for a synthetic event, the offset
calculation for where to write the next event can use struct_size() to
find what the current size of the structure is.
This simplifies the code and makes it less error prone.
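A sketch of the calculation (the entry layout with a trailing 'fields'
array and the 'n_u64' count are assumptions):
	/* offset of the dynamic data area, right after the static fields */
	data_size = struct_size(entry, fields, event->n_u64);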
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Ross Zwisler <[email protected]>
Cc: Ching-lin Yu <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
A previous commit, 7433632c9ff6, added checks to ring_buffer_wake_waiters()
because buffer, buffer->buffers and buffer->buffers[cpu] can be NULL.
However, in the same call stack, these variables are also used in
ring_buffer_free_read_page():
However, in the same call stack, these variables are also used in
ring_buffer_free_read_page():
tracing_buffers_release()
ring_buffer_wake_waiters(iter->array_buffer->buffer)
cpu_buffer = buffer->buffers[cpu] -> Add checks by previous commit
ring_buffer_free_read_page(iter->array_buffer->buffer)
cpu_buffer = buffer->buffers[cpu] -> No check
Thus, to avoid possible null-pointer dereferences, the same checks should
be added there.
These results are reported by a static tool designed by myself.
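A sketch of the added checks in ring_buffer_free_read_page() (placed
before cpu_buffer is dereferenced):
	if (!buffer || !buffer->buffers || !buffer->buffers[cpu])
		return;

	cpu_buffer = buffer->buffers[cpu];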
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: TOTE Robot <[email protected]>
Signed-off-by: Jia-Ju Bai <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
When reworking core ftrace code or architectural ftrace code, it's often
necessary to test/analyse/benchmark a number of ftrace_ops
configurations. This patch adds a module which can be used to explore
some of those configurations.
I'm using this to benchmark various options for changing the way
trampolines and handling of ftrace_ops work on arm64, and ensuring other
architectures aren't adversely affected.
For example, in a QEMU+KVM VM running on a 2GHz Xeon E5-2660
workstation, loading the module in various configurations produces:
| # insmod ftrace-ops.ko
| ftrace_ops: registering:
| relevant ops: 1
| tracee: tracee_relevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| irrelevant ops: 0
| tracee: tracee_irrelevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| saving registers: NO
| assist recursion: NO
| assist RCU: NO
| ftrace_ops: Attempted 100000 calls to tracee_relevant [ftrace_ops] in 1681558ns (16ns / call)
| # insmod ftrace-ops.ko nr_ops_irrelevant=5
| ftrace_ops: registering:
| relevant ops: 1
| tracee: tracee_relevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| irrelevant ops: 5
| tracee: tracee_irrelevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| saving registers: NO
| assist recursion: NO
| assist RCU: NO
| ftrace_ops: Attempted 100000 calls to tracee_relevant [ftrace_ops] in 1693042ns (16ns / call)
| # insmod ftrace-ops.ko nr_ops_relevant=2
| ftrace_ops: registering:
| relevant ops: 2
| tracee: tracee_relevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| irrelevant ops: 0
| tracee: tracee_irrelevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| saving registers: NO
| assist recursion: NO
| assist RCU: NO
| ftrace_ops: Attempted 100000 calls to tracee_relevant [ftrace_ops] in 11965582ns (119ns / call)
| # insmod ftrace-ops.ko save_regs=true
| ftrace_ops: registering:
| relevant ops: 1
| tracee: tracee_relevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| irrelevant ops: 0
| tracee: tracee_irrelevant [ftrace_ops]
| tracer: ops_func_nop [ftrace_ops]
| saving registers: YES
| assist recursion: NO
| assist RCU: NO
| ftrace_ops: Attempted 100000 calls to tracee_relevant [ftrace_ops] in 4459624ns (44ns / call)
Link: https://lkml.kernel.org/r/[email protected]
Cc: Florent Revest <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
With the new filter logic of passing in the name of a function to match an
instruction pointer (or the address of the function), add a test to make
sure that it is functional.
This is also the first test to test plain filtering. The filtering has
been tested via the trigger logic, which uses the same code, but there was
nothing to test just the event filter, so this test is the first to add
such a case.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Zheng Yejian <[email protected]>
Cc: [email protected]
Suggested-by: Masami Hiramatsu (Google) <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
There have been several times where an event records a function address in
one of its fields and I needed to filter on that address for a specific
function name. It required looking up the function in kallsyms, finding
its size, and doing a compare of "field >= function_start && field <
function_end".
But this would change from boot to boot and is unreliable in scripts.
Also, it is useful to have this at boot up, where the addresses will not
be known. For example, on the boot command line:
trace_trigger="initcall_finish.traceoff if func.function == acpi_init"
To implement this, add a ".function" suffix that checks that the field is
of size long; the only operations allowed (so far) are "==" and "!=".
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Zheng Yejian <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The pointer ptr is being initialized with a value that is never read; it
is updated later by a call to strim(). Remove the extraneous
initialization.
Link: https://lkml.kernel.org/r/[email protected]
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
There's no entry in MAINTAINERS for samples/ftrace. Add one so that the
FTRACE maintainers are kept in the loop.
Link: https://lkml.kernel.org/r/[email protected]
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Use the 'struct' keyword for a struct's kernel-doc notation and
use the correct function parameter name to eliminate kernel-doc
warnings:
kernel/trace/trace_events_filter.c:136: warning: cannot understand function prototype: 'struct prog_entry '
kernel/trace/trace_events_filter.c:155: warning: Excess function parameter 'when_to_branch' description in 'update_preds'
Also correct some trivial punctuation problems.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Fix spelling in lib/ Kconfig files.
(reported by codespell)
Link: https://lkml.kernel.org/r/[email protected]
Cc: Andrew Morton <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: [email protected]
Reviewed-by: Marco Elver <[email protected]>
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The function create_hist_field() is called recursively at
trace_events_hist.c:1954 and can return NULL, so its return value has to
be checked to avoid a NULL pointer dereference.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
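A sketch of the check at the recursive call site (the operand index and
the error path label are assumptions):
	hist_field->operands[0] = create_hist_field(hist_data, field, fl, NULL);
	if (!hist_field->operands[0])
		goto free;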
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Fixes: 30350d65ac56 ("tracing: Add variable support to hist triggers")
Signed-off-by: Natalia Petrova <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
list_for_each_entry_rcu() has built-in RCU and lock checking.
Pass cond argument to list_for_each_entry_rcu() to silence false lockdep
warning when CONFIG_PROVE_RCU_LIST is enabled.
Reproduce as follows:
[tracing]# echo osnoise > current_tracer
[tracing]# echo 1 > tracing_on
[tracing]# echo 0 > tracing_on
The trace_types_lock is held when osnoise_tracer_stop() or
timerlat_tracer_stop() are called in the non-RCU read side section.
So, pass lockdep_is_held(&trace_types_lock) to silence false lockdep
warning.
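A sketch of the call with the cond argument (the list head, iterator and
loop body names are assumptions):
	list_for_each_entry_rcu(inst, &osnoise_instances, list,
				lockdep_is_held(&trace_types_lock))
		stop_instance(inst);	/* loop body unchanged */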
Link: https://lkml.kernel.org/r/[email protected]
Cc: Masami Hiramatsu <[email protected]>
Fixes: dae181349f1e ("tracing/osnoise: Support a list of trace_array *tr")
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Chuang Wang <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Fix some editorial nits in trace Kconfig.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
The instructions for the ftrace-bisect.sh script, which is used to find
which traced function is causing a kernel crash, and possibly a triple
fault reboot, use the old method. In 5.1, a new feature was added that
lets the user write the index into available_filter_functions that maps to
the function the user wants to set in set_ftrace_filter (or
set_ftrace_notrace). This takes O(1) to set, as opposed to writing a
function name, which takes O(n) (where n is the number of functions in
available_filter_functions).
ftrace-bisect.sh requires setting half of the functions in
available_filter_functions, which is O(n^2) with the name method and can
take several minutes to complete. The number method is O(n) and takes less
than a second. Using the number method is the proper way to do the bisect
for any kernel 5.1 and later.
Update the usage to reflect this change, and use the /sys/kernel/tracing
path instead of the obsolete debugfs path.
Link: https://lkml.kernel.org/r/[email protected]
Cc: [email protected]
Cc: Masami Hiramatsu <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Fixes: f79b3f338564e ("ftrace: Allow enabling of filters via index of available_filter_functions")
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|