Age | Commit message (Collapse) | Author | Files | Lines |
|
Currently the callback passed to arch_stack_walk() has an argument called
reliable passed to it to indicate if the stack entry is reliable, a comment
says that this is used by some printk() consumers. However in the current
kernel none of the arch_stack_walk() implementations ever set this flag to
true and the only callback implementation we have is in the generic
stacktrace code which ignores the flag. It therefore appears that this
flag is redundant so we can simplify and clarify things by removing it.
Signed-off-by: Mark Brown <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
If a user task's stack is empty, or if it only has user regs, ORC
reports it as a reliable empty stack. But arch_stack_walk_reliable()
incorrectly treats it as unreliable.
That happens because the only success path for user tasks is inside the
loop, which only iterates on non-empty stacks. Generally, a user task
must end in a user regs frame, but an empty stack is an exception to
that rule.
Thanks to commit 71c95825289f ("x86/unwind/orc: Fix error handling in
__unwind_start()"), unwind_start() now sets state->error appropriately.
So now for both ORC and FP unwinders, unwind_done() and !unwind_error()
always means the end of the stack was successfully reached. So the
success path for kthreads is no longer needed -- it can also be used for
empty user tasks.
Reported-by: Wang ShaoBo <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Wang ShaoBo <[email protected]>
Link: https://lkml.kernel.org/r/f136a4e5f019219cbc4f4da33b30c2f44fa65b84.1594994374.git.jpoimboe@redhat.com
|
|
rather than relying upon the magic in raw_copy_from_user()
Signed-off-by: Al Viro <[email protected]>
|
|
When arch_stack_walk_user() is called from atomic contexts, access_ok() can
trigger the following warning if compiled with CONFIG_DEBUG_ATOMIC_SLEEP=y.
Reproducer:
// CONFIG_DEBUG_ATOMIC_SLEEP=y
# cd /sys/kernel/debug/tracing
# echo 1 > options/userstacktrace
# echo 1 > events/irq/irq_handler_entry/enable
WARNING: CPU: 0 PID: 2649 at arch/x86/kernel/stacktrace.c:103 arch_stack_walk_user+0x6e/0xf6
CPU: 0 PID: 2649 Comm: bash Not tainted 5.3.0-rc1+ #99
RIP: 0010:arch_stack_walk_user+0x6e/0xf6
Call Trace:
<IRQ>
stack_trace_save_user+0x10a/0x16d
trace_buffer_unlock_commit_regs+0x185/0x240
trace_event_buffer_commit+0xec/0x330
trace_event_raw_event_irq_handler_entry+0x159/0x1e0
__handle_irq_event_percpu+0x22d/0x440
handle_irq_event_percpu+0x70/0x100
handle_irq_event+0x5a/0x8b
handle_edge_irq+0x12f/0x3f0
handle_irq+0x34/0x40
do_IRQ+0xa6/0x1f0
common_interrupt+0xf/0xf
</IRQ>
Fix it by calling __range_not_ok() directly instead of access_ok() as
copy_from_user_nmi() does. This is fine here because the actual copy is
inside a pagefault disabled region.
Reported-by: Juri Lelli <[email protected]>
Signed-off-by: Eiichi Tsukata <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
arch_stack_walk_user() checks `if (fp == frame.next_fp)` to prevent a
infinite loop by self reference but it's not enogh for circular reference.
Once a lack of return address is found, there is no point to continue the
loop, so break out.
Fixes: 02b67518e2b1 ("tracing: add support for userspace stacktraces in tracing/iter_ctrl")
Signed-off-by: Eiichi Tsukata <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Replace the stack_trace_save*() functions with the new arch_stack_walk()
interfaces.
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: Steven Rostedt <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: [email protected]
Cc: David Rientjes <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: [email protected]
Cc: Mike Rapoport <[email protected]>
Cc: Akinobu Mita <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: [email protected]
Cc: Robin Murphy <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: David Sterba <[email protected]>
Cc: Chris Mason <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Mike Snitzer <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: [email protected]
Cc: Joonas Lahtinen <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Cc: [email protected]
Cc: David Airlie <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tom Zanussi <[email protected]>
Cc: Miroslav Benes <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Terminating the last trace entry with ULONG_MAX is a completely pointless
exercise and none of the consumers can rely on it because it's
inconsistently implemented across architectures. In fact quite some of the
callers remove the entry and adjust stack_trace.nr_entries afterwards.
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
save_stack_trace_reliable now returns "non reliable" when there are
kernel pt_regs on stack. This means an interrupt or exception happened
somewhere down the route. It is a problem for the frame pointer
unwinder, because the frame might not have been set up yet when the irq
happened, so the unwinder might fail to unwind from the interrupted
function.
With ORC, this is not a problem, as ORC has out-of-band data. We can
find ORC data even for the IP in the interrupted function and always
unwind one level up reliably.
So lift the check to apply only when CONFIG_FRAME_POINTER=y is enabled.
Signed-off-by: Jiri Slaby <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/lkml/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Make clear which path is for user tasks and for kthreads and idle
tasks. This will allow easier plug-in of the ORC unwinder in the next
patches.
Note that we added a check for unwind error to the top of the loop, so
that an error is returned also for user tasks (the 'goto success' would
skip the check after the loop otherwise).
Signed-off-by: Jiri Slaby <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/lkml/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The stack unwinding can sometimes fail yet. Especially with the
generated debug info. So do not yell at users -- live patching (the only
user of this interface) will inform the user about the failure
gracefully.
And given this was the only user of the macro, remove the macro proper
too.
Signed-off-by: Jiri Slaby <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/lkml/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Josh pointed out, that there is no way a frame can be after user regs.
So remove the last unwind and the check.
Signed-off-by: Jiri Slaby <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/lkml/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 page table isolation fixes from Thomas Gleixner:
"A couple of urgent fixes for PTI:
- Fix a PTE mismatch between user and kernel visible mapping of the
cpu entry area (differs vs. the GLB bit) and causes a TLB mismatch
MCE on older AMD K8 machines
- Fix the misplaced CR3 switch in the SYSCALL compat entry code which
causes access to unmapped kernel memory resulting in double faults.
- Fix the section mismatch of the cpu_tss_rw percpu storage caused by
using a different mechanism for declaration and definition.
- Two fixes for dumpstack which help to decode entry stack issues
better
- Enable PTI by default in Kconfig. We should have done that earlier,
but it slipped through the cracks.
- Exclude AMD from the PTI enforcement. Not necessarily a fix, but if
AMD is so confident that they are not affected, then we should not
burden users with the overhead"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/process: Define cpu_tss_rw in same section as declaration
x86/pti: Switch to kernel CR3 at early in entry_SYSCALL_compat()
x86/dumpstack: Print registers for first stack frame
x86/dumpstack: Fix partial register dumps
x86/pti: Make sure the user/kernel PTEs match
x86/cpu, x86/pti: Do not enable PTI on AMD processors
x86/pti: Enable PTI by default
|
|
The show_regs_safe() logic is wrong. When there's an iret stack frame,
it prints the entire pt_regs -- most of which is random stack data --
instead of just the five registers at the end.
show_regs_safe() is also poorly named: the on_stack() checks aren't for
safety. Rename the function to show_regs_if_on_stack() and add a
comment to explain why the checks are needed.
These issues were introduced with the "partial register dump" feature of
the following commit:
b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully")
That patch had gone through a few iterations of development, and the
above issues were artifacts from a previous iteration of the patch where
'regs' pointed directly to the iret frame rather than to the (partially
empty) pt_regs.
Tested-by: Alexander Tsoy <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Toralf Förster <[email protected]>
Cc: [email protected]
Fixes: b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully")
Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Commit:
1959a60182f4 ("x86/dumpstack: Pin the target stack when dumping it")
changed the behavior of stack traces for zombies. Before that commit,
/proc/<pid>/stack reported the last execution path of the zombie before
it died:
[<ffffffff8105b877>] do_exit+0x6f7/0xa80
[<ffffffff8105bc79>] do_group_exit+0x39/0xa0
[<ffffffff8105bcf0>] __wake_up_parent+0x0/0x30
[<ffffffff8152dd09>] system_call_fastpath+0x16/0x1b
[<00007fd128f9c4f9>] 0x7fd128f9c4f9
[<ffffffffffffffff>] 0xffffffffffffffff
After the commit, it just reports an empty stack trace.
The new behavior is actually probably more correct. If the stack
refcount has gone down to zero, then the task has already gone through
do_exit() and isn't going to run anymore. The stack could be freed at
any time and is basically gone, so reporting an empty stack makes sense.
However, save_stack_trace_tsk_reliable() treats such a missing stack
condition as an error. That can cause livepatch transition stalls if
there are any unreaped zombies. Instead, just treat it as a reliable,
empty stack.
Reported-and-tested-by: Miroslav Benes <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: af085d9084b4 ("stacktrace/x86: add function for detecting reliable stack traces")
Link: http://lkml.kernel.org/r/e4b09e630e99d0c1080528f0821fc9d9dbaeea82.1513631620.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The save_stack_trace() and save_stack_trace_tsk() wrappers of
__save_stack_trace() add themselves to the call stack, and thus appear in the
recorded stacktraces. This is redundant and wasteful when we have limited space
to record the useful part of the backtrace with e.g. page_owner functionality.
Fix this by making sure __save_stack_trace() is noinline (which matches the
current gcc decision) and bumping the skip in the wrappers
(save_stack_trace_tsk() only when called for the current task). This is similar
to what was done for arm in 3683f44c42e9 ("ARM: stacktrace: avoid listing
stacktrace functions in stacktrace") and is pending for arm64.
Also make sure that __save_stack_trace_reliable() doesn't get this problem in
the future by marking it __always_inline (which matches current gcc decision),
per Josh Poimboeuf.
Signed-off-by: Vlastimil Babka <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
For live patching and possibly other use cases, a stack trace is only
useful if it can be assured that it's completely reliable. Add a new
save_stack_trace_tsk_reliable() function to achieve that.
Note that if the target task isn't the current task, and the target task
is allowed to run, then it could be writing the stack while the unwinder
is reading it, resulting in possible corruption. So the caller of
save_stack_trace_tsk_reliable() must ensure that the task is either
'current' or inactive.
save_stack_trace_tsk_reliable() relies on the x86 unwinder's detection
of pt_regs on the stack. If the pt_regs are not user-mode registers
from a syscall, then they indicate an in-kernel interrupt or exception
(e.g. preemption or a page fault), in which case the stack is considered
unreliable due to the nature of frame pointers.
It also relies on the x86 unwinder's detection of other issues, such as:
- corrupted stack data
- stack grows the wrong way
- stack walk doesn't reach the bottom
- user didn't provide a large enough entries array
Such issues are reported by checking unwind_error() and !unwind_done().
Also add CONFIG_HAVE_RELIABLE_STACKTRACE so arch-independent code can
determine at build time whether the function is implemented.
Signed-off-by: Josh Poimboeuf <[email protected]>
Reviewed-by: Miroslav Benes <[email protected]>
Acked-by: Ingo Molnar <[email protected]> # for the x86 changes
Signed-off-by: Jiri Kosina <[email protected]>
|
|
<linux/sched/task_stack.h>
We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/task_stack.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
<linux/sched/debug.h>
We are going to split <linux/sched/debug.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/debug.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Convert save_stack_trace_*() to use the new unwinder. dump_trace() has
been deprecated.
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Byungchul Park <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/815494c627d89887db0ce56ceffd58ad16ee6c21.1474045023.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Specifically, pin the stack in save_stack_trace_tsk() and
show_trace_log_lvl().
This will prevent a crash if the target task dies before or while
dumping its stack once we start freeing task stacks early.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/cf0082cde65d1941a996d026f2b2cdbfaca17bfa.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
valid_stack_ptr() is buggy: it assumes that all stacks are of size
THREAD_SIZE, which is not true for exception stacks. So the
walk_stack() callbacks will need to know the location of the beginning
of the stack as well as the end.
Another issue is that in general the various features of a stack (type,
size, next stack pointer, description string) are scattered around in
various places throughout the stack dump code.
Encapsulate all that information in a single place with a new stack_info
struct and a get_stack_info() interface.
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Byungchul Park <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/8164dd0db96b7e6a279fa17ae5e6dc375eecb4a9.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends. That changed
when we forked out support for the latter into the export.h file.
This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig. The advantage
in doing so is that module.h itself sources about 15 other headers;
adding significantly to what we feed cpp, and it can obscure what
headers we are effectively using.
Since module.h was the source for init.h (for __init) and for
export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance
for the presence of either and replace as needed. Build testing
revealed some implicit header usage that was fixed up accordingly.
Note that some bool/obj-y instances remain since module.h is
the header for some exception table entry stuff, and for things
like __init_or_module (code that is tossed when MODULES=n).
Signed-off-by: Paul Gortmaker <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
. avoid walking the stack when there is no room left in the buffer
. generalize get_perf_callchain() to be called from bpf helper
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Swap the 1st and 2nd parameters of save_stack_trace_regs()
as same as the parameters of save_stack_trace_tsk().
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: [email protected]
Cc: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/20110608070921.17777.31103.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Both warning and warning_symbol are nowhere used.
Let's get rid of them.
Signed-off-by: Richard Weinberger <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Soeren Sandmann Pedersen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: x86 <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Robert Richter <[email protected]>
Cc: Paul Mundt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
Current stack dump code scans entire stack and check each entry
contains a pointer to kernel code. If CONFIG_FRAME_POINTER=y it
could mark whether the pointer is valid or not based on value of
the frame pointer. Invalid entries could be preceded by '?' sign.
However this was not going to happen because scan start point
was always higher than the frame pointer so that they could not
meet.
Commit 9c0729dc8062 ("x86: Eliminate bp argument from the stack
tracing routines") delayed bp acquisition point, so the bp was
read in lower frame, thus all of the entries were marked
invalid.
This patch fixes this by reverting above commit while retaining
stack_frame() helper as suggested by Frederic Weisbecker.
End result looks like below:
before:
[ 3.508329] Call Trace:
[ 3.508551] [<ffffffff814f35c9>] ? panic+0x91/0x199
[ 3.508662] [<ffffffff814f3739>] ? printk+0x68/0x6a
[ 3.508770] [<ffffffff81a981b2>] ? mount_block_root+0x257/0x26e
[ 3.508876] [<ffffffff81a9821f>] ? mount_root+0x56/0x5a
[ 3.508975] [<ffffffff81a98393>] ? prepare_namespace+0x170/0x1a9
[ 3.509216] [<ffffffff81a9772b>] ? kernel_init+0x1d2/0x1e2
[ 3.509335] [<ffffffff81003894>] ? kernel_thread_helper+0x4/0x10
[ 3.509442] [<ffffffff814f6880>] ? restore_args+0x0/0x30
[ 3.509542] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
[ 3.509641] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
after:
[ 3.522991] Call Trace:
[ 3.523351] [<ffffffff814f35b9>] panic+0x91/0x199
[ 3.523468] [<ffffffff814f3729>] ? printk+0x68/0x6a
[ 3.523576] [<ffffffff81a981b2>] mount_block_root+0x257/0x26e
[ 3.523681] [<ffffffff81a9821f>] mount_root+0x56/0x5a
[ 3.523780] [<ffffffff81a98393>] prepare_namespace+0x170/0x1a9
[ 3.523885] [<ffffffff81a9772b>] kernel_init+0x1d2/0x1e2
[ 3.523987] [<ffffffff81003894>] kernel_thread_helper+0x4/0x10
[ 3.524228] [<ffffffff814f6880>] ? restore_args+0x0/0x30
[ 3.524345] [<ffffffff81a97559>] ? kernel_init+0x0/0x1e2
[ 3.524445] [<ffffffff81003890>] ? kernel_thread_helper+0x0/0x10
-v5:
* fix build breakage with oprofile
-v4:
* use 0 instead of regs->bp
* separate out printk changes
-v3:
* apply comment from Frederic
* add a couple of printk fixes
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Soren Sandmann <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Robert Richter <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The various stack tracing routines take a 'bp' argument in which the
caller is supposed to provide the base pointer to use, or 0 if doesn't
have one. Since bp is garbage whenever CONFIG_FRAME_POINTER is not
defined, this means all callers in principle should either always pass
0, or be conditional on CONFIG_FRAME_POINTER.
However, there are only really three use cases for stack tracing:
(a) Trace the current task, including IRQ stack if any
(b) Trace the current task, but skip IRQ stack
(c) Trace some other task
In all cases, if CONFIG_FRAME_POINTER is not defined, bp should just
be 0. If it _is_ defined, then
- in case (a) bp should be gotten directly from the CPU's register, so
the caller should pass NULL for regs,
- in case (b) the caller should should pass the IRQ registers to
dump_trace(),
- in case (c) bp should be gotten from the top of the task's stack, so
the caller should pass NULL for regs.
Hence, the bp argument is not necessary because the combination of
task and regs is sufficient to determine an appropriate value for bp.
This patch introduces a new inline function stack_frame(task, regs)
that computes the desired bp. This function is then called from the
two versions of dump_stack().
Signed-off-by: Soren Sandmann <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arjan van de Ven <[email protected]>,
Cc: Frederic Weisbecker <[email protected]>,
Cc: Arnaldo Carvalho de Melo <[email protected]>,
LKML-Reference: <[email protected]>>
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
Cleanup. Factor the common code in save_stack_address() and
save_stack_address_nosched().
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Roland McGrath <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Ingo Molnar <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
If CONFIG_FRAME_POINTER=n, print_context_stack() shouldn't neglect the
non-reliable addresses on stack, this is all we have if dump_trace(bp)
is called with the wrong or zero bp.
For example, /proc/pid/stack doesn't work if CONFIG_FRAME_POINTER=n.
This patch obviously has no effect if CONFIG_FRAME_POINTER=y, otherwise
it reverts 1650743c "x86: don't save unreliable stack trace entries".
Also, remove the unnecessary type-cast.
Signed-off-by: Oleg Nesterov <[email protected]>
Cc: Roland McGrath <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andrew Morton <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
arch/x86/include/asm/stacktrace.h and arch/x86/kernel/dumpstack.h
declare headers of objects that deal with the same topic.
Actually most of the files that include stacktrace.h also include
dumpstack.h
Although dumpstack.h seems more reserved for internals of stack
traces, those are quite often needed to define specialized stack
trace operations. And perf event arch headers are going to need
access to such low level operations anyway. So don't continue to
bother with dumpstack.h as it's not anymore about isolated deep
internals.
v2: fix struct stack_frame definition conflict in sysprof
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Soeren Sandmann <[email protected]>
|
|
The current print_context_stack helper that does the stack
walking job is good for usual stacktraces as it walks through
all the stack and reports even addresses that look unreliable,
which is nice when we don't have frame pointers for example.
But we have users like perf that only require reliable
stacktraces, and those may want a more adapted stack walker, so
lets make this function a callback in stacktrace_ops that users
can tune for their needs.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This will help kmemcheck (and possibly other debugging tools) since we
can now simply pass regs->bp to the stack tracer instead of specifying
the number of stack frames to skip, which is unreliable if gcc decides
to inline functions, etc.
Note that this makes the API incomplete for other architectures, but I
expect that those can be updated lazily, e.g. when they need it.
Cc: Arjan van de Ven <[email protected]>
Signed-off-by: Vegard Nossum <[email protected]>
|
|
If we return -1 in the ops->stack for the stacktrace saving, we end up
breaking out of the loop if the stack we are tracing is in the exception
stack. This causes traces like:
<idle>-0 [002] 34263.745825: raise_softirq_irqoff <-__blk_complete_request
<idle>-0 [002] 34263.745826:
<= 0
<= 0
<= 0
<= 0
<= 0
<= 0
<= 0
By returning "0" instead, the irq stack is saved as well, and we see:
<idle>-0 [003] 883.280992: raise_softirq_irqoff <-__hrtimer_star
t_range_ns
<idle>-0 [003] 883.280992:
<= hrtimer_start_range_ns
<= tick_nohz_restart_sched_tick
<= cpu_idle
<= start_secondary
<=
<= 0
<= 0
[ Impact: record stacks from interrupts ]
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Impact: cleanup
Signed-off-by: Török Edwin <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Impact: add new (default-off) tracing visualization feature
Usage example:
mount -t debugfs nodev /sys/kernel/debug
cd /sys/kernel/debug/tracing
echo userstacktrace >iter_ctrl
echo sched_switch >current_tracer
echo 1 >tracing_enabled
.... run application ...
echo 0 >tracing_enabled
Then read one of 'trace','latency_trace','trace_pipe'.
To get the best output you can compile your userspace programs with
frame pointers (at least glibc + the app you are tracing).
Signed-off-by: Török Edwin <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
fix:
ERROR: "print_stack_trace" [kernel/backtracetest.ko] undefined!
ERROR: "save_stack_trace" [kernel/backtracetest.ko] undefined!
Signed-off-by: Ingo Molnar <[email protected]>
and fix:
Building modules, stage 2.
MODPOST 376 modules
ERROR: "print_stack_trace" [kernel/backtracetest.ko] undefined!
make[1]: *** [__modpost] Error 1
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Currently, there is no way for print_stack_trace() to determine whether
a given stack trace entry was deemed reliable or not, simply because
save_stack_trace() does not record this information. (Perhaps needless
to say, this makes the saved stack traces A LOT harder to read, and
probably with no other benefits, since debugging features that use
save_stack_trace() most likely also require frame pointers, etc.)
This patch reverts to the old behaviour of only recording the reliable trace
entries for saved stack traces.
Signed-off-by: Vegard Nossum <[email protected]>
Acked-by: Arjan van de Ven <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
x86: remove unneeded casts
Signed-off-by: Jan Engelhardt <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Right now, we take the stack pointer early during the backtrace path, but
only calculate bp several functions deep later, making it hard to reconcile
the stack and bp backtraces (as well as showing several internal backtrace
functions on the stack with bp based backtracing).
This patch moves the bp taking to the same place we take the stack pointer;
sadly this ripples through several layers of the back tracing stack,
but it's not all that bad in the end I hope.
Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
For enhancing the 32 bit EBP based backtracer, I need the capability
for the backtracer to tell it's customer that an entry is either
reliable or unreliable, and the backtrace printing code then needs to
print the unreliable ones slightly different.
This patch adds the basic capability, the next patch will add a user
of this capability.
Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
LatencyTOP kernel infrastructure; it measures latencies in the
scheduler and tracks it system wide and per process.
Signed-off-by: Arjan van de Ven <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
.. as they're never written to.
[ tglx: arch/x86 adaptation ]
Signed-off-by: Jan Beulich <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Since the x86 merge, lots of files that referenced their own filenames
are no longer correct. Rather than keep them up to date, just delete
them, as they add no real value.
Additionally:
- fix up comment formatting in scx200_32.c
- Remove a credit from myself in setup_64.c from a time when we had no SCM
- remove longwinded history from tsc_32.c which can be figured out from
git.
Signed-off-by: Dave Jones <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|