Age | Commit message (Collapse) | Author | Files | Lines |
|
Make the type field in pmu-events/arch/s390/mapfile.cvs more generic to
match the created cpuid string for s390.
The pattern also checks for the counter first version number and counter
second version number ([13]\.[1-5]) and the authorization field which
follows.
These numbers do not exist in the cpuid identification string when perf
commands are executed on a z/VM environment (which does not support CPU
counter measurement facility).
CPUID string for LPAR:
cpuid : IBM,3906,704,M03,3.5,002f
CPUID string for z/VM:
cpuid : IBM,2964,702,N96
This allows the removal of s390 specific cpuid compare code and uses the
common compare function with its regular expression matching algorithm.
Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
map_groups__fixup_end() was called to set the end addresses of kernel
and module maps. But now since machine__create_modules() sets the end
address of modules properly, the only remaining piece is the kernel map.
We can set it with adjacent module's address directly instead of calling
map_groups__fixup_end(). If there's no module after the kernel map, the
end address will be ~0ULL.
Since it also changes the start address of the kernel map, it needs to
re-insert the map to the kmaps in order to keep a correct ordering. Kim
reported that it caused problems on ARM64.
Reported-by: Kim Phillips <[email protected]>
Tested-by: Kim Phillips <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/20180419235915.GA19067@sejong
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes and improvements from Arnaldo Carvalho de Melo:
- Store context switch out type in PERF_RECORD_SWITCH[_CPU_WIDE].
The percentage of preempting and non-preempting context switches help
understanding the nature of workloads (CPU or IO bound) that are running
on a machine. This adds the kernel facility and userspace changes needed
to show this information in 'perf script' and 'perf report -D' (Alexey Budankov)
- Remove old error messages about things that unlikely to be the root cause
in modern systems (Andi Kleen)
- Synchronize kernel ABI headers, v4.17-rc1 (Ingo Molnar)
- Support MAP_FIXED_NOREPLACE, noticed when updating the tools/include/
copies (Arnaldo Carvalho de Melo)
- Fixup BPF test using epoll_pwait syscall function probe, to cope with
the syscall routines renames performed in this development cycle (Arnaldo Carvalho de Melo)
- Fix sample_max_stack maximum check and do not proceed when an error
has been detect, return them to avoid misidentifying errors (Jiri Olsa)
- Add '\n' at the end of parse-options error messages (Ravi Bangoria)
- Add s390 support for detailed/verbose PMU event description (Thomas Richter)
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
SBOX on some Broadwell CPUs is broken because it's enabled unconditionally
despite the fact that there are no SBOXes available.
Check the Power Control Unit CAPID4 register to determine the number of
available SBOXes on the particular CPU before trying to enable them. If
there are none, nullify the SBOX descriptor so it isn't tried to be
initialized.
Signed-off-by: Oskar Senft <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Mark van Dijk <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
This reverts commit 3b94a891667c ("perf/x86/intel/uncore: Remove
SBOX support for Broadwell server")
Revert because there exists a proper workaround for Broadwell-EP servers
without SBOX now. Note that BDX-DE does not have a SBOX.
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Move CoreSight headers to the SPDX identifier.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Since e145242ea0df ("syscalls/core, syscalls/x86: Clean up syscall stub
naming convention") changed the main syscall function for 'epoll_pwait'
to something other than the expected 'SyS_epoll_pwait the' 'perf test
BPF' entries started failing, fix it by using something called from the
main syscall function instead, 'epoll_wait', which should keep this test
working in older kernels too.
Before:
# perf test BPF
40: BPF filter :
40.1: Basic BPF filtering : FAILED!
40.2: BPF pinning : Skip
40.3: BPF prologue generation : Skip
40.4: BPF relocation checker : Skip
If we use -v for that test we see the problem:
Probe point 'SyS_epoll_pwait' not found.
After:
# perf test BPF
40: BPF filter :
40.1: Basic BPF filtering : Ok
40.2: BPF pinning : Ok
40.3: BPF prologue generation : Ok
40.4: BPF relocation checker : Ok
#
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In the 'perf test "mmap interface"' we try creating events for several
tracepoints, but when perf_evsel__new() fails we're not showing which
one is failing, fix that to help diagnosing problems, such as the
syscall tracepoints ones being found and fixes in this merge window.
Now the failing tests shows:
# perf test -v "mmap interface"
4: Read samples using the mmap interface :
--- start ---
test child forked, pid 14311
<SNIP>
perf_evsel__new(sys_enter_getppid)
test child finished with -1
---- end ----
Read samples using the mmap interface: FAILED!
#
Now to check why the syscalls:sys_enter_getppid is failing...
# ls -la /sys/kernel/debug/tracing/events/syscalls/sys_enter_getppid
ls: cannot access '/sys/kernel/debug/tracing/events/syscalls/sys_enter_getppid': No such file or directory
#
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Dominik Brodowski <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Few error messages does not have '\n' at the end and thus next prompt
gets printed in the same line. Ex,
linux~$ perf buildid-cache -verbose --add ./a.out
Error: did you mean `--verbose` (with two dashes ?)linux~$
Fix it.
Signed-off-by: Ravi Bangoria <[email protected]>
Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kate Stewart <[email protected]>
Cc: Krister Johansen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Philippe Ombredanne <[email protected]>
Cc: Sihyeon Jang <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf record' suggests to enable the APIC on errors.
APIC is practically always used today and the problem is usually
somewhere else.
Just remove the outdated suggestion.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When perf record encounters an error setting up an event it suggests
to enable CONFIG_PERF_EVENTS. This is misleading because:
- Usually it is enabled (it is really hard to disable on x86)
- The problem is usually somewhere else, e.g. the CPU is not supported
or an invalid configuration has been used.
Remove the misleading suggestion.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Clarify in the browser help that ESC in tui mode may go back to the
previous screen instead of just exiting (was not clear to me)
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For perf mem report / perf mem record, pass all unknown options
through to the underlying report/record commands. This makes things
like
perf mem record -a sleep 1
work. Matches how c2c and other tools work.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduced in a4ff8e8620d3 ("mm: introduce MAP_FIXED_NOREPLACE"), and
now that we have that define in the just syncronized
tools/arch/*/include/uapi/asm/mman.h files, add support for it.
This should really transition to autogeneration of string tables as
done for various other things:
$ ls /tmp/build/perf/trace/beauty/generated/*.c
arch_errno_name_array.c kcmp_type_array.c madvise_behavior_array.c
pkey_alloc_access_rights_array.c prctl_option_array.c
$ head /tmp/build/perf/trace/beauty/generated/madvise_behavior_array.c
static const char *madvise_advices[] = {
[0] = "NORMAL",
[1] = "RANDOM",
[2] = "SEQUENTIAL",
[3] = "WILLNEED",
[4] = "DONTNEED",
[8] = "FREE",
[9] = "REMOVE",
[10] = "DONTFORK",
[11] = "DOFORK",
$
Till then, add support for this the old way.
Also it has to be ifdef'ed, because arches like mips still don't define
it. The proper solution will be to have per-arch tables for these
values to support cross-analysis.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signef-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If the get_callchain_buffers fails to allocate the buffer it will
decrease the nr_callchain_events right away.
There's no point of checking the allocation error for
nr_callchain_events > 1. Removing that check.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The syzbot hit KASAN bug in perf_callchain_store having the entry stored
behind the allocated bounds [1].
We miss the sample_max_stack check for the initial event that allocates
callchain buffers. This missing check allows to create an event with
sample_max_stack value bigger than the global sysctl maximum:
# sysctl -a | grep perf_event_max_stack
kernel.perf_event_max_stack = 127
# perf record -vv -C 1 -e cycles/max-stack=256/ kill
...
perf_event_attr:
size 112
...
sample_max_stack 256
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
Note the '-C 1', which forces perf record to create just single event.
Otherwise it opens event for every cpu, then the sample_max_stack check
fails on the second event and all's fine.
The fix is to run the sample_max_stack check also for the first event
with callchains.
[1] https://marc.info/?l=linux-kernel&m=152352732920874&w=2
Reported-by: [email protected]
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 97c79a38cd45 ("perf core: Per event callchain limit")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Return immediately when we find issue in the user stack checks. The
error value could get overwritten by following check for
PERF_SAMPLE_REGS_INTR.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Fixes: 60e2364e60e8 ("perf: Add ability to sample machine state on interrupt")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf list' with flags -d and -v print a description (-d) or a very
verbose explanation (-v) of CPU specific counter events. These
descriptions are provided with the json files in directory
pmu-events/arch/s390/*.json.
Display of these descriptions on s390 requires the corresponding json
files.
On s390 this does not work because function is_pmu_core() does not
detect the s390 directory name where the CPU specific events are listed.
On x86 it is:
/sys/bus/event_source/devices/cpu
whereas on s390 it is:
/sys/bus/event_source/devices/cpum_cf
/sys/bus/event_source/devices/cpum_sf
Fix this by adding s390 directory name testing to function
is_pmu_core(). This is the same approach as taken for the ARM platform.
Output before:
[root@s35lp76 perf]# ./perf list -d pmu
List of pre-defined events (to be used in -e):
cpum_cf/AES_BLOCKED_CYCLES/ [Kernel PMU event]
cpum_cf/AES_BLOCKED_FUNCTIONS/ [Kernel PMU event]
cpum_cf/AES_CYCLES/ [Kernel PMU event]
cpum_cf/AES_FUNCTIONS/ [Kernel PMU event]
....
cpum_cf/TX_NC_TEND/ [Kernel PMU event]
cpum_cf/VX_BCD_EXECUTION_SLOTS/ [Kernel PMU event]
cpum_sf/SF_CYCLES_BASIC/ [Kernel PMU event]
Output after:
[root@s35lp76 perf]# ./perf list -d pmu
List of pre-defined events (to be used in -e):
cpum_cf/AES_BLOCKED_CYCLES/ [Kernel PMU event]
cpum_cf/AES_BLOCKED_FUNCTIONS/ [Kernel PMU event]
cpum_cf/AES_CYCLES/ [Kernel PMU event]
cpum_cf/AES_FUNCTIONS/ [Kernel PMU event]
....
cpum_cf/TX_NC_TEND/ [Kernel PMU event]
cpum_cf/VX_BCD_EXECUTION_SLOTS/ [Kernel PMU event]
cpum_sf/SF_CYCLES_BASIC/ [Kernel PMU event]
3906:
bcd_dfp_execution_slots
[BCD DFP Execution Slots]
decimal_instructions
[Decimal Instructions]
dtlb2_gpage_writes
[DTLB2 GPAGE Writes]
dtlb2_hpage_writes
[DTLB2 HPAGE Writes]
dtlb2_misses
[DTLB2 Misses]
dtlb2_writes
[DTLB2 Writes]
itlb2_misses
[ITLB2 Misses]
itlb2_writes
[ITLB2 Writes]
l1c_tlb2_misses
[L1C TLB2 Misses]
.....
cfvn 3:
cpu_cycles
[CPU Cycles]
instructions
[Instructions]
l1d_dir_writes
[L1D Directory Writes]
l1d_penalty_cycles
[L1D Penalty Cycles]
l1i_dir_writes
[L1I Directory Writes]
l1i_penalty_cycles
[L1I Penalty Cycles]
problem_state_cpu_cycles
[Problem State CPU Cycles]
problem_state_instructions
[Problem State Instructions]
....
csvn generic:
aes_blocked_cycles
[AES Blocked Cycles]
aes_blocked_functions
[AES Blocked Functions]
aes_cycles
[AES Cycles]
aes_functions
[AES Functions]
dea_blocked_cycles
[DEA Blocked Cycles]
dea_blocked_functions
[DEA Blocked Functions]
....
Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Append 'p' sign to 'S' tag designating the type of context switch out event so
'Sp' means preemption context switch. Documentation is extended to cover
new presentation changes.
$ perf script --show-switch-events -F +misc -I -i perf.data:
hdparm 4073 [004] U 762.198265: 380194 cycles:ppp: 7faf727f5a23 strchr (/usr/lib64/ld-2.26.so)
hdparm 4073 [004] K 762.198366: 441572 cycles:ppp: ffffffffb9218435 alloc_set_pte (/lib/modules/4.16.0-rc6+/build/vmlinux)
hdparm 4073 [004] S 762.198391: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [004] 762.198392: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4073/4073
swapper 0 [004] Sp 762.198477: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 4073/4073
hdparm 4073 [004] 762.198478: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
swapper 0 [007] K 762.198514: 2303073 cycles:ppp: ffffffffb98b0c66 intel_idle (/lib/modules/4.16.0-rc6+/build/vmlinux)
swapper 0 [007] Sp 762.198561: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 1134/1134
kworker/u16:18 1134 [007] 762.198562: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
kworker/u16:18 1134 [007] S 762.198567: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
Signed-off-by: Alexey Budankov <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Print additional 'preempt' tag for PERF_RECORD_SWITCH[_CPU_WIDE] OUT records when
event header misc field contains PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit set
designating preemption context switch out event:
tools/perf/perf report -D -i perf.data | grep _SWITCH
0 768361415226 0x27f076 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 8/8
4 768362216813 0x28f45e [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
4 768362217824 0x28f486 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4073/4073
0 768362414027 0x27f0ce [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 8/8
0 768362414367 0x27f0f6 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
Signed-off-by: Alexey Budankov <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Store preempting context switch out event into Perf trace as a part of
PERF_RECORD_SWITCH[_CPU_WIDE] record.
Percentage of preempting and non-preempting context switches help
understanding the nature of workloads (CPU or IO bound) that are running
on a machine;
The event is treated as preemption one when task->state value of the
thread being switched out is TASK_RUNNING. Event type encoding is
implemented using PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit;
Signed-off-by: Alexey Budankov <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Sync the following tooling headers with the latest kernel version:
tools/arch/arm/include/uapi/asm/kvm.h
- New ABI: KVM_REG_ARM_*
tools/arch/x86/include/asm/required-features.h
- Removal of NEED_LA57 dependency
tools/arch/x86/include/uapi/asm/kvm.h
- New KVM ABI: KVM_SYNC_X86_*
tools/include/uapi/asm-generic/mman-common.h
- New ABI: MAP_FIXED_NOREPLACE flag
tools/include/uapi/linux/bpf.h
- New ABI: BPF_F_SEQ_NUMBER functions
tools/include/uapi/linux/if_link.h
- New ABI: IFLA tun and rmnet support
tools/include/uapi/linux/kvm.h
- New ABI: hyperv eventfd and CONN_ID_MASK support plus header cleanups
tools/include/uapi/sound/asound.h
- New ABI: SNDRV_PCM_FORMAT_FIRST PCM format specifier
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
- The x86 system call table description changed due to the ptregs changes and the renames, in:
d5a00528b58c: syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
5ac9efa3c50d: syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
ebeb8c82ffaf: syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
Also fix the x86 syscall table warning:
-Warning: Kernel ABI header at 'tools/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
+Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
None of these changes impact existing tooling code, so we only have to copy the kernel version.
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Brian Robbins <[email protected]>
Cc: Clark Williams <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Dmitriy Vyukov <[email protected]> <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Li Zhijian <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin Liška <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Miguel Bernal Marin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Naveen N. Rao <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Takuya Yamamoto <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: William Cohen <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This warning message is not very helpful, as the return value should
already show information about the error. Also, this message will
spam dmesg if the user space does testing in a loop, like:
for x in {0..5}
do
echo p:xx xx+$x >> /sys/kernel/debug/tracing/kprobe_events
done
Reported-by: Vince Weaver <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull tooling improvements and fixes from Arnaldo Carvalho de Melo:
perf annotate fixes and improvements:
- Allow showing offsets in more than just jump targets, use the new
'O' hotkey in the TUI, config ~/.perfconfig annotate.offset_level
for it and for --stdio2 (Arnaldo Carvalho de Melo)
- Use the resolved variable names from objdump disassembled lines to
make them more compact, just like was already done for some instructions,
like "mov", this eventually will be done more generally, but lets now add
some more to the existing mechanism (Arnaldo Carvalho de Melo)
perf record fixes:
- Change warning for missing topology sysfs entry to debug, as not all
architectures have those files, s390 being one of those (Thomas Richter)
perf sched fixes:
- Fix -g/--call-graph documentation (Takuya Yamamoto)
perf stat:
- Enable 1ms interval for printing event counters values in (Alexey Budankov)
perf test fixes:
- Run dwarf unwind on arm32 (Kim Phillips)
- Remove unused ptrace.h include from LLVM test, sidesteping older
clang's lack of support for some asm constructs (Arnaldo Carvalho de Melo)
perf version fixes:
- Do not print info about HAVE_LIBAUDIT_SUPPORT in 'perf version --build-options'
when HAVE_SYSCALL_TABLE_SUPPORT is true, as libaudit won't be used in that
case, print info about syscall_table support instead (Jin Yao)
Build system fixes:
- Use HAVE_..._SUPPORT used consistently (Jin Yao)
- Restore READ_ONCE() C++ compatibility in tools/include (Mark Rutland)
- Give hints about package names needed to build jvmti (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull more btrfs updates from David Sterba:
"We have queued a few more fixes (error handling, log replay,
softlockup) and the rest is SPDX updates that touche almost all files
so the diffstat is long"
* tag 'for-4.17-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: Only check first key for committed tree blocks
btrfs: add SPDX header to Kconfig
btrfs: replace GPL boilerplate by SPDX -- sources
btrfs: replace GPL boilerplate by SPDX -- headers
Btrfs: fix loss of prealloc extents past i_size after fsync log replay
Btrfs: clean up resources during umount after trans is aborted
btrfs: Fix possible softlock on single core machines
Btrfs: bail out on error during replay_dir_deletes
Btrfs: fix NULL pointer dereference in log_dir_items
|
|
Pull cifs fixes from Steve French:
"SMB3 fixes, a few for stable, and some important cleanup work from
Ronnie of the smb3 transport code"
* tag '4.17-rc1SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: change validate_buf to validate_iov
cifs: remove rfc1002 hardcoded constants from cifs_discard_remaining_data()
cifs: Change SMB2_open to return an iov for the error parameter
cifs: add resp_buf_size to the mid_q_entry structure
smb3.11: replace a 4 with server->vals->header_preamble_size
cifs: replace a 4 with server->vals->header_preamble_size
cifs: add pdu_size to the TCP_Server_Info structure
SMB311: Improve checking of negotiate security contexts
SMB3: Fix length checking of SMB3.11 negotiate request
CIFS: add ONCE flag for cifs_dbg type
cifs: Use ULL suffix for 64-bit constant
SMB3: Log at least once if tree connect fails during reconnect
cifs: smb2pdu: Fix potential NULL pointer dereference
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of minor (and safe changes) that didn't make the initial
pull request plus some bug fixes.
The status handling code is actually a running regression from the
previous merge window which had an incomplete fix (now reverted) and
most of the remaining bug fixes are for problems older than the
current merge window"
[ Side note: this merge also takes the base kernel git repository to 6+
million objects for the first time. Technically we hit it a couple of
merges ago already if you count all the tag objects, but now it
reaches 6M+ objects reachable from HEAD.
I was joking around that that's when I should switch to 5.0, because
3.0 happened at the 2M mark, and 4.0 happened at 4M objects. But
probably not, even if numerology is about as good a reason as any.
- Linus ]
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: devinfo: Add Microsoft iSCSI target to 1024 sector blacklist
scsi: cxgb4i: silence overflow warning in t4_uld_rx_handler()
scsi: dpt_i2o: Use after free in I2ORESETCMD ioctl
scsi: core: Make scsi_result_to_blk_status() recognize CONDITION MET
scsi: core: Rename __scsi_error_from_host_byte() into scsi_result_to_blk_status()
Revert "scsi: core: return BLK_STS_OK for DID_OK in __scsi_error_from_host_byte()"
scsi: aacraid: Insure command thread is not recursively stopped
scsi: qla2xxx: Correct setting of SAM_STAT_CHECK_CONDITION
scsi: qla2xxx: correctly shift host byte
scsi: qla2xxx: Fix race condition between iocb timeout and initialisation
scsi: qla2xxx: Avoid double completion of abort command
scsi: qla2xxx: Fix small memory leak in qla2x00_probe_one on probe failure
scsi: scsi_dh: Don't look for NULL devices handlers by name
scsi: core: remove redundant assignment to shost->use_blk_mq
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
- pass HOSTLDFLAGS when compiling single .c host programs
- build genksyms lexer and parser files instead of using shipped
versions
- rename *-asn1.[ch] to *.asn1.[ch] for suffix consistency
- let the top .gitignore globally ignore artifacts generated by flex,
bison, and asn1_compiler
- let the top Makefile globally clean artifacts generated by flex,
bison, and asn1_compiler
- use safer .SECONDARY marker instead of .PRECIOUS to prevent
intermediate files from being removed
- support -fmacro-prefix-map option to make __FILE__ a relative path
- fix # escaping to prepare for the future GNU Make release
- clean up deb-pkg by using debian tools instead of handrolled
source/changes generation
- improve rpm-pkg portability by supporting kernel-install as a
fallback of new-kernel-pkg
- extend Kconfig listnewconfig target to provide more information
* tag 'kbuild-v4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: extend output of 'listnewconfig'
kbuild: rpm-pkg: use kernel-install as a fallback for new-kernel-pkg
Kbuild: fix # escaping in .cmd files for future Make
kbuild: deb-pkg: split generating packaging and build
kbuild: use -fmacro-prefix-map to make __FILE__ a relative path
kbuild: mark $(targets) as .SECONDARY and remove .PRECIOUS markers
kbuild: rename *-asn1.[ch] to *.asn1.[ch]
kbuild: clean up *-asn1.[ch] patterns from top-level Makefile
.gitignore: move *-asn1.[ch] patterns to the top-level .gitignore
kbuild: add %.dtb.S and %.dtb to 'targets' automatically
kbuild: add %.lex.c and %.tab.[ch] to 'targets' automatically
genksyms: generate lexer and parser during build instead of shipping
kbuild: clean up *.lex.c and *.tab.[ch] patterns from top-level Makefile
.gitignore: move *.lex.c *.tab.[ch] patterns to the top-level .gitignore
kbuild: use HOSTLDFLAGS for single .c executables
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of fixes and updates for x86:
- Address a swiotlb regression which was caused by the recent DMA
rework and made driver fail because dma_direct_supported() returned
false
- Fix a signedness bug in the APIC ID validation which caused invalid
APIC IDs to be detected as valid thereby bloating the CPU possible
space.
- Fix inconsisten config dependcy/select magic for the MFD_CS5535
driver.
- Fix a corruption of the physical address space bits when encryption
has reduced the address space and late cpuinfo updates overwrite
the reduced bit information with the original value.
- Dominiks syscall rework which consolidates the architecture
specific syscall functions so all syscalls can be wrapped with the
same macros. This allows to switch x86/64 to struct pt_regs based
syscalls. Extend the clearing of user space controlled registers in
the entry patch to the lower registers"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Fix signedness bug in APIC ID validity checks
x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption
x86/olpc: Fix inconsistent MFD_CS5535 configuration
swiotlb: Use dma_direct_supported() for swiotlb_ops
syscalls/x86: Adapt syscall_wrapper.h to the new syscall stub naming convention
syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
syscalls/core, syscalls/x86: Clean up syscall stub naming convention
syscalls/x86: Extend register clearing on syscall entry to lower registers
syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64
syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls
syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls
syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
x86/syscalls: Don't pointlessly reload the system call number
x86/mm: Fix documentation of module mapping range with 4-level paging
x86/cpuid: Switch to 'static const' specifier
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 pti updates from Thomas Gleixner:
"Another series of PTI related changes:
- Remove the manual stack switch for user entries from the idtentry
code. This debloats entry by 5k+ bytes of text.
- Use the proper types for the asm/bootparam.h defines to prevent
user space compile errors.
- Use PAGE_GLOBAL for !PCID systems to gain back performance
- Prevent setting of huge PUD/PMD entries when the entries are not
leaf entries otherwise the entries to which the PUD/PMD points to
and are populated get lost"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
x86/pti: Leave kernel text global for !PCID
x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image
x86/pti: Enable global pages for shared areas
x86/mm: Do not forbid _PAGE_RW before init for __ro_after_init
x86/mm: Comment _PAGE_GLOBAL mystery
x86/mm: Remove extra filtering in pageattr code
x86/mm: Do not auto-massage page protections
x86/espfix: Document use of _PAGE_GLOBAL
x86/mm: Introduce "default" kernel PTE mask
x86/mm: Undo double _PAGE_PSE clearing
x86/mm: Factor out pageattr _PAGE_GLOBAL setting
x86/entry/64: Drop idtentry's manual stack switch for user entries
x86/uapi: Fix asm/bootparam.h userspace compilation errors
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Thomas Gleixner:
"A few scheduler fixes:
- Prevent a bogus warning vs. runqueue clock update flags in
do_sched_rt_period_timer()
- Simplify the helper functions which handle requests for skipping
the runqueue clock updat.
- Do not unlock the tunables mutex in the error path of the cpu
frequency scheduler utils. Its not held.
- Enforce proper alignement for 'struct util_est' in sched_avg to
prevent a misalignment fault on IA64"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Force proper alignment of 'struct util_est'
sched/core: Simplify helpers for rq clock update skip requests
sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
sched/cpufreq/schedutil: Fix error path mutex unlock
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more perf updates from Thomas Gleixner:
"A rather large set of perf updates:
Kernel:
- Fix various initialization issues
- Prevent creating [ku]probes for not CAP_SYS_ADMIN users
Tooling:
- Show only failing syscalls with 'perf trace --failure' (Arnaldo
Carvalho de Melo)
e.g: See what 'openat' syscalls are failing:
# perf trace --failure -e openat
762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
<SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
^C#
- Show information about the event (freq, nr_samples, total
period/nr_events) in the annotate --tui and --stdio2 'perf
annotate' output, similar to the first line in the 'perf report
--tui', but just for the samples for a the annotated symbol
(Arnaldo Carvalho de Melo)
- Introduce 'perf version --build-options' to show what features were
linked, aliased as well as a shorter 'perf -vv' (Jin Yao)
- Add a "dso_size" sort order (Kim Phillips)
- Remove redundant ')' in the tracepoint output in 'perf trace'
(Changbin Du)
- Synchronize x86's cpufeatures.h, no effect on toolss (Arnaldo
Carvalho de Melo)
- Show group details on the title line in the annotate browser and
'perf annotate --stdio2' output, so that the per-event columns can
have headers (Arnaldo Carvalho de Melo)
- Fixup vertical line separating metrics from instructions and
cleaning unused lines at the bottom, both in the annotate TUI
browser (Arnaldo Carvalho de Melo)
- Remove duplicated 'samples' in lost samples warning in
'perf report' (Arnaldo Carvalho de Melo)
- Synchronize i915_drm.h, silencing the perf build process,
automagically adding support for the new DRM_I915_QUERY ioctl
(Arnaldo Carvalho de Melo)
- Make auxtrace_queues__add_buffer() allocate struct buffer, from a
patchkit already applied (Adrian Hunter)
- Fix the --stdio2/TUI annotate output to include group details, be
it for a recorded '{a,b,f}' explicit event group or when forcing
group display using 'perf report --group' for a set of events not
recorded as a group (Arnaldo Carvalho de Melo)
- Fix display artifacts in the ui browser (base class for the
annotate and main report/top TUI browser) related to the extra
title lines work (Arnaldo Carvalho de Melo)
- perf auxtrace refactorings, leftovers from a previously partially
processed patchset (Adrian Hunter)
- Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de
Melo)
- Synchronize i915_drm.h, silencing a perf build warning and in the
process automagically adding support for a new ioctl command
(Arnaldo Carvalho de Melo)
- Fix a strncpy issue in uprobe tracing"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()
tracing/uprobe_event: Fix strncpy corner case
perf/core: Fix perf_uprobe_init()
perf/core: Fix perf_kprobe_init()
perf/core: Fix use-after-free in uprobe_perf_close()
perf tests clang: Fix function name for clang IR test
perf clang: Add support for recent clang versions
perf tools: Fix perf builds with clang support
perf tools: No need to include namespaces.h in util.h
perf hists browser: Remove leftover from row returned from refresh
perf hists browser: Show extra_title_lines in the 'D' debug hotkey
perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
tools headers uapi: Synchronize i915_drm.h
perf report: Remove duplicated 'samples' in lost samples warning
perf ui browser: Fixup cleaning unused lines at the bottom
perf annotate browser: Fixup vertical line separating metrics from instructions
perf annotate: Show group details on the title line
perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
perf/x86/intel: Move regs->flags EXACT bit init
perf trace: Remove redundant ')'
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 EFI bootup fixlet from Thomas Gleixner:
"A single fix for an early boot warning caused by invoking
this_cpu_has() before SMP initialization"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix bogus warning during EFI bootup, use boot_cpu_has() instead of this_cpu_has() in build_cr3_noflush()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq affinity fixes from Thomas Gleixner:
- Fix error path handling in the affinity spreading code
- Make affinity spreading smarter to avoid issues on systems which
claim to have hotpluggable CPUs while in fact they can't hotplug
anything.
So instead of trying to spread the vectors (and thereby the
associated device queues) to all possibe CPUs, spread them on all
present CPUs first. If there are left over vectors after that first
step they are spread among the possible, but not present CPUs which
keeps the code backwards compatible for virtual decives and NVME
which allocate a queue per possible CPU, but makes the spreading
smarter for devices which have less queues than possible or present
CPUs.
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/affinity: Spread irq vectors among present CPUs as far as possible
genirq/affinity: Allow irq spreading from a given starting point
genirq/affinity: Move actual irq vector spreading into a helper function
genirq/affinity: Rename *node_to_possible_cpumask as *node_to_cpumask
genirq/affinity: Don't return with empty affinity masks on error
|
|
Pull OpenRISC fixlet from Stafford Horne:
"Just one small thing here, it came in a while back but I didnt have
anything in my 4.16 queue, still its the only thing for 4.17 so
sending it alone.
Small cleanup: remove unused __ARCH_HAVE_MMU define"
* tag 'for-linus' of git://github.com/openrisc/linux:
openrisc: remove unused __ARCH_HAVE_MMU define
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix crashes when loading modules built with a different
CONFIG_RELOCATABLE value by adding CONFIG_RELOCATABLE to vermagic.
- Fix busy loops in the OPAL NVRAM driver if we get certain error
conditions from firmware.
- Remove tlbie trace points from KVM code that's called in real mode,
because it causes crashes.
- Fix checkstops caused by invalid tlbiel on Power9 Radix.
- Ensure the set of CPU features we "know" are always enabled is
actually the minimal set when we build with support for firmware
supplied CPU features.
Thanks to: Aneesh Kumar K.V, Anshuman Khandual, Nicholas Piggin.
* tag 'powerpc-4.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix CPU_FTRS_ALWAYS vs DT CPU features
powerpc/mm/radix: Fix checkstops caused by invalid tlbiel
KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
powerpc/8xx: Fix build with hugetlbfs enabled
powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
powerpc/fscr: Enable interrupts earlier before calling get_user()
powerpc/64s: Fix section mismatch warnings from setup_rfi_flush()
powerpc/modules: Fix crashes by adding CONFIG_RELOCATABLE to vermagic
|
|
Merge yet more updates from Andrew Morton:
- various hotfixes
- kexec_file updates and feature work
* emailed patches from Andrew Morton <[email protected]>: (27 commits)
kernel/kexec_file.c: move purgatories sha256 to common code
kernel/kexec_file.c: allow archs to set purgatory load address
kernel/kexec_file.c: remove mis-use of sh_offset field during purgatory load
kernel/kexec_file.c: remove unneeded variables in kexec_purgatory_setup_sechdrs
kernel/kexec_file.c: remove unneeded for-loop in kexec_purgatory_setup_sechdrs
kernel/kexec_file.c: split up __kexec_load_puragory
kernel/kexec_file.c: use read-only sections in arch_kexec_apply_relocations*
kernel/kexec_file.c: search symbols in read-only kexec_purgatory
kernel/kexec_file.c: make purgatory_info->ehdr const
kernel/kexec_file.c: remove checks in kexec_purgatory_load
include/linux/kexec.h: silence compile warnings
kexec_file, x86: move re-factored code to generic side
x86: kexec_file: clean up prepare_elf64_headers()
x86: kexec_file: lift CRASH_MAX_RANGES limit on crash_mem buffer
x86: kexec_file: remove X86_64 dependency from prepare_elf64_headers()
x86: kexec_file: purge system-ram walking from prepare_elf64_headers()
kexec_file,x86,powerpc: factor out kexec_file_ops functions
kexec_file: make use of purgatory optional
proc: revalidate misc dentries
mm, slab: reschedule cache_reap() on the same CPU
...
|
|
The code to verify the new kernels sha digest is applicable for all
architectures. Move it to common code.
One problem is the string.c implementation on x86. Currently sha256
includes x86/boot/string.h which defines memcpy and memset to be gcc
builtins. By moving the sha256 implementation to common code and
changing the include to linux/string.h both functions are no longer
defined. Thus definitions have to be provided in x86/purgatory/string.c
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
For s390 new kernels are loaded to fixed addresses in memory before they
are booted. With the current code this is a problem as it assumes the
kernel will be loaded to an 'arbitrary' address. In particular,
kexec_locate_mem_hole searches for a large enough memory region and sets
the load address (kexec_bufer->mem) to it.
Luckily there is a simple workaround for this problem. By returning 1
in arch_kexec_walk_mem, kexec_locate_mem_hole is turned off. This
allows the architecture to set kbuf->mem by hand. While the trick works
fine for the kernel it does not for the purgatory as here the
architectures don't have access to its kexec_buffer.
Give architectures access to the purgatories kexec_buffer by changing
kexec_load_purgatory to take a pointer to it. With this change
architectures have access to the buffer and can edit it as they need.
A nice side effect of this change is that we can get rid of the
purgatory_info->purgatory_load_address field. As now the information
stored there can directly be accessed from kbuf->mem.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Reviewed-by: Martin Schwidefsky <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The current code uses the sh_offset field in purgatory_info->sechdrs to
store a pointer to the current load address of the section. Depending
whether the section will be loaded or not this is either a pointer into
purgatory_info->purgatory_buf or kexec_purgatory. This is not only a
violation of the ELF standard but also makes the code very hard to
understand as you cannot tell if the memory you are using is read-only
or not.
Remove this misuse and store the offset of the section in
pugaroty_info->purgatory_buf in sh_offset.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The main loop currently uses quite a lot of variables to update the
section headers. Some of them are unnecessary. So clean them up a
little.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
To update the entry point there is an extra loop over all section
headers although this can be done in the main loop. So move it there
and eliminate the extra loop and variable to store the 'entry section
index'.
Also, in the main loop, move the usual case, i.e. non-bss section, out
of the extra if-block.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Reviewed-by: Martin Schwidefsky <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When inspecting __kexec_load_purgatory you find that it has two tasks
1) setting up the kexec_buffer for the new kernel and,
2) setting up pi->sechdrs for the final load address.
The two tasks are independent of each other. To improve readability
split up __kexec_load_purgatory into two functions, one for each task,
and call them directly from kexec_load_purgatory.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When the relocations are applied to the purgatory only the section the
relocations are applied to is writable. The other sections, i.e. the
symtab and .rel/.rela, are in read-only kexec_purgatory. Highlight this
by marking the corresponding variables as 'const'.
While at it also change the signatures of arch_kexec_apply_relocations* to
take section pointers instead of just the index of the relocation section.
This removes the second lookup and sanity check of the sections in arch
code.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The stripped purgatory does not contain a symtab. So when looking for
symbols this is done in read-only kexec_purgatory. Highlight this by
marking the corresponding variables as 'const'.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The kexec_purgatory buffer is read-only. Thus all pointers into
kexec_purgatory are read-only, too. Point this out by explicitly
marking purgatory_info->ehdr as 'const' and update the comments in
purgatory_info.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Before the purgatory is loaded several checks are done whether the ELF
file in kexec_purgatory is valid or not. These checks are incomplete.
For example they don't check for the total size of the sections defined
in the section header table or if the entry point actually points into
the purgatory.
On the other hand the purgatory, although an ELF file on its own, is
part of the kernel. Thus not trusting the purgatory means not trusting
the kernel build itself.
So remove all validity checks on the purgatory and just trust the kernel
build.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Vivek Goyal <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "kexec_file: Clean up purgatory load", v2.
Following the discussion with Dave and AKASHI, here are the common code
patches extracted from my recent patch set (Add kexec_file_load support
to s390) [1]. The patches were extracted to allow upstream integration
together with AKASHI's common code patches before the arch code gets
adjusted to the new base.
The reason for this series is to prepare common code for adding
kexec_file_load to s390 as well as cleaning up the mis-use of the
sh_offset field during purgatory load. In detail this series contains:
Patch #1&2: Minor cleanups/fixes.
Patch #3-9: Clean up the purgatory load/relocation code. Especially
remove the mis-use of the purgatory_info->sechdrs->sh_offset field,
currently holding a pointer into either kexec_purgatory (ro) or
purgatory_buf (rw) depending on the section. With these patches the
section address will be calculated verbosely and sh_offset will contain
the offset of the section in the stripped purgatory binary
(purgatory_buf).
Patch #10: Allows architectures to set the purgatory load address. This
patch is important for s390 as the kernel and purgatory have to be
loaded to fixed addresses. In current code this is impossible as the
purgatory load is opaque to the architecture.
Patch #11: Moves x86 purgatories sha implementation to common lib/
directory to allow reuse in other architectures.
This patch (of 11)
When building the kernel with CONFIG_KEXEC_FILE enabled gcc prints a
compile warning multiple times.
In file included from <path>/linux/init/initramfs.c:526:0:
<path>/include/linux/kexec.h:120:9: warning: `struct kimage' declared inside parameter list [enabled by default]
unsigned long cmdline_len);
^
This is because the typedefs for kexec_file_load uses struct kimage
before it is declared. Fix this by simply forward declaring struct
kimage.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thiago Jung Bauermann <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: AKASHI Takahiro <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In the previous patches, commonly-used routines, exclude_mem_range() and
prepare_elf64_headers(), were carved out. Now place them in kexec
common code. A prefix "crash_" is given to each of their names to avoid
possible name collisions.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: AKASHI Takahiro <[email protected]>
Acked-by: Dave Young <[email protected]>
Tested-by: Dave Young <[email protected]>
Cc: Vivek Goyal <[email protected]>
Cc: Baoquan He <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|