aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2020-09-21tools/io_uring: fix compile breakageDouglas Gilbert1-2/+2
It would seem none of the kernel continuous integration does this: $ cd tools/io_uring $ make Otherwise it may have noticed: cc -Wall -Wextra -g -D_GNU_SOURCE -c -o io_uring-bench.o io_uring-bench.c io_uring-bench.c:133:12: error: static declaration of ‘gettid’ follows non-static declaration 133 | static int gettid(void) | ^~~~~~ In file included from /usr/include/unistd.h:1170, from io_uring-bench.c:27: /usr/include/x86_64-linux-gnu/bits/unistd_ext.h:34:16: note: previous declaration of ‘gettid’ was here 34 | extern __pid_t gettid (void) __THROW; | ^~~~~~ make: *** [<builtin>: io_uring-bench.o] Error 1 The problem on Ubuntu 20.04 (with lk 5.9.0-rc5) is that unistd.h already defines gettid(). So prefix the local definition with "lk_". Signed-off-by: Douglas Gilbert <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-09-20Merge tag 'objtool_urgent_for_v5.9_rc6' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Borislav Petkov: "Fix noreturn detection for ignored sibling functions (Josh Poimboeuf)" * tag 'objtool_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix noreturn detection for ignored functions
2020-09-19selftests/vm: fix display of page size in map_hugetlbChristophe Leroy1-1/+1
The displayed size is in bytes while the text says it is in kB. Shift it by 10 to really display kBytes. Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb") Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/e27481224564a93d14106e750de31189deaa8bc8.1598861977.git.christophe.leroy@csgroup.eu Signed-off-by: Linus Torvalds <[email protected]>
2020-09-19selftests/seccomp: powerpc: Fix seccomp return value testingKees Cook1-0/+15
On powerpc, the errno is not inverted, and depends on ccr.so being set. Add this to a powerpc definition of SYSCALL_RET_SET(). Co-developed-by: Thadeu Lima de Souza Cascardo <[email protected]> Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Link: https://lore.kernel.org/linux-kselftest/[email protected]/ Fixes: 5d83c2b37d43 ("selftests/seccomp: Add powerpc support") Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Reviewed-by: Michael Ellerman <[email protected]>
2020-09-19selftests/seccomp: Remove SYSCALL_NUM_RET_SHARE_REG in favor of SYSCALL_RET_SETKees Cook1-10/+23
Instead of special-casing the specific case of shared registers, create a default SYSCALL_RET_SET() macro (mirroring SYSCALL_NUM_SET()), that writes to the SYSCALL_RET register. For architectures that can't set the return value (for whatever reason), they can define SYSCALL_RET_SET() without an associated SYSCALL_RET() macro. This also paves the way for architectures that need to do special things to set the return value (e.g. powerpc). Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Avoid redundant register flushesKees Cook1-2/+4
When none of the registers have changed, don't flush them back. This can happen if the architecture uses a non-register way to change the syscall (e.g. arm64) , and a return value hasn't been written. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Convert REGSET calls into ARCH_GETREG/ARCH_SETREGKees Cook1-27/+15
Consolidate the REGSET logic into the new ARCH_GETREG() and ARCH_SETREG() macros, avoiding more #ifdef code in function bodies. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Convert HAVE_GETREG into ARCH_GETREG/ARCH_SETREGKees Cook1-12/+15
Instead of special-casing the get/set-registers routines, move the HAVE_GETREG logic into the new ARCH_GETREG() and ARCH_SETREG() macros. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Remove syscall setting #ifdefsKees Cook1-13/+3
With all architectures now using the common SYSCALL_NUM_SET() macro, the arch-specific #ifdef can be removed from change_syscall() itself. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: mips: Remove O32-specific macroKees Cook1-6/+12
Instead of having the mips O32 macro special-cased, pull the logic into the SYSCALL_NUM() macro. Additionally include the ABI headers, since these appear to have been missing, leaving __NR_O32_Linux undefined. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: arm64: Define SYSCALL_NUM_SET macroKees Cook1-14/+13
Remove the arm64 special-case in change_syscall(). Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: arm: Define SYSCALL_NUM_SET macroKees Cook1-10/+6
Remove the arm special-case in change_syscall(). Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: mips: Define SYSCALL_NUM_SET macroKees Cook1-8/+9
Remove the mips special-case in change_syscall(). Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Provide generic syscall setting macroKees Cook1-2/+13
In order to avoid "#ifdef"s in the main function bodies, create a new macro, SYSCALL_NUM_SET(), where arch-specific logic can live. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Refactor arch register macros to avoid xtensa special caseKees Cook1-50/+47
To avoid an xtensa special-case, refactor all arch register macros to take the register variable instead of depending on the macro expanding as a struct member name. Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-19selftests/seccomp: Use __NR_mknodat instead of __NR_mknodKees Cook1-1/+1
The __NR_mknod syscall doesn't exist on arm64 (only __NR_mknodat). Switch to the modern syscall. Fixes: ad5682184a81 ("selftests/seccomp: Check for EPOLLHUP for user_notif") Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Acked-by: Christian Brauner <[email protected]>
2020-09-18tools/bootconfig: Add --init option for bconf2ftrace.shMasami Hiramatsu2-1/+120
Since the ftrace current setting may conflict with the new setting from bootconfig, add the --init option to initialize ftrace before setting for bconf2ftrace.sh. E.g. $ bconf2ftrace.sh --init boottrace.bconf This initialization method copied from selftests/ftrace. Link: https://lkml.kernel.org/r/159704853203.175360.17029578033994278231.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-09-18tools/bootconfig: Add a script to generates bootconfig from ftraceMasami Hiramatsu1-0/+244
Add a ftrace2bconf.sh under tools/bootconfig/scripts which generates a bootconfig file from the current ftrace settings. To read the ftrace settings, ftrace2bconf.sh requires the root privilege (or sudo). The ftrace2bconf.sh will output the bootconfig to stdout and error messages to stderr, so usually you'll run it as # ftrace2bconf.sh > ftrace.bconf Note that some ftrace configurations are not supported. For example, function-call/callgraph trace/notrace settings are not supported because the wildcard has been expanded and lost in the ftrace anymore. Link: https://lkml.kernel.org/r/159704852163.175360.16738029520293360558.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-09-18tools/bootconfig: Add a script to generate ftrace shell-command from bootconfigMasami Hiramatsu2-0/+245
Add a bconf2ftrace.sh under tools/bootconfig/scripts which generates a shell script to setup boot-time trace from bootconfig file for testing the bootconfig. bconf2ftrace.sh will take a bootconfig file (includes boot-time tracing) and convert it into a shell-script which is almost same as the boot-time tracer does. If --apply option is given, it also tries to apply those command to the running kernel, which requires the root privilege (or sudo). For example, if you just want to confirm the shell commands, save the output as below. # bconf2ftrace.sh ftrace.bconf > ftrace.sh Or, you can apply it directly. # bconf2ftrace.sh --apply ftrace.bconf Note that some boot-time tracing parameters under kernel.* are not able to set via tracefs nor procfs (e.g. tp_printk, traceoff_on_warning.), so those are ignored. Link: https://lkml.kernel.org/r/159704851101.175360.15119132351139842345.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-09-18tools/bootconfig: Make all functions staticMasami Hiramatsu1-8/+8
Make all functions static except for main(). This is just a cleanup. Link: https://lkml.kernel.org/r/159704850135.175360.12465608936326167517.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-09-18tools/bootconfig: Add list optionMasami Hiramatsu1-12/+40
Add list option (-l) to show the bootconfig in the list style. This is same output of /proc/bootconfig. So users can check how their bootconfig will be shown in procfs. This will help them to write a user-space script to parse the /proc/bootconfig. Link: https://lkml.kernel.org/r/159704849087.175360.8761890802048625207.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-09-18tools/bootconfig: Show bootconfig compact tree from bootconfig fileMasami Hiramatsu1-23/+58
Show the bootconfig compact tree from the bootconfig file instead of an initrd if the given file has no magic number and is smaller than 32KB. User can use this for checking the syntax error or output checking before applying the bootconfig to initrd. Link: https://lkml.kernel.org/r/159704848156.175360.6621139371000789360.stgit@devnote2 Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-09-19tools/bpftool: Support passing BPFTOOL_VERSION to makeTony Ambardar1-1/+1
This change facilitates out-of-tree builds, packaging, and versioning for test and debug purposes. Defining BPFTOOL_VERSION allows self-contained builds within the tools tree, since it avoids use of the 'kernelversion' target in the top-level makefile, which would otherwise pull in several other includes from outside the tools tree. Signed-off-by: Tony Ambardar <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Quentin Monnet <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-19selftests/bpf: Fix endianness issue in test_sockopt_skIlya Leoshkevich1-2/+2
getsetsockopt() calls getsockopt() with optlen == 1, but then checks the resulting int. It is ok on little endian, but not on big endian. Fix by checking char instead. Fixes: 8a027dc0d8f5 ("selftests/bpf: add sockopt test that exercises sk helpers") Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-18selftests/bpf: Fix endianness issue in sk_assignIlya Leoshkevich1-1/+1
server_map's value size is 8, but the test tries to put an int there. This sort of works on x86 (unless followed by non-0), but hard fails on s390. Fix by using __s64 instead of int. Fixes: 2d7824ffd25c ("selftests: bpf: Add test for sk_assign") Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-18Merge tag 'powerpc-5.9-5' of ↵Linus Torvalds1-2/+7
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 5.9: - Opt us out of the DEBUG_VM_PGTABLE support for now as it's causing crashes. - Fix a long standing bug in our DMA mask handling that was hidden until recently, and which caused problems with some drivers. - Fix a boot failure on systems with large amounts of RAM, and no hugepage support and using Radix MMU, only seen in the lab. - A few other minor fixes. Thanks to Alexey Kardashevskiy, Aneesh Kumar K.V, Gautham R. Shenoy, Hari Bathini, Ira Weiny, Nick Desaulniers, Shirisha Ganta, Vaibhav Jain, and Vaidyanathan Srinivasan" * tag 'powerpc-5.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/papr_scm: Limit the readability of 'perf_stats' sysfs attribute cpuidle: pseries: Fix CEDE latency conversion from tb to us powerpc/dma: Fix dma_map_ops::get_required_mask Revert "powerpc/build: vdso linker warning for orphan sections" powerpc/mm: Remove DEBUG_VM_PGTABLE support on powerpc selftests/powerpc: Skip PROT_SAO test in guests/LPARS powerpc/book3s64/radix: Fix boot failure with large amount of guest memory
2020-09-18objtool: Fix noreturn detection for ignored functionsJosh Poimboeuf1-1/+1
When a function is annotated with STACK_FRAME_NON_STANDARD, objtool doesn't validate its code paths. It also skips sibling call detection within the function. But sibling call detection is actually needed for the case where the ignored function doesn't have any return instructions. Otherwise objtool naively marks the function as implicit static noreturn, which affects the reachability of its callers, resulting in "unreachable instruction" warnings. Fix it by just enabling sibling call detection for ignored functions. The 'insn->ignore' check in add_jump_destinations() is no longer needed after e6da9567959e ("objtool: Don't use ignore flag for fake jumps"). Fixes the following warning: arch/x86/kvm/vmx/vmx.o: warning: objtool: vmx_handle_exit_irqoff()+0x142: unreachable instruction which triggers on an allmodconfig with CONFIG_GCOV_KERNEL unset. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/5b1e2536cdbaa5246b60d7791b76130a74082c62.1599751464.git.jpoimboe@redhat.com
2020-09-18objtool: Ignore unreachable fake jumpsJulien Thierry1-0/+3
It is possible for alternative code to unconditionally jump out of the alternative region. In such a case, if a fake jump is added at the end of the alternative instructions, the fake jump will never be reached. Since the fake jump is just a mean to make sure code validation does not go beyond the set of alternatives, reaching it is not a requirement. Signed-off-by: Julien Thierry <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]>
2020-09-18objtool: Remove useless tests before save_reg()Julien Thierry1-4/+2
save_reg already checks that the register being saved does not already have a saved state. Remove redundant checks before processing a register storing operation. Signed-off-by: Julien Thierry <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]>
2020-09-18selftests: arm64: Add build and documentation for FP testsMark Brown4-1/+123
Integrate the FP tests with the build system and add some documentation for the ones run outside the kselftest infrastructure. The content in the README was largely written by Dave Martin with edits by me. Signed-off-by: Mark Brown <[email protected]> Acked-by: Dave Martin <[email protected]> Acked-by: Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18selftests: arm64: Add wrapper scripts for stress testsMark Brown2-0/+119
Add wrapper scripts which invoke fpsimd-test and sve-test with several copies per CPU such that the context switch code will be appropriately exercised. Signed-off-by: Mark Brown <[email protected]> Acked-by: Dave Martin <[email protected]> Acked-by: Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18selftests: arm64: Add utility to set SVE vector lengthsMark Brown1-0/+155
vlset is a small utility for use in conjunction with tests like the sve-test stress test which allows another executable to be invoked with a configured SVE vector length. Signed-off-by: Mark Brown <[email protected]> Acked-by: Dave Martin <[email protected]> Acked-by: Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18selftests: arm64: Add stress tests for FPSMID and SVE context switchingMark Brown4-0/+1222
Add programs sve-test and fpsimd-test which spin reading and writing to the SVE and FPSIMD registers, verifying the operations they perform. The intended use is to leave them running to stress the context switch code's handling of these registers which isn't compatible with what kselftest does so they're not integrated into the framework but there's no other obvious testsuite where they fit so let's store them here. These tests were written by Dave Martin and lightly adapted by me. Signed-off-by: Mark Brown <[email protected]> Acked-by: Dave Martin <[email protected]> Acked-by: Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18selftests: arm64: Add test for the SVE ptrace interfaceMark Brown2-0/+369
Add a test case that does some basic verification of the SVE ptrace interface, forking off a child with known values in the registers and then using ptrace to inspect and manipulate the SVE registers of the child, including in FPSIMD mode to account for sharing between the SVE and FPSIMD registers. This program was written by Dave Martin and modified for kselftest by me. Signed-off-by: Mark Brown <[email protected]> Acked-by: Dave Martin <[email protected]> Acked-by: Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18selftests: arm64: Test case for enumeration of SVE vector lengthsMark Brown1-0/+58
Add a test case that verifies that we can enumerate the SVE vector lengths on systems where we detect SVE, and that those SVE vector lengths are valid. This program was written by Dave Martin and adapted to kselftest by me. Signed-off-by: Mark Brown <[email protected]> Acked-by: Dave Martin <[email protected]> Acked-by: Shuah Khan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18kselftests/arm64: add PAuth tests for single threaded consistency and ↵Boyan Karatotev1-0/+123
differently initialized keys PAuth adds 5 different keys that can be used to sign addresses. Add a test that verifies that the kernel initializes them to different values and preserves them across context switches. Signed-off-by: Boyan Karatotev <[email protected]> Reviewed-by: Vincenzo Frascino <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Acked-by: Shuah Khan <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18kselftests/arm64: add PAuth test for whether exec() changes keysBoyan Karatotev4-0/+198
Kernel documentation states that it will change PAuth keys on exec() calls. Verify that all keys are correctly switched to new ones. Signed-off-by: Boyan Karatotev <[email protected]> Reviewed-by: Vincenzo Frascino <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Acked-by: Shuah Khan <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18kselftests/arm64: add nop checks for PAuth testsBoyan Karatotev5-2/+105
PAuth adds sign/verify controls to enable and disable groups of instructions in hardware for compatibility with libraries that do not implement PAuth. The kernel always enables them if it detects PAuth. Add a test that checks that each group of instructions is enabled, if the kernel reports PAuth as detected. Note: For groups, for the purpose of this patch, we intend instructions that use a certain key. Signed-off-by: Boyan Karatotev <[email protected]> Reviewed-by: Vincenzo Frascino <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Acked-by: Shuah Khan <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18kselftests/arm64: add a basic Pointer Authentication testBoyan Karatotev6-1/+106
PAuth signs and verifies return addresses on the stack. It does so by inserting a Pointer Authentication code (PAC) into some of the unused top bits of an address. This is achieved by adding paciasp/autiasp instructions at the beginning and end of a function. This feature is partially backwards compatible with earlier versions of the ARM architecture. To coerce the compiler into emitting fully backwards compatible code the main file is compiled to target an earlier ARM version. This allows the tests to check for the feature and print meaningful error messages instead of crashing. Add a test to verify that corrupting the return address results in a SIGSEGV on return. Signed-off-by: Boyan Karatotev <[email protected]> Reviewed-by: Vincenzo Frascino <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Acked-by: Shuah Khan <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-09-18perf probe: Fall back to debuginfod query if debuginfo and source not found ↵Masami Hiramatsu3-8/+118
locally Since 'perf probe' heavily depends on debuginfo, debuginfod gives us many benefits on the 'perf probe' command on remote machine. Especially, this will be helpful for the embedded devices which will not have enough storage, or boot with a cross-build kernel whose source code is in the host machine. This will work as similar to commit c7a14fdcb3fa7736 ("perf build-ids: Fall back to debuginfod query if debuginfo not found") Tested with: (host) $ cd PATH/TO/KBUILD/DIR/ (host) $ debuginfod -F . ... (remote) # perf probe -L vfs_read Failed to find the path for the kernel: No such file or directory Error: Failed to show lines. (remote) # export DEBUGINFOD_URLS="http://$HOST_IP:8002/" (remote) # perf probe -L vfs_read <vfs_read@...> 0 ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos) { 2 ssize_t ret; if (!(file->f_mode & FMODE_READ)) return -EBADF; 6 if (!(file->f_mode & FMODE_CAN_READ)) return -EINVAL; 8 if (unlikely(!access_ok(buf, count))) return -EFAULT; 11 ret = rw_verify_area(READ, file, pos, count); 12 if (ret) return ret; if (count > MAX_RW_COUNT) ... (remote) # perf probe -a "vfs_read count" Added new event: probe:vfs_read (on vfs_read with count) (remote) # perf probe -l probe:vfs_read (on vfs_read@ksrc/linux/fs/read_write.c with count) Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Frank Ch. Eigler <[email protected]> Cc: Aaron Merey <[email protected]> Cc: Daniel Thompson <[email protected]> Link: http://lore.kernel.org/lkml/160041610083.912668.13659563860278615846.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-18perf probe: Fix to adjust symbol address with correct reloc_sym addressMasami Hiramatsu1-3/+5
'perf probe' uses ref_reloc_sym to adjust symbol offset address from debuginfo address or ref_reloc_sym based address, but that is misusing reloc_sym->addr and reloc_sym->unrelocated_addr. If map is not relocated (map->reloc == 0), we can use reloc_sym->addr as unrelocated address instead of reloc_sym->unrelocated_addr. This usually does not happen. If we have a non-stripped ELF binary, we will use it for map and debuginfo, if not, we use only kallsyms without debuginfo. Thus, the map is always relocated (ELF and DWARF binary) or not relocated (kallsyms). However, if we allow the combination of debuginfo and kallsyms based map (like using debuginfod), we have to check the map->reloc and choose the collect address of reloc_sym. Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Frank Ch. Eigler <[email protected]> Cc: Aaron Merey <[email protected]> Cc: Daniel Thompson <[email protected]> Link: http://lore.kernel.org/lkml/160041609047.912668.14314639291419159274.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-17selftests/bpf: Add tailcall_bpf2bpf testsMaciej Fijalkowski5-0/+533
Add four tests to tailcalls selftest explicitly named "tailcall_bpf2bpf_X" as their purpose is to validate that combination of tailcalls with bpf2bpf calls are working properly. These tests also validate LD_ABS from subprograms. Signed-off-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2020-09-17bpf: Add abnormal return checks.Alexei Starovoitov1-3/+3
LD_[ABS|IND] instructions may return from the function early. bpf_tail_call pseudo instruction is either fallthrough or return. Allow them in the subprograms only when subprograms are BTF annotated and have scalar return types. Allow ld_abs and tail_call in the main program even if it calls into subprograms. In the past that was not ok to do for ld_abs, since it was JITed with special exit sequence. Since bpf_gen_ld_abs() was introduced the ld_abs looks like normal exit insn from JIT point of view, so it's safe to allow them in the main program. Signed-off-by: Alexei Starovoitov <[email protected]>
2020-09-17selftests: Set default protocol for raw sockets in nettestDavid Ahern1-0/+2
IPPROTO_IP (0) is not valid for raw sockets. Default the protocol for raw sockets to IPPROTO_RAW if the protocol has not been set via the -P option. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-17selftests/harness: Flush stdout before forkingMichael Ellerman1-0/+5
The test harness forks() a child to run each test. Both the parent and the child print to stdout using libc functions. That can lead to duplicated (or more) output if the libc buffers are not flushed before forking. It's generally not seen when running programs directly, because stdout will usually be line buffered when it's pointing to a terminal. This was noticed when running the seccomp_bpf test, eg: $ ./seccomp_bpf | tee test.log $ grep -c "TAP version 13" test.log 2 But we only expect the TAP header to appear once. It can be exacerbated using stdbuf to increase the buffer size: $ stdbuf -o 1MB ./seccomp_bpf > test.log $ grep -c "TAP version 13" test.log 13 The fix is simple, we just flush stdout & stderr before fork. Usually stderr is unbuffered, but that can be changed, so flush it as well just to be safe. Signed-off-by: Michael Ellerman <[email protected]> Tested-by: Max Filippov <[email protected]> Acked-by: Shuah Khan <[email protected]> Acked-by: Kees Cook <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2020-09-17selftests: mptcp: interpret \n as a new lineMatthieu Baerts2-4/+4
In case of errors, this message was printed: (...) # read: Resource temporarily unavailable # client exit code 0, server 3 # \nnetns ns1-0-BJlt5D socket stat for 10003: (...) Obviously, the idea was to add a new line before the socket stat and not print "\nnetns". Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN") Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Signed-off-by: Matthieu Baerts <[email protected]> Acked-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-17perf metricgroup: Fix uncore metric expressionsIan Rogers1-19/+56
A metric like DRAM_BW_Use has on SkylakeX events uncore_imc/cas_count_read/ and uncore_imc/case_count_write/. These events open 6 events per socket with pmu names of uncore_imc_[0-5]. The current metric setup code in find_evsel_group assumes one ID will map to 1 event to be recorded in metric_events. For events with multiple matches, the first event is recorded in metric_events (avoiding matching >1 event with the same name) and the evlist_used updated so that duplicate events aren't removed when the evlist has unused events removed. Before this change: $ /tmp/perf/perf stat -M DRAM_BW_Use -a -- sleep 1 Performance counter stats for 'system wide': 41.14 MiB uncore_imc/cas_count_read/ 1,002,614,251 ns duration_time 1.002614251 seconds time elapsed After this change: $ /tmp/perf/perf stat -M DRAM_BW_Use -a -- sleep 1 Performance counter stats for 'system wide': 157.47 MiB uncore_imc/cas_count_read/ # 0.00 DRAM_BW_Use 126.97 MiB uncore_imc/cas_count_write/ 1,003,019,728 ns duration_time Erroneous duplication introduced in: commit 2440689d62e9 ("perf metricgroup: Remove duped metric group events"). Fixes: ded80bda8bc9 ("perf expr: Migrate expr ids table to a hashmap"). Reported-by: Jin Yao <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Yonghong Song <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-17perf intel-pt: Fix "context_switch event has no tid" errorAdrian Hunter1-4/+4
A context_switch event can have no tid because pids can be detached from a task while the task is still running (in do_exit()). Note this won't happen with per-task contexts because then tracing stops at perf_event_exit_task() If a task with no tid gets preempted, or a dying task gets preempted and its parent releases it, when it subsequently gets switched back in, Intel PT will not be able to determine what task is running and prints an error "context_switch event has no tid". However, it is not really an error because the task is in kernel space and the decoder can continue to decode successfully. Fix by changing the error to be only a logged message, and make allowance for tid == -1. Example: Using 5.9-rc4 with Preemptible Kernel (Low-Latency Desktop) e.g. $ uname -r 5.9.0-rc4 $ grep PREEMPT .config # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_PREEMPT_COUNT=y CONFIG_PREEMPTION=y CONFIG_PREEMPT_RCU=y CONFIG_PREEMPT_NOTIFIERS=y CONFIG_DRM_I915_PREEMPT_TIMEOUT=640 CONFIG_DEBUG_PREEMPT=y # CONFIG_PREEMPT_TRACER is not set # CONFIG_PREEMPTIRQ_DELAY_TEST is not set Before: $ cat forkit.c #include <sys/types.h> #include <unistd.h> #include <sys/wait.h> int main() { pid_t child; int status = 0; child = fork(); if (child == 0) return 123; wait(&status); return 0; } $ gcc -o forkit forkit.c $ sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k & [1] 11016 $ taskset 2 ./forkit $ sudo pkill perf $ [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 17.262 MB perf.data ] [1]+ Terminated sudo ~/bin/perf record --kcore -a -m,64M -e intel_pt/cyc/k $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit context_switch event has no tid taskset 11019 [001] 66663.270045029: 1 instructions:k: ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms]) taskset 11019 [001] 66663.270201816: 1 instructions:k: ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms]) forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019 forkit 11019 [001] 66663.270420028: 1 instructions:k: ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms]) forkit 11019 [001] 66663.270648704: 1 instructions:k: ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms]) forkit 11019 [001] 66663.270833163: 1 instructions:k: ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms]) forkit 11019 [001] 66663.271092359: 1 instructions:k: ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms]) forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019) forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 11020/11020 forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11019/11019 forkit 11020 [001] 66663.271312066: 1 instructions:k: ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms]) forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019) forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11019/11019 forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11020/11020 forkit 11019 [001] 66663.271517241: 1 instructions:k: ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms]) forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386) After: $ sudo ~/bin/perf script --show-task-events --show-switch-events --itrace=iqqe-o -C 1 --ns | grep -C 2 forkit taskset 11019 [001] 66663.270045029: 1 instructions:k: ffffffffb1d9f844 strnlen_user+0xb4 ([kernel.kallsyms]) taskset 11019 [001] 66663.270201816: 1 instructions:k: ffffffffb1a83121 unmap_page_range+0x561 ([kernel.kallsyms]) forkit 11019 [001] 66663.270327553: PERF_RECORD_COMM exec: forkit:11019/11019 forkit 11019 [001] 66663.270420028: 1 instructions:k: ffffffffb1db9537 __clear_user+0x27 ([kernel.kallsyms]) forkit 11019 [001] 66663.270648704: 1 instructions:k: ffffffffb18829e6 do_user_addr_fault+0xf6 ([kernel.kallsyms]) forkit 11019 [001] 66663.270833163: 1 instructions:k: ffffffffb230a825 irqentry_exit_to_user_mode+0x15 ([kernel.kallsyms]) forkit 11019 [001] 66663.271092359: 1 instructions:k: ffffffffb1aea3d9 lock_page_memcg+0x9 ([kernel.kallsyms]) forkit 11019 [001] 66663.271207092: PERF_RECORD_FORK(11020:11020):(11019:11019) forkit 11019 [001] 66663.271234775: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 11020/11020 forkit 11020 [001] 66663.271238407: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11019/11019 forkit 11020 [001] 66663.271312066: 1 instructions:k: ffffffffb1a88140 handle_mm_fault+0x10 ([kernel.kallsyms]) forkit 11020 [001] 66663.271476225: PERF_RECORD_EXIT(11020:11020):(11019:11019) forkit 11020 [001] 66663.271497488: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11019/11019 forkit 11019 [001] 66663.271500523: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11020/11020 forkit 11019 [001] 66663.271517241: 1 instructions:k: ffffffffb24012cd error_entry+0x6d ([kernel.kallsyms]) forkit 11019 [001] 66663.271664080: PERF_RECORD_EXIT(11019:11019):(1386:1386) forkit 11019 [001] 66663.271688752: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: -1/-1 :-1 -1 [001] 66663.271692086: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11019/11019 :-1 -1 [001] 66663.271707466: 1 instructions:k: ffffffffb18eb096 update_load_avg+0x306 ([kernel.kallsyms]) Fixes: 86c2786994bd7c ("perf intel-pt: Add support for PERF_RECORD_SWITCH") Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Yu-cheng Yu <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-17perf script: Display negative tid in non-sample eventsAdrian Hunter2-5/+7
The kernel can release tasks while they are still running. This can result in a task having no tid, in which case perf records a tid of -1. Improve the perf script output in that case. Example: Before: # cat ./autoreap.c #include <sys/types.h> #include <unistd.h> #include <sys/wait.h> #include <signal.h> struct sigaction act = { .sa_handler = SIG_IGN, }; int main() { pid_t child; int status = 0; sigaction(SIGCHLD, &act, NULL); child = fork(); if (child == 0) return 123; wait(&status); return 0; } # gcc -o autoreap autoreap.c # ./perf record -a -e dummy --switch-events ./autoreap [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.948 MB perf.data ] # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1' swapper 0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189 perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189 autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189) autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189 swapper 0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25191/25191 autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 swapper 0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11 rcu_preempt 11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 rcu_preempt 11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11 autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189) PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4294967295/4294967295 swapper 0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189 autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188) autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189 swapper 0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25188/25188 After: # ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1' swapper 0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189 perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189 autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189) autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189 swapper 0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25191/25191 autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 swapper 0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11 rcu_preempt 11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 rcu_preempt 11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11 autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189) :-1 -1 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: -1/-1 swapper 0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189 autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188) autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189 swapper 0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25188/25188 Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Yu-cheng Yu <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-17perf docs: Improve help information in perf.txtZejiang Tang1-22/+47
perf has many undocumented options, such as:-vv, --exec-path, --html-path, -p, --paginate,--no-pager, --debugfs-dir, --buildid-dir, --list-cmds, --list-opts. Add entris for these options in perf.txt. Signed-off-by: Zejiang Tang <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>