Age | Commit message | Author | Files | Lines |
|
When structs are used to store temporary state in the cb[] buffer that is
used with programs and among tail calls, then the generated code will
not always access the buffer in bpf_w chunks. We can ease programming
of it and let this act more naturally by allowing aligned b/h/w/dw
sized accesses to the cb[] ctx member. Various test cases are attached as
well for the selftest suite. Potentially, this can also be reused for
other program types to pass data around.
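As an illustration of the pattern this enables (a minimal sketch only; the
struct layout, section name and field names below are hypothetical, assume
the usual <linux/bpf.h>/bpf_helpers.h setup and are not part of this patch),
a tc BPF program can overlay a small struct on skb->cb[] and use
byte/half/word members directly:
struct scratch {
	__u8  flags;	/* 1-byte (b) access */
	__u8  pad;
	__u16 proto;	/* 2-byte (h) access */
	__u32 mark;	/* 4-byte (w) access */
};	/* must fit into the 20 bytes of skb->cb[] */
SEC("classifier")
int set_state(struct __sk_buff *skb)
{
	struct scratch *s = (struct scratch *)skb->cb;
	s->flags = 1;		/* before this change, only aligned 4-byte */
	s->proto = 0x0800;	/* (w) accesses to cb[] were accepted      */
	s->mark  = skb->mark;
	return 0;
}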
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
To pick the changes from:
1b07304c587d ("KVM: nVMX: support descriptor table exits")
which adds entries to VMX_EXIT_REASONS, which is used by
tools/perf/arch/x86/util/kvm-stat.c.
This also picks the changes in:
1dc35dacc16b ("KVM: nVMX: check host CR3 on vmentry and vmexit")
But these are not used in 'perf kvm stat'; this is done just to silence
the kernel/tools file cache coherency detector:
$ make -C tools/perf
make: Entering directory '/home/acme/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Ladi Prosek <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's now possible to specify the threshold time for perf.data like:
$ perf record --switch-output=30s ...
Once it is reached, the current data is dumped into the
perf.data.<timestamp> file and the session goes on.
$ perf record --switch-output=30s ...
[ perf record: dump data: Woken up 44 times ]
[ perf record: Dump perf.data.2017010213043746 ]
...
The time is expected to be a number with appended unit
character - s/m/h/d.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Wang Nan <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a switch-output size warning if the requested
size is lower than the wakeup ring buffer size.
$ perf record --switch-output=1K ls
WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes
...
Signed-off-by: Jiri Olsa <[email protected]>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's now possible to specify the threshold size for perf.data like:
$ perf record --switch-output=2G ...
Once it is reached, the current data is dumped into the
perf.data.<timestamp> file and the session goes on.
$ perf record --switch-output=2G ...
[ perf record: dump data: Woken up 7244 times ]
[ perf record: Dump perf.data.2017010214093746 ]
...
The size is expected to be a number with appended unit character -
B/K/M/G.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Wang Nan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The next patches will add --switch-output option arguments; change the
option to allow that and set its default value to 'signal'.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Wang Nan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The next patches will add more --switch-output option arguments,
so prepare the data holder for them.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Wang Nan <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add unit_number__scnprintf function to display size units and use it in
-m option info message.
Before:
$ perf record -m 10M ls
rounding mmap pages size to 16777216 bytes (4096 pages)
...
After:
$ perf record -m 10M ls
rounding mmap pages size to 16M (4096 pages)
...
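For reference, a helper of this kind can be approximated as below (a sketch
only, assuming <stdio.h>; not necessarily the exact tools/perf
implementation):
static int unit_number__scnprintf(char *buf, size_t size, unsigned long long n)
{
	char unit[] = "BKMG";
	int i = 0;
	/* divide by 1024 for as long as the value stays exact */
	while (((n / 1024) * 1024 == n) && i < 3) {
		n /= 1024;
		i++;
	}
	return snprintf(buf, size, "%llu%c", n, unit[i]);
}
With this, 16777216 bytes prints as "16M", matching the message above.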
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Rename it to unit_number__scnprintf for consistency ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch fixes a typo: s/enable to/unable to/
Signed-off-by: Soramichi AKIYAMA <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: bcf3145fbeb1 ("perf evlist: Enhance perf_evlist__start_workload()")
Link: http://lkml.kernel.org/r/[email protected]
[ Wasn't applying, fixed it up by hand, added Fixes: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Makes it easier to specify both events and syscalls (to be formatted
strace-like), i.e. previously one would have to do:
# perf trace -e nanosleep --event sched:sched_switch usleep 1
Now it is possible to do:
# perf trace -e nanosleep,sched:sched_switch usleep 1
0.000 ( 0.021 ms): usleep/17962 nanosleep(rqtp: 0x7ffdedd61ec0) ...
0.021 ( ): sched:sched_switch:usleep:17962 [120] S ==> swapper/1:0 [120])
0.000 ( 0.066 ms): usleep/17962 ... [continued]: nanosleep()) = 0
#
The old style --expr and using both -e and --event continue to work.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce a tool to look up extended symbol information on the running
kernel.
It is similar to grepping /proc/kallsyms, but it also shows extra
information, such as the path to the kernel module and the unrelocated
addresses in it, to help in diagnosing problems.
It also helps demonstrate the use of the symbol routines, so that
tool writers can use them more effectively.
Using it:
$ perf kallsyms e1000_xmit_frame netif_rx usb_stor_set_xfer_buf
e1000_xmit_frame: [e1000e] /lib/modules/4.9.0+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 0xffffffffc046fc10-0xffffffffc0470bb0 (0x19c80-0x1ac20)
netif_rx: [kernel] [kernel.kallsyms] 0xffffffff916f03a0-0xffffffff916f0410 (0xffffffff916f03a0-0xffffffff916f0410)
usb_stor_set_xfer_buf: [usb_storage] /lib/modules/4.9.0+/kernel/drivers/usb/storage/usb-storage.ko 0xffffffffc057aea0-0xffffffffc057af19 (0xf10-0xf89)
$
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To reduce the boilerplate for searching for functions in the running
kernel and modules.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
As it was getting the BUILD_BUG_ON_ZERO() definition by luck.
Cc: Jiri Olsa <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The install command for libperf-jvmti.so does not check if libdir exists
before installing. This means that when the install command is run:
install libperf-jvmti.so '/tmp/test_root/usr/lib64';
libperf-jvmti.so gets installed as a plain file at that path instead of
into a directory, breaking further installation. Fix this by ensuring
the directory gets created first.
See https://bugzilla.redhat.com/show_bug.cgi?id=1410296
Signed-off-by: Laura Abbott <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: d4dfdf00d43e ("perf jvmti: Plug compilation into perf build")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Two AF_* families were adding entries to the lockdep tables
at the same time.
Signed-off-by: David S. Miller <[email protected]>
|
|
Merge fixes from Andrew Morton:
"27 fixes.
There are three patches that aren't actually fixes. They're simple
function renamings which are nice-to-have in mainline as ongoing net
development depends on them."
* akpm: (27 commits)
timerfd: export defines to userspace
mm/hugetlb.c: fix reservation race when freeing surplus pages
mm/slab.c: fix SLAB freelist randomization duplicate entries
zram: support BDI_CAP_STABLE_WRITES
zram: revalidate disk under init_lock
mm: support anonymous stable page
mm: add documentation for page fragment APIs
mm: rename __page_frag functions to __page_frag_cache, drop order from drain
mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
mm: don't dereference struct page fields of invalid pages
mailmap: add codeaurora.org names for nameless email commits
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
mm: pmd dirty emulation in page fault handler
ipc/sem.c: fix incorrect sem_lock pairing
lib/Kconfig.debug: fix frv build failure
mm: get rid of __GFP_OTHER_NODE
mm: fix remote numa hits statistics
mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
ocfs2: fix crash caused by stale lvb with fsdlm plugin
...
|
|
Errors that we expect should not be spilled to stdout.
Without this we get:
./fw_filesystem.sh: line 58: printf: write error: Invalid argument
./fw_filesystem.sh: line 63: printf: write error: No such device
./fw_filesystem.sh: line 69: echo: write error: No such file or directory
./fw_filesystem.sh: filesystem loading works
./fw_filesystem.sh: async filesystem loading works
With it:
./fw_filesystem.sh: filesystem loading works
./fw_filesystem.sh: async filesystem loading works
Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
No need to load test_firmware if it's already there.
Also use a more generic form to recommend what needs
to be built.
Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The flag was introduced by commit 78afd5612deb ("mm: add
__GFP_OTHER_NODE flag") to allow proper accounting of remote node
allocations done by kernel daemons on behalf of a process - e.g.
khugepaged.
After "mm: fix remote numa hits statistics" we do not need and actually
use the flag so we can safely remove it because all allocations which
are satisfied from their "home" node are accounted properly.
[[email protected]: fix build]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Taku Izumi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Update README file:
- remove outdated parts
- clarify terminology and general structure
- add some description of vUDC
Signed-off-by: Krzysztof Opasiak <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Add a simple script which creates a USB gadget using ConfigFS
and then exports it using vUDC.
This may be useful for people who have just started playing with
USB/IP and vUDC, as it shows the exact steps needed to set everything up.
Signed-off-by: Krzysztof Opasiak <[email protected]>
Acked-by: Shuah Khan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Currently, helpers that read and write from/to the stack can do so using
a pair of arguments of type ARG_PTR_TO_STACK and ARG_CONST_STACK_SIZE.
ARG_CONST_STACK_SIZE accepts a constant register of type CONST_IMM, so
that the verifier can safely check the memory access. However, requiring
the argument to be a constant can be limiting in some circumstances.
Since the current logic keeps track of the minimum and maximum value of
a register throughout the simulated execution, ARG_CONST_STACK_SIZE can
be changed to also accept an UNKNOWN_VALUE register in case its
boundaries have been set and the range doesn't cause invalid memory
accesses.
One common situation when this is useful:
int len;
char buf[BUFSIZE]; /* BUFSIZE is 128 */

if (some_condition)
	len = 42;
else
	len = 84;

some_helper(..., buf, len & (BUFSIZE - 1));
The compiler can often decide to assign the constant values 42 or 84
to a variable on the stack, instead of keeping it in a register. When
the variable is then read back from the stack into a register in order to
be passed to the helper, the verifier will not be able to recognize the
register as constant (the verifier is not currently tracking all
constant writes into memory), and the program won't be valid.
However, by allowing the helper to accept an UNKNOWN_VALUE register,
this program will work because the bitwise AND operation will set the
range of possible values for the UNKNOWN_VALUE register to [0, BUFSIZE),
so the verifier can guarantee the helper call will be safe (assuming the
argument is of type ARG_CONST_STACK_SIZE_OR_ZERO, otherwise one more
check against 0 would be needed). Custom ranges can be set not only with
ALU operations, but also by explicitly comparing the UNKNOWN_VALUE
register with constants.
Another very common example happens when intercepting system call
arguments and accessing user-provided data of variable size using
bpf_probe_read(). One can load at runtime the user-provided length in an
UNKNOWN_VALUE register, and then read that exact amount of data up to a
compile-time determined limit in order to fit into the proper local
storage allocated on the stack, without having to guess a suboptimal
access size at compile time.
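A sketch of that pattern (illustrative only; len_ptr and data_ptr are
hypothetical pointers obtained from the probed context, and the helper
argument is assumed to be ARG_CONST_STACK_SIZE_OR_ZERO as described above):
char buf[256];
unsigned long len = 0;

bpf_probe_read(&len, sizeof(len), len_ptr);	/* len becomes UNKNOWN_VALUE */
len &= sizeof(buf) - 1;				/* range now known: [0, 256)  */
bpf_probe_read(buf, len, data_ptr);		/* accepted by the verifier   */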
Also, in case the helpers accepting the UNKNOWN_VALUE register operate
in raw mode, disable raw mode so that the program is required to
initialize all memory, since there is no guarantee the helper will fill
it completely, which could otherwise leak data (this is only relevant when
the memory used by the helper is the stack, not when using a pointer to a
map element value or a packet). In other words, ARG_PTR_TO_RAW_STACK will
be treated as ARG_PTR_TO_STACK.
Signed-off-by: Gianluca Borello <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
commit 484611357c19 ("bpf: allow access into map value arrays")
introduces the ability to do pointer math inside a map element value via
the PTR_TO_MAP_VALUE_ADJ register type.
The current support doesn't handle the case where a PTR_TO_MAP_VALUE_ADJ
is spilled into the stack, limiting several use cases, especially when
generating bpf code from a compiler.
Handle this case by explicitly enabling the register type
PTR_TO_MAP_VALUE_ADJ to be spilled. Also, make sure that min_value and
max_value are reset just for BPF_LDX operations that don't result in a
restore of a spilled register from stack.
Signed-off-by: Gianluca Borello <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Enable helpers to directly access a map element value by passing a
register type PTR_TO_MAP_VALUE (or PTR_TO_MAP_VALUE_ADJ) to helper
arguments ARG_PTR_TO_STACK or ARG_PTR_TO_RAW_STACK.
This enables several use cases. For example, a typical tracing program
might want to capture pathnames passed to sys_open() with:
struct trace_data {
	char pathname[PATHLEN];
};

SEC("kprobe/sys_open")
void bpf_sys_open(struct pt_regs *ctx)
{
	struct trace_data data;

	bpf_probe_read(data.pathname, sizeof(data.pathname), ctx->di);
	/* consume data.pathname, for example via
	 * bpf_trace_printk() or bpf_perf_event_output()
	 */
}
Such a program could easily hit the stack limit in case PATHLEN needs to
be large or more local variables need to exist, both of which are quite
common scenarios. With direct helper access to map element values,
one could instead do:
struct bpf_map_def SEC("maps") scratch_map = {
	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
	.key_size = sizeof(u32),
	.value_size = sizeof(struct trace_data),
	.max_entries = 1,
};
SEC("kprobe/sys_open")
int bpf_sys_open(struct pt_regs *ctx)
{
int id = 0;
struct trace_data *p = bpf_map_lookup_elem(&scratch_map, &id);
if (!p)
return;
bpf_probe_read(p->pathname, sizeof(p->pathname), ctx->di);
/* consume p->pathname, for example via
* bpf_trace_printk() or bpf_perf_event_output()
*/
}
This would not risk exhausting the stack.
Code changes are loosely modeled after commit 6841de8b0d03 ("bpf: allow
helpers access the packet directly"). Unlike with PTR_TO_PACKET, these
changes just work with ARG_PTR_TO_STACK and ARG_PTR_TO_RAW_STACK (not
ARG_PTR_TO_MAP_KEY, ARG_PTR_TO_MAP_VALUE, ...): adding those would be
trivial, but since there is not currently a use case for that, it's
reasonable to limit the set of changes.
Also, add new tests to make sure accesses to map element values from
helpers never go out of bounds, even when adjusted.
Signed-off-by: Gianluca Borello <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix spelling mistake in print test pass message.
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <[email protected]>
Cc: [email protected]
Signed-off-by: Shuah Khan <[email protected]>
|
|
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <[email protected]>
Cc: [email protected]
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <[email protected]>
Cc: [email protected]
Signed-off-by: Shuah Khan <[email protected]>
|
|
Block the Rx tests until the PF_PACKET socket has been bound to loopback.
Packets from any/all interfaces may be queued up on the PF_PACKET socket
before it is bound to the loopback interface by psock_tpacket, and
when these are passed up by the kernel, they could interfere
with the Rx tests.
Avoid interference from spurious packets by blocking Rx until the
socket filter has been set up and the socket has been bound to the
desired (lo) interface. The effective sequence is
socket(PF_PACKET, SOCK_RAW, 0);
set up ring
Invoke SO_ATTACH_FILTER
bind to sll_protocol set to ETH_P_ALL, sll_ifindex for lo
After this sequence, the only packets that will be passed up are
those received on loopback that pass the attached filter.
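In C, the ordering looks roughly like this (a sketch only; setup_ring() and
prog are placeholders for the existing ring setup and BPF filter code in
psock_tpacket, and error handling is omitted):
int fd = socket(PF_PACKET, SOCK_RAW, 0);	/* protocol 0: nothing queued yet */

setup_ring(fd);					/* PACKET_RX_RING/PACKET_TX_RING */
setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER, &prog, sizeof(prog));

struct sockaddr_ll ll = {
	.sll_family   = AF_PACKET,
	.sll_protocol = htons(ETH_P_ALL),	/* Rx effectively starts here */
	.sll_ifindex  = if_nametoindex("lo"),
};
bind(fd, (struct sockaddr *)&ll, sizeof(ll));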
Signed-off-by: Sowmini Varadhan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add new channel type support for the gravity sensor.
A gravity sensor provides an application-level or physical collection
that identifies a device which measures, exclusively, the force of Earth's
gravity along any number of axes.
More information can be found in:
http://www.usb.org/developers/hidpage/HUTRR59_-_Usages_for_Wearables.pdf
Signed-off-by: Song Hongyan <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
|
|
SYSRET to a noncanonical address will blow up on Intel CPUs. Linux
needs to prevent this from happening in two major cases, and the
criteria will become more complicated when support for larger virtual
address spaces is added.
A fast-path SYSCALL will fall through to the following instruction
using SYSRET without any particular checking. To prevent fall-through
to a noncanonical address, Linux prevents the highest canonical page
from being mapped. This test case checks a variety of possible maximum
addresses to make sure that either we can't map code there or that
SYSCALL fall-through works.
A slow-path system call can return anywhere. Linux needs to make sure
that, if the return address is non-canonical, it won't use SYSRET.
This test case causes sigreturn() to return to a variety of addresses
(with RCX == RIP) and makes sure that nothing explodes.
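The mechanism behind "sigreturn to <addr>" can be sketched as below
(illustrative only, not the exact selftest code; assumes _GNU_SOURCE plus
<signal.h>/<ucontext.h> and a SIGUSR1 handler installed with SA_SIGINFO):
static unsigned long target_addr;

static void sigusr1_handler(int sig, siginfo_t *info, void *ctx_void)
{
	ucontext_t *ctx = ctx_void;

	/* when the handler returns, the implicit sigreturn() restores these
	 * registers, so execution "returns" to target_addr with RCX == RIP */
	ctx->uc_mcontext.gregs[REG_RIP] = target_addr;
	ctx->uc_mcontext.gregs[REG_RCX] = target_addr;
}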
Some of this code comes from Kirill Shutemov.
Kirill reported the following output with 5-level paging enabled:
[RUN] sigreturn to 0x800000000000
[OK] Got SIGSEGV at RIP=0x800000000000
[RUN] sigreturn to 0x1000000000000
[OK] Got SIGSEGV at RIP=0x1000000000000
[RUN] sigreturn to 0x2000000000000
[OK] Got SIGSEGV at RIP=0x2000000000000
[RUN] sigreturn to 0x4000000000000
[OK] Got SIGSEGV at RIP=0x4000000000000
[RUN] sigreturn to 0x8000000000000
[OK] Got SIGSEGV at RIP=0x8000000000000
[RUN] sigreturn to 0x10000000000000
[OK] Got SIGSEGV at RIP=0x10000000000000
[RUN] sigreturn to 0x20000000000000
[OK] Got SIGSEGV at RIP=0x20000000000000
[RUN] sigreturn to 0x40000000000000
[OK] Got SIGSEGV at RIP=0x40000000000000
[RUN] sigreturn to 0x80000000000000
[OK] Got SIGSEGV at RIP=0x80000000000000
[RUN] sigreturn to 0x100000000000000
[OK] Got SIGSEGV at RIP=0x100000000000000
[RUN] sigreturn to 0x200000000000000
[OK] Got SIGSEGV at RIP=0x200000000000000
[RUN] sigreturn to 0x400000000000000
[OK] Got SIGSEGV at RIP=0x400000000000000
[RUN] sigreturn to 0x800000000000000
[OK] Got SIGSEGV at RIP=0x800000000000000
[RUN] sigreturn to 0x1000000000000000
[OK] Got SIGSEGV at RIP=0x1000000000000000
[RUN] sigreturn to 0x2000000000000000
[OK] Got SIGSEGV at RIP=0x2000000000000000
[RUN] sigreturn to 0x4000000000000000
[OK] Got SIGSEGV at RIP=0x4000000000000000
[RUN] sigreturn to 0x8000000000000000
[OK] Got SIGSEGV at RIP=0x8000000000000000
[RUN] Trying a SYSCALL that falls through to 0x7fffffffe000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x7ffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x800000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0xfffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x1000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x1fffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x2000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x3fffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x4000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x7fffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x8000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0xffffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x10000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x1ffffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x20000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x3ffffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x40000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x7ffffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x80000000000000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0xfffffffffff000
[OK] We survived
[RUN] Trying a SYSCALL that falls through to 0x100000000000000
[OK] mremap to 0xfffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0x1fffffffffff000
[OK] mremap to 0x1ffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x200000000000000
[OK] mremap to 0x1fffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0x3fffffffffff000
[OK] mremap to 0x3ffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x400000000000000
[OK] mremap to 0x3fffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0x7fffffffffff000
[OK] mremap to 0x7ffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x800000000000000
[OK] mremap to 0x7fffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0xffffffffffff000
[OK] mremap to 0xfffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x1000000000000000
[OK] mremap to 0xffffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0x1ffffffffffff000
[OK] mremap to 0x1fffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x2000000000000000
[OK] mremap to 0x1ffffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0x3ffffffffffff000
[OK] mremap to 0x3fffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x4000000000000000
[OK] mremap to 0x3ffffffffffff000 failed
[RUN] Trying a SYSCALL that falls through to 0x7ffffffffffff000
[OK] mremap to 0x7fffffffffffe000 failed
[RUN] Trying a SYSCALL that falls through to 0x8000000000000000
[OK] mremap to 0x7ffffffffffff000 failed
Signed-off-by: Andy Lutomirski <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/e70bd9a3f90657ba47b755100a20475d038fa26b.1482808435.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes and one improvement from Arnaldo Carvalho de Melo:
Fixes:
- Fix prev/next_prio formatting for deadline tasks in libtraceevent (Daniel Bristot de Oliveira)
- Robustify reading of build-ids from /sys/kernel/note (Arnaldo Carvalho de Melo)
- Fix building some sample/bpf in Alpine Linux 3.4 (Arnaldo Carvalho de Melo)
- Fix 'make install-bin' to install libtraceevent plugins (Arnaldo Carvalho de Melo)
- Fix 'perf record --switch-output' documentation and comment (Jiri Olsa)
- Fix 'perf probe' for cross arch probing (Masami Hiramatsu)
Improvement:
- Show total scheduling time in 'perf sched timehist' (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix perf-probe to show the probe definition on gcc-generated symbols for
an offline kernel (including a cross-arch kernel image).
gcc sometimes optimizes functions and generates new symbols with suffixes
such as ".constprop.N" or ".isra.N" etc. Since those symbol names are
not recorded in DWARF, we have to find the correct generated symbols in the
offline ELF binary in order to probe on them (kallsyms doesn't correct it).
For an online kernel or uprobes we don't need this because those are rebased
on _text, or a section relative address.
E.g. Without this:
$ perf probe -k build-arm/vmlinux -F __slab_alloc*
__slab_alloc.constprop.9
$ perf probe -k build-arm/vmlinux -D __slab_alloc
p:probe/__slab_alloc __slab_alloc+0
If you put the above definition on the target machine, it will fail
because there is no __slab_alloc in kallsyms.
With this fix, perf probe shows the correct probe definition on
__slab_alloc.constprop.9:
$ perf probe -k build-arm/vmlinux -D __slab_alloc
p:probe/__slab_alloc __slab_alloc.constprop.9+0
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix the --funcs (-F) option to show correct symbols for an offline module.
Since perf-probe previously used machine__findnew_module_map() for offline
modules, even if the user passed a module file (with a full path) built
for another architecture, perf-probe always tried to load the symbol map
for the current kernel's module.
This fix uses dso__new_map() to load the map from the given binary, the
same as is done for the maps of user applications.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Markus reported that perf segfaults when reading /sys/kernel/notes from
a kernel linked with GNU gold, due to what looks like a gold bug, so do
some bounds checking to avoid crashing in that case.
Reported-by: Markus Trippelsdorf <[email protected]>
Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Those are binaries as well, so should be installed by:
'make -C tools/perf install-bin'
too.
Cc: Alexander Shishkin <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently, the sched:sched_switch tracepoint reports deadline tasks with
priority -1. But when reading the trace via perf script I've got the
following output:
# ./d & # (d is a deadline task, see [1])
# perf record -e sched:sched_switch -a sleep 1
# perf script
...
swapper 0 [000] 2146.962441: sched:sched_switch: swapper/0:0 [120] R ==> d:2593 [4294967295]
d 2593 [000] 2146.972472: sched:sched_switch: d:2593 [4294967295] R ==> g:2590 [4294967295]
The task d reports the wrong priority [4294967295]. This happens because
the "int prio" is stored in an unsigned long long val. Although it is
printed with %lld, since int is shorter than unsigned long long,
trace_seq_printf prints it as a (large) positive number.
The fix is just to cast val to an int and print it with %d,
as in the sched:sched_switch tracepoint's "format".
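Roughly, the change amounts to (a sketch of the idea, not the exact
libtraceevent diff):
/* before: val is an unsigned long long, so prio -1 shows up as 4294967295 */
trace_seq_printf(s, "%lld", val);

/* after: cast to int and use %d, as the tracepoint "format" does */
trace_seq_printf(s, "%d", (int)val);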
The output with the fix is:
# ./d &
# perf record -e sched:sched_switch -a sleep 1
# perf script
...
swapper 0 [000] 4306.374037: sched:sched_switch: swapper/0:0 [120] R ==> d:10941 [-1]
d 10941 [000] 4306.383823: sched:sched_switch: d:10941 [-1] R ==> swapper/0:0 [120]
[1] d.c
---
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/types.h>
#include <linux/sched.h>
struct sched_attr {
	__u32 size, sched_policy;
	__u64 sched_flags;
	__s32 sched_nice;
	__u32 sched_priority;
	__u64 sched_runtime, sched_deadline, sched_period;
};

int sched_setattr(pid_t pid, const struct sched_attr *attr, unsigned int flags)
{
	return syscall(__NR_sched_setattr, pid, attr, flags);
}

int main(void)
{
	struct sched_attr attr = {
		.size = sizeof(attr),
		.sched_policy = SCHED_DEADLINE, /* This creates a 10ms/30ms reservation */
		.sched_runtime = 10 * 1000 * 1000,
		.sched_deadline = 30 * 1000 * 1000,
		.sched_period = 30 * 1000 * 1000,
	};

	if (sched_setattr(0, &attr, 0) < 0) {
		perror("sched_setattr");
		return -1;
	}

	for (;;);
}
---
Committer notes:
Got the program from the provided URL, http://bristot.me/lkml/d.c,
trimmed it and included in the cset log above, so that we have
everything needed to test it in one place.
Signed-off-by: Daniel Bristot de Oliveira <[email protected]>
Acked-by: Steven Rostedt <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Daniel Bristot de Oliveira <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/866ef75bcebf670ae91c6a96daa63597ba981f0d.1483443552.git.bristot@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a test case and sample code for (TPACKET_V3, PACKET_TX_RING)
Signed-off-by: Sowmini Varadhan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There's no --signal-trigger option; also add the code comment to the
record man page.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Wang Nan <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There's no need for this one to be global.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Wang Nan <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To allow string options with a default argument and a variable that is
set when the option is used.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Wang Nan <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
ACPICA commit ba665dc8e20d9f7730466a659564dd6c557a6cbc
In Linux, para-virtualization implmentation hooks critical register
writes to prevent real hardware operations. This increases divergences
when the sleep registers are cracked in Linux resident ACPICA.
This patch tries to introduce a single OSL to reduce the divergences.
Link: https://github.com/acpica/acpica/commit/ba665dc8
Signed-off-by: Lv Zheng <[email protected]>
Signed-off-by: Bob Moore <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Since 'perf probe' supports cross-arch probes, it is possible to analyze
a kernel image for a different arch, which may have a different
bits-per-long.
In that case, it fails to get the module name because it uses the
MOD_NAME_OFFSET macro based on the host machine's bits-per-long, instead
of the target arch's bits-per-long.
Fix the above issue by choosing the modname offset based on the target
arch's bit width. This is OK because the Linux kernel uses the LP64 model
on 64-bit arches.
E.g. without this (on x86_64, and target module is arm32):
$ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
p:probe/configfs_lookup :configfs_lookup+0
^-Here is an empty module name.
With this fix, you can see correct module name:
$ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
p:probe/configfs_lookup configfs:configfs_lookup+0
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Show the length of the analyzed sample time and the rate at which the idle
task was running. This also takes the time range given by the --time option
into account.
$ perf sched timehist -sI | tail
Samples do not have callchains.
Idle stats:
CPU 0 idle for 930.316 msec ( 92.93%)
CPU 1 idle for 963.614 msec ( 96.25%)
CPU 2 idle for 885.482 msec ( 88.45%)
CPU 3 idle for 938.635 msec ( 93.76%)
Total number of unique tasks: 118
Total number of context switches: 2337
Total run time (msec): 3718.048
Total scheduling time (msec): 1001.131 (x 4)
Suggested-by: David Ahern <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown.
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: remove obsolete -M, -m, -C, -c options
tools/power turbostat: Make extensible via the --add parameter
tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
tools/power turbostat: line up headers when -M is used
tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
tools/power turbostat: Support Knights Mill (KNM)
tools/power turbostat: Display HWP OOB status
tools/power turbostat: fix Denverton BCLK
tools/power turbostat: use intel-family.h model strings
tools/power/turbostat: Add Denverton RAPL support
tools/power/turbostat: Add Denverton support
tools/power/turbostat: split core MSR support into status + limit
tools/power turbostat: fix error case overflow read of slm_freq_table[]
tools/power turbostat: Allocate correct amount of fd and irq entries
tools/power turbostat: switch to tab delimited output
tools/power turbostat: Gracefully handle ACPI S3
tools/power turbostat: tidy up output on Joule counter overflow
|
|
The new --add option has replaced the -M, -m, -C, -c options
Eg.
-M 0x10 is now --add msr0x10,raw
-m 0x10 is now --add msr0x10,raw,u32
-C 0x10 is now --add msr0x10,delta
-c 0x10 is now --add msr0x10,delta,u32
The --add option can be repeated to add any number of counters,
while the previous options were limited to adding one of each type.
In addition, the --add option can accept a column label,
and can also display a counter as a percentage of elapsed cycles.
Eg. --add msr0x3fe,core,percent,MY_CC3
Signed-off-by: Len Brown <[email protected]>
|
|
Create the "--add" parameter. This can be used to teach an existing
turbostat binary about any number of any type of counter.
turbostat(8) details the syntax for --add.
Signed-off-by: Len Brown <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
plus on the tooling side there's a number of fixes and some late
updates"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf sched timehist: Fix invalid period calculation
perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
perf sched timehist: Enlarge default 'comm_width'
perf sched timehist: Honour 'comm_width' when aligning the headers
perf/x86: Fix overlap counter scheduling bug
perf/x86/pebs: Fix handling of PEBS buffer overflows
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
tools lib bpf: Add bpf_prog_{attach,detach}
samples/bpf: Switch over to libbpf
perf diff: Do not overwrite valid build id
perf annotate: Don't throw error for zero length symbols
perf bench futex: Fix lock-pi help string
perf trace: Check if MAP_32BIT is defined (again)
samples/bpf: Make perf_event_read() static
uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
samples/bpf: Make samples more libbpf-centric
tools lib bpf: Add flags to bpf_create_map()
tools lib bpf: use __u32 from linux/types.h
...
|
|
When --time option is given with a value outside recorded time, the last
sample time (tprev) was set to that value and run time calculation might
be incorrect. This is a problem of the first samples for each cpus
since it would skip the runtime update when tprev is 0. But with --time
option it had non-zero (which is invalid) value so the calculation is
also incorrect.
For example, let's see the followging:
$ perf sched timehist
           time    cpu task name                      wait time sch delay  run time
                       [tid/pid]                         (msec)    (msec)    (msec)
--------------- ------ ------------------------------ --------- --------- ---------
    3195.968367 [0003] <idle>                             0.000     0.000     0.000
    3195.968386 [0002] Timer[4306/4277]                   0.000     0.000     0.018
    3195.968397 [0002] Web Content[4277]                  0.000     0.000     0.000
    3195.968595 [0001] JS Helper[4302/4277]               0.000     0.000     0.000
    3195.969217 [0000] <idle>                             0.000     0.000     0.621
    3195.969251 [0001] kworker/1:1H[291]                  0.000     0.000     0.033
The samples start at 3195.968367, but when I gave a time interval from
3194 to 3196 (in seconds), it calculated the whole 2 seconds as runtime.
Below, 2 CPUs accounted it as runtime while the other 2 CPUs accounted it
as idle time.
Before:
$ perf sched timehist --time 3194,3196 -s | tail
Idle stats:
CPU 0 idle for 1995.991 msec
CPU 1 idle for 20.793 msec
CPU 2 idle for 30.191 msec
CPU 3 idle for 1999.852 msec
Total number of unique tasks: 23
Total number of context switches: 128
Total run time (msec): 3724.940
After:
$ perf sched timehist --time 3194,3196 -s | tail
Idle stats:
CPU 0 idle for 10.811 msec
CPU 1 idle for 20.793 msec
CPU 2 idle for 30.191 msec
CPU 3 idle for 18.337 msec
Total number of unique tasks: 23
Total number of context switches: 128
Total run time (msec): 18.139
Committer notes:
Further testing:
Before:
Idle stats:
CPU 0 idle for 229.785 msec
CPU 1 idle for 937.944 msec
CPU 2 idle for 188.931 msec
CPU 3 idle for 986.185 msec
After:
# perf sched timehist --time 40602,40603 -s | tail
Idle stats:
CPU 0 idle for 229.785 msec
CPU 1 idle for 175.407 msec
CPU 2 idle for 188.931 msec
CPU 3 idle for 223.657 msec
Total number of unique tasks: 68
Total number of context switches: 814
Total run time (msec): 97.688
# for cpu in `seq 0 3` ; do echo -n "CPU $cpu idle for " ; perf sched timehist --time 40602,40603 | grep "\[000${cpu}\].*\<idle\>" | tr -s ' ' | cut -d' ' -f7 | awk '{entries++ ; s+=$1} END {print s " msec (entries: " entries ")"}' ; done
CPU 0 idle for 229.721 msec (entries: 123)
CPU 1 idle for 175.381 msec (entries: 65)
CPU 2 idle for 188.903 msec (entries: 56)
CPU 3 idle for 223.61 msec (entries: 102)
The difference is due to the idle stats being accounted at nanosecond
precision, while the <idle> entries in 'perf sched timehist' are truncated
at msec.usec precision.
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: 853b74071110 ("perf sched timehist: Add option to specify time window of interest")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that the default 'comm_width' value is 30, there is no need to check
for that at print_summary.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|