blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2015-06-07	perf/x86/intel: Introduce PERF_RECORD_LOST_SAMPLES	Kan Liang	1	-0/+12
	After enlarging the PEBS interrupt threshold, there may be some mixed up PEBS samples which are discarded by the kernel. This patch makes the kernel emit a PERF_RECORD_LOST_SAMPLES record with the number of possible discarded records when it is impossible to demux the samples. It makes sure the user is not left in the dark about such discards. Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-06-07	perf: add new PERF_SAMPLE_BRANCH_IND_JUMP branch sample type	Stephane Eranian	1	-0/+2
	This patch adds a new branch_sample_type flag to enable filtering branch sampling to indirect jumps. The support is subject to hardware or kernel software support on each architecture. Filtering on indirect jump is useful to study the targets of the jump. Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	perf: Add ITRACE_START record to indicate that tracing has started	Alexander Shishkin	1	-0/+11
	For counters that generate AUX data that is bound to the context of a running task, such as instruction tracing, the decoder needs to know exactly which task is running when the event is first scheduled in, before the first sched_switch. The decoder's need to know this stems from the fact that instruction flow trace decoding will almost always require program's object code in order to reconstruct said flow and for that we need at least its pid/tid in the perf stream. To single out such instruction tracing pmus, this patch introduces ITRACE PMU capability. The reason this is not part of RECORD_AUX record is that not all pmus capable of generating AUX data need this, and the opposite is probably also true. While sched_switch covers for most cases, there are two problems with it: the consumer will need to process events out of order (that is, having found RECORD_AUX, it will have to skip forward to the nearest sched_switch to figure out which task it was, then go back to the actual trace to decode it) and it completely misses the case when the tracing is enabled and disabled before sched_switch, for example, via PERF_EVENT_IOC_DISABLE. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kaixu Xia <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1421237903-181015-15-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	perf: Add wakeup watermark control to the AUX area	Alexander Shishkin	1	-0/+7
	When AUX area gets a certain amount of new data, we want to wake up userspace to collect it. This adds a new control to specify how much data will cause a wakeup. This is then passed down to pmu drivers via output handle's "wakeup" field, so that the driver can find the nearest point where it can generate an interrupt. We repurpose __reserved_2 in the event attribute for this, even though it was never checked to be zero before, aux_watermark will only matter for new AUX-aware code, so the old code should still be fine. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kaixu Xia <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1421237903-181015-10-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	perf: Support overwrite mode for the AUX area	Alexander Shishkin	1	-0/+1
	This adds support for overwrite mode in the AUX area, which means "keep collecting data till you're stopped", turning AUX area into a circular buffer, where new data overwrites old data. It does not depend on data buffer's overwrite mode, so that it doesn't lose sideband data that is instrumental for processing AUX data. Overwrite mode is enabled at mapping AUX area read only. Even though aux_tail in the buffer's user page might be user writable, it will be ignored in this mode. A PERF_RECORD_AUX with PERF_AUX_FLAG_OVERWRITE set is written to the perf data stream every time an event writes new data to the AUX area. The pmu driver might not be able to infer the exact beginning of the new data in each snapshot, some drivers will only provide the tail, which is aux_offset + aux_size in the AUX record. Consumer has to be able to tell the new data from the old one, for example, by means of time stamps if such are provided in the trace. Consumer is also responsible for disabling any events that might write to the AUX area (thus potentially racing with the consumer) before collecting the data. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kaixu Xia <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1421237903-181015-9-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	perf: Add AUX record	Alexander Shishkin	1	-0/+19
	When there's new data in the AUX space, output a record indicating its offset and size and a set of flags, such as PERF_AUX_FLAG_TRUNCATED, to mean the described data was truncated to fit in the ring buffer. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kaixu Xia <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1421237903-181015-7-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	perf: Add AUX area to ring buffer for raw data streams	Peter Zijlstra	1	-0/+16
	This patch introduces "AUX space" in the perf mmap buffer, intended for exporting high bandwidth data streams to userspace, such as instruction flow traces. AUX space is a ring buffer, defined by aux_{offset,size} fields in the user_page structure, and read/write pointers aux_{head,tail}, which abide by the same rules as data_* counterparts of the main perf buffer. In order to allocate/mmap AUX, userspace needs to set up aux_offset to such an offset that will be greater than data_offset+data_size and aux_size to be the desired buffer size. Both need to be page aligned. Then, same aux_offset and aux_size should be passed to mmap() call and if everything adds up, you should have an AUX buffer as a result. Pages that are mapped into this buffer also come out of user's mlock rlimit plus perf_event_mlock_kb allowance. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Alexander Shishkin <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kaixu Xia <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1421237903-181015-3-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	perf: Add data_{offset,size} to user_page	Alexander Shishkin	1	-0/+5
	Currently, the actual perf ring buffer is one page into the mmap area, following the user page and the userspace follows this convention. This patch adds data_{offset,size} fields to user_page that can be used by userspace instead for locating perf data in the mmap area. This is also helpful when mapping existing or shared buffers if their size is not known in advance. Right now, it is made to follow the existing convention that data_offset == PAGE_SIZE and data_offset + data_size == mmap_size. Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kaixu Xia <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/1421237903-181015-2-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-04-02	tracing, perf: Implement BPF programs attached to kprobes	Alexei Starovoitov	1	-0/+1
	BPF programs, attached to kprobes, provide a safe way to execute user-defined BPF byte-code programs without being able to crash or hang the kernel in any way. The BPF engine makes sure that such programs have a finite execution time and that they cannot break out of their sandbox. The user interface is to attach to a kprobe via the perf syscall: struct perf_event_attr attr = { .type = PERF_TYPE_TRACEPOINT, .config = event_id, ... }; event_fd = perf_event_open(&attr,...); ioctl(event_fd, PERF_EVENT_IOC_SET_BPF, prog_fd); 'prog_fd' is a file descriptor associated with BPF program previously loaded. 'event_id' is an ID of the kprobe created. Closing 'event_fd': close(event_fd); ... automatically detaches BPF program from it. BPF programs can call in-kernel helper functions to: - lookup/update/delete elements in maps - probe_read - wraper of probe_kernel_read() used to access any kernel data structures BPF programs receive 'struct pt_regs *' as an input ('struct pt_regs' is architecture dependent) and return 0 to ignore the event and 1 to store kprobe event into the ring buffer. Note, kprobes are a fundamentally _not_ a stable kernel ABI, so BPF programs attached to kprobes must be recompiled for every kernel version and user must supply correct LINUX_VERSION_CODE in attr.kern_version during bpf_prog_load() call. Signed-off-by: Alexei Starovoitov <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-03-27	perf: Add per event clockid support	Peter Zijlstra	1	-3/+3
	While thinking on the whole clock discussion it occurred to me we have two distinct uses of time: 1) the tracking of event/ctx/cgroup enabled/running/stopped times which includes the self-monitoring support in struct perf_event_mmap_page. 2) the actual timestamps visible in the data records. And we've been conflating them. The first is all about tracking time deltas, nobody should really care in what time base that happens, its all relative information, as long as its internally consistent it works. The second however is what people are worried about when having to merge their data with external sources. And here we have the discussion on MONOTONIC vs MONOTONIC_RAW etc.. Where MONOTONIC is good for correlating between machines (static offset), MONOTNIC_RAW is required for correlating against a fixed rate hardware clock. This means configurability; now 1) makes that hard because it needs to be internally consistent across groups of unrelated events; which is why we had to have a global perf_clock(). However, for 2) it doesn't really matter, perf itself doesn't care what it writes into the buffer. The below patch makes the distinction between these two cases by adding perf_event_clock() which is used for the second case. It further makes this configurable on a per-event basis, but adds a few sanity checks such that we cannot combine events with different clocks in confusing ways. And since we then have per-event configurability we might as well retain the 'legacy' behaviour as a default. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Stultz <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-02-18	perf/x86/intel: Expose LBR callstack to user space tooling	Peter Zijlstra	1	-8/+8
	With LBR call stack feature enable, there are three callchain options. Enable the 3rd callchain option (LBR callstack) to user space tooling. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Kan Liang <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-02-18	perf/x86/intel: Reduce lbr_sel_map[] size	Yan, Zheng	1	-14/+35
	The index of lbr_sel_map is bit value of perf branch_sample_type. PERF_SAMPLE_BRANCH_MAX is 1024 at present, so each lbr_sel_map uses 4096 bytes. By using bit shift as index, we can reduce lbr_sel_map size to 40 bytes. This patch defines 'bit shift' for branch types, and use 'bit shift' to define lbr_sel_maps. Signed-off-by: Yan, Zheng <[email protected]> Signed-off-by: Kan Liang <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Stephane Eranian <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-11-16	perf: Add ability to sample machine state on interrupt	Stephane Eranian	1	-1/+14
	Enable capture of interrupted machine state for each sample. Registers to sample are passed per event in the sample_regs_intr bitmask. To sample interrupt machine state, the PERF_SAMPLE_INTR_REGS must be passed in sample_type. The list of available registers is arch dependent and provided by asm/perf_regs.h Registers are laid out as u64 in the order of the bit order of sample_intr_regs. This patch also adds a new ABI version PERF_ATTR_SIZE_VER4 because we extend the perf_event_attr struct with a new u64 field. Reviewed-by: Jiri Olsa <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-10-28	perf: Fix typos in sample code in the perf_event.h header	Andy Lutomirski	1	-7/+7
	struct perf_event_mmap_page has members called "index" and "cap_user_rdpmc". Spell them correctly in the examples. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/320ba26391a8123cc16e5f02d24d34bd404332fd.1412313343.git.luto@amacapital.net Signed-off-by: Ingo Molnar <[email protected]>
2014-06-09	perf: Pass protection and flags bits through mmap2 interface	Peter Zijlstra	1	-0/+1
	The mmap2 interface was missing the protection and flags bits needed to accurately determine if a mmap memory area was shared or private and if it was readable or not. Signed-off-by: Peter Zijlstra <[email protected]> [tweaked patch to compile and wrote changelog] Signed-off-by: Don Zickus <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2014-06-06	perf: Differentiate exec() and non-exec() comm events	Adrian Hunter	1	-2/+7
	perf tools like 'perf report' can aggregate samples by comm strings, which generally works. However, there are other potential use-cases. For example, to pair up 'calls' with 'returns' accurately (from branch events like Intel BTS) it is necessary to identify whether the process has exec'd. Although a comm event is generated when an 'exec' happens it is also generated whenever the comm string is changed on a whim (e.g. by prctl PR_SET_NAME). This patch adds a flag to the comm event to differentiate one case from the other. In order to determine whether the kernel supports the new flag, a selection bit named 'exec' is added to struct perf_event_attr. The bit does nothing but will cause perf_event_open() to fail if the bit is set on kernels that do not have it defined. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: Paul Mackerras <[email protected]> Cc: Dave Jones <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-06-05	perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'	Anshuman Khandual	1	-1/+2
	This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which will extend the existing perf ABI. This will filter branches which are conditional. Various architectures can provide this functionality either with HW filtering support (if present) or with SW filtering of captured branch instructions. Signed-off-by: Anshuman Khandual <[email protected]> Reviewed-by: Stephane Eranian <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-05-19	perf: Fix perf_event_open(.flags) test	Peter Zijlstra	1	-4/+4
	Vince noticed that we test the (unsigned long) flags field against an (unsigned int) constant. This would allow setting the high bits on 64bit platforms and not get an error. There is nothing that uses the high bits, so it should be entirely harmless, but we don't want userspace to accidentally set them anyway, so fix the constants. Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Reported-by: Vince Weaver <[email protected]> Tested-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2014-01-23	uapi: convert u64 to __u64 in exported headers	Mike Frysinger	1	-1/+1
	The u64 type is not defined in any exported kernel headers, so trying to use it will lead to build failures. Signed-off-by: Mike Frysinger <[email protected]> Acked-by: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-16	Merge branch 'perf/urgent' into perf/core	Ingo Molnar	1	-0/+1
	Pick up the latest fixes, refresh the development tree. Signed-off-by: Ingo Molnar <[email protected]>
2014-01-12	perf: Introduce a flag to enable close-on-exec in perf_event_open()	Yann Droneaud	1	-0/+1
	Unlike recent modern userspace API such as: epoll_create1 (EPOLL_CLOEXEC), eventfd (EFD_CLOEXEC), fanotify_init (FAN_CLOEXEC), inotify_init1 (IN_CLOEXEC), signalfd (SFD_CLOEXEC), timerfd_create (TFD_CLOEXEC), or the venerable general purpose open (O_CLOEXEC), perf_event_open() syscall lack a flag to atomically set FD_CLOEXEC (eg. close-on-exec) flag on file descriptor it returns to userspace. The present patch adds a PERF_FLAG_FD_CLOEXEC flag to allow perf_event_open() syscall to atomically set close-on-exec. Having this flag will enable userspace to remove the file descriptor from the list of file descriptors being inherited across exec, without the need to call fcntl(fd, F_SETFD, FD_CLOEXEC) and the associated race condition between the current thread and another thread calling fork(2) then execve(2). Links: - Secure File Descriptor Handling (Ulrich Drepper, 2008) http://udrepper.livejournal.com/20407.html - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012) http://danwalsh.livejournal.com/53603.html - Notes in DMA buffer sharing: leak and security hole http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/dma-buf-sharing.txt?id=v3.13-rc3#n428 Signed-off-by: Yann Droneaud <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/8c03f54e1598b1727c19706f3af03f98685d9fe6.1388952061.git.ydroneaud@opteya.com Signed-off-by: Ingo Molnar <[email protected]>
2013-12-17	perf: Document the new transaction sample type	Vince Weaver	1	-0/+1
	Commit fdfbbd07e91f8fe3871 ("perf: Add generic transaction flags") added support for PERF_SAMPLE_TRANSACTION but forgot to add documentation for the sample type to include/uapi/linux/perf_event.h Signed-off-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Andi Kleen <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-04	Merge branch 'perf/urgent' into perf/core to fix conflicts	Ingo Molnar	1	-5/+7
	Conflicts: tools/perf/bench/numa.c Signed-off-by: Ingo Molnar <[email protected]>
2013-10-29	perf: Fix perf ring buffer memory ordering	Peter Zijlstra	1	-5/+7
	The PPC64 people noticed a missing memory barrier and crufty old comments in the perf ring buffer code. So update all the comments and add the missing barrier. When the architecture implements local_t using atomic_long_t there will be double barriers issued; but short of introducing more conditional barrier primitives this is the best we can do. Reported-by: Victor Kaplansky <[email protected]> Tested-by: Victor Kaplansky <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: [email protected] Cc: Paul McKenney <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-04	perf: Add generic transaction flags	Andi Kleen	1	-1/+24
	Add a generic qualifier for transaction events, as a new sample type that returns a flag word. This is particularly useful for qualifying aborts: to distinguish aborts which happen due to asynchronous events (like conflicts caused by another CPU) versus instructions that lead to an abort. The tuning strategies are very different for those cases, so it's important to distinguish them easily and early. Since it's inconvenient and inflexible to filter for this in the kernel we report all the events out and allow some post processing in user space. The flags are based on the Intel TSX events, but should be fairly generic and mostly applicable to other HTM architectures too. In addition to various flag words there's also reserved space to report an program supplied abort code. For TSX this is used to distinguish specific classes of aborts, like a lock busy abort when doing lock elision. Flags: Elision and generic transactions (ELISION vs TRANSACTION) (HLE vs RTM on TSX; IBM etc. would likely only use TRANSACTION) Aborts caused by current thread vs aborts caused by others (SYNC vs ASYNC) Retryable transaction (RETRY) Conflicts with other threads (CONFLICT) Transaction write capacity overflow (CAPACITY WRITE) Transaction read capacity overflow (CAPACITY READ) Transactions implicitely aborted can also return an abort code. This can be used to signal specific events to the profiler. A common case is abort on lock busy in a RTM eliding library (code 0xff) To handle this case we include the TSX abort code Common example aborts in TSX would be: - Data conflict with another thread on memory read. Flags: TRANSACTION\|ASYNC\|CONFLICT - executing a WRMSR in a transaction. Flags: TRANSACTION\|SYNC - HLE transaction in user space is too large Flags: ELISION\|SYNC\|CAPACITY-WRITE The only flag that is somewhat TSX specific is ELISION. This adds the perf core glue needed for reporting the new flag word out. v2: Add MEM/MISC v3: Move transaction to the end v4: Separate capacity-read/write and remove misc v5: Remove _SAMPLE. Move abort flags to 32bit. Rename transaction to txn Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-09-20	perf: Fix capabilities bitfield compatibility in 'struct perf_event_mmap_page'	Peter Zijlstra	1	-5/+9
	Solve the problems around the broken definition of perf_event_mmap_page:: cap_usr_time and cap_usr_rdpmc fields which used to overlap, partially fixed by: 860f085b74e9 ("perf: Fix broken union in 'struct perf_event_mmap_page'") The problem with the fix (merged in v3.12-rc1 and not yet released officially), noticed by Vince Weaver is that the new behavior is not detectable by new user-space, and that due to the reuse of the field names it's easy to mis-compile a binary if old headers are used on a new kernel or new headers are used on an old kernel. To solve all that make this change explicit, detectable and self-contained, by iterating the ABI the following way: - Always clear bit 0, and rename it to usrpage->cap_bit0, to at least not confuse old user-space binaries. RDPMC will be marked as unavailable to old binaries but that's within the ABI, this is a capability bit. - Rename bit 1 to ->cap_bit0_is_deprecated and always set it to 1, so new libraries can reliably detect that bit 0 is deprecated and perma-zero without having to check the kernel version. - Use bits 2, 3, 4 for the newly defined, correct functionality: cap_user_rdpmc : 1, /* The RDPMC instruction can be used to read counts / cap_user_time : 1, / The time_* fields are used / cap_user_time_zero : 1, / The time_zero field is used */ - Rename all the bitfield names in perf_event.h to be different from the old names, to make sure it's not possible to mis-compile it accidentally with old assumptions. The 'size' field can then be used in the future to add new fields and it will act as a natural ABI version indicator as well. Also adjust tools/perf/ userspace for the new definitions, noticed by Adrian Hunter. Reported-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Also-Fixed-by: Adrian Hunter <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-09-20	perf: Update ABI comment	Peter Zijlstra	1	-0/+1
	For some mysterious reason the sample_id field of PERF_RECORD_MMAP went AWOL. Reported-by: Vince Weaver <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2013-09-18	perf: Fix UAPI export of PERF_EVENT_IOC_ID	Vince Weaver	1	-1/+1
	Without the following patch I have problems compiling code using the new PERF_EVENT_IOC_ID ioctl(). It looks like u64 was used instead of __u64 Signed-off-by: Vince Weaver <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1309171450380.11444@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <[email protected]>
2013-09-02	perf: Add a dummy software event to keep tracking	Adrian Hunter	1	-0/+1
	When an event is disabled the "tracking" events selected by the 'mmap', 'comm' and 'task' bits of struct perf_event_attr, are also disabled. However, the information those events provide is necessary to resolve symbols for when the main event is re-enabled. The "tracking" events can be kept enabled by putting them on another event, but that requires an event that otherwise does nothing. A new software event PERF_COUNT_SW_DUMMY is added for that purpose. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-09-02	perf: Export struct perf_branch_entry to userspace	Vince Weaver	1	-0/+24
	If PERF_SAMPLE_BRANCH_STACK is enabled then samples are returned with the format { u64 from, to, flags } but the flags layout is not specified. This field has the type struct perf_branch_entry; move this definition into include/uapi/linux/perf_event.h so users can access these fields. This is similar to the existing inclusion of perf_mem_data_src in the include/uapi/linux/perf_event.h file. Signed-off-by: Vince Weaver <[email protected]> Acked-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-09-02	perf: Add attr->mmap2 attribute to an event	Stephane Eranian	1	-1/+23
	Adds a new PERF_RECORD_MMAP2 record type which is essence an expanded version of PERF_RECORD_MMAP. Used to request mmap records with more information about the mapping, including device major, minor and the inode number and generation for mappings associated with files or shared memory segments. Works for code and data (with attr->mmap_data set). Existing PERF_RECORD_MMAP record is unmodified by this patch. Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Added Al to the Cc:. Are the ino, maj/min exports of vma->vm_file OK? ] Signed-off-by: Ingo Molnar <[email protected]>
2013-08-29	perf: make events stream always parsable	Adrian Hunter	1	-7/+20
	The event stream is not always parsable because the format of a sample is dependent on the sample_type of the selected event. When there is more than one selected event and the sample_types are not the same then parsing becomes problematic. A sample can be matched to its selected event using the ID that is allocated when the event is opened. Unfortunately, to get the ID from the sample means first parsing it. This patch adds a new sample format bit PERF_SAMPLE_IDENTIFER that puts the ID at a fixed position so that the ID can be retrieved without parsing the sample. For sample events, that is the first position immediately after the header. For non-sample events, that is the last position. In this respect parsing samples requires that the sample_type and ID values are recorded. For example, perf tools records struct perf_event_attr and the IDs within the perf.data file. Those must be read first before it is possible to parse samples found later in the perf.data file. Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Stephane Eranian <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-08-07	perf: Add PERF_EVENT_IOC_ID ioctl to return event ID	Jiri Olsa	1	-0/+1
	The only way to get the event ID is by reading the event fd, followed by parsing the ID value out of the returned data. While this is ok for current read format used by perf tool, it is not ok when we use PERF_FORMAT_GROUP format. With this format the data are returned for the whole group and there's no way to find out what ID belongs to our fd (if we are not group leader event). Adding a simple ioctl that returns event primary ID for given fd. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Corey Ashford <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-07-23	perf/x86: Add ability to calculate TSC from perf sample timestamps	Adrian Hunter	1	-2/+20
	For modern CPUs, perf clock is directly related to TSC. TSC can be calculated from perf clock and vice versa using a simple calculation. Two of the three componenets of that calculation are already exported in struct perf_event_mmap_page. This patch exports the third. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-07-23	perf: Fix broken union in 'struct perf_event_mmap_page'	Adrian Hunter	1	-3/+5
	The capabilities bits must not be "union'ed" together. Put them in a separate struct. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-07-23	perf: Update perf_event_type documentation	Peter Zijlstra	1	-1/+17
	Due to a discussion with Adrian I had a good look at the perf_event_type record layout and found the documentation to be somewhat unclear. Cc: Adrian Hunter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-06-19	perf/x86/intel: Support Haswell/v4 LBR format	Andi Kleen	1	-1/+4
	Haswell has two additional LBR from flags for TSX: in_tx and abort_tx, implemented as a new "v4" version of the LBR format. Handle those in and adjust the sign extension code to still correctly extend. The flags are exported similarly in the LBR record to the existing misprediction flag Signed-off-by: Andi Kleen <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-04-08	perf: Fix comments in PERF_MEM_LVL bitmask	Stephane Eranian	1	-2/+2
	This small patch fixes a mistake in the comments for the PERF_MEM_LVL_* events. The L2, L3 bits simply represent cache levels, not hits or misses. That is encoded in PERF_MEM_LVL_MISS/PERF_MEM_LVL_HIT. Signed-off-by: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/20130405144941.GA30503@quad Signed-off-by: Ingo Molnar <[email protected]>
2013-04-01	perf: Add PERF_RECORD_MISC_MMAP_DATA to RECORD_MMAP	Stephane Eranian	1	-0/+1
	Type of mapping was lost and made it hard for a tool to distinguish code vs. data mmaps. Perf has the ability to distinguish the two. Use a bit in the header->misc bitmask to keep track of the mmap type. If PERF_RECORD_MISC_MMAP_DATA is set then the mapping is not executable (!VM_EXEC). If not set, then the mapping is executable. Signed-off-by: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-04-01	perf: Add generic memory sampling interface	Stephane Eranian	1	-2/+66
	This patch adds PERF_SAMPLE_DATA_SRC. PERF_SAMPLE_DATA_SRC collects the data source, i.e., where did the data associated with the sampled instruction come from. Information is stored in a perf_mem_data_src structure. It contains opcode, mem level, tlb, snoop, lock information, subject to availability in hardware. Signed-off-by: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-04-01	perf/core: Add weighted samples	Andi Kleen	1	-1/+5
	For some events it's useful to weight sample with a hardware provided number. This expresses how expensive the action the sample represent was. This allows the profiler to scale the samples to be more informative to the programmer. There is already the period which is used similarly, but it means something different, so I chose to not overload it. Instead a new sample type for WEIGHT is added. Can be used for multiple things. Initially it is used for TSX abort costs and profiling by memory latencies (so to make expensive load appear higher up in the histograms). The concept is quite generic and can be extended to many other kinds of events or architectures, as long as the hardware provides suitable auxillary values. In principle it could be also used for software tracepoints. This adds the generic glue. A new optional sample format for a 64-bit weight value. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-01-24	perf: Missing field in PERF_RECORD_SAMPLE documentation	Vince Weaver	1	-1/+2
	While trying to write a perf_event/mmap test for my perf_event test-suite I came across a missing field description in the PERF_RECORD_SAMPLE documentation in perf_event.h Signed-off-by: Vince Weaver <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1301081439300.24507@vincent-weaver-1.um.maine.edu Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-10-13	UAPI: (Scripted) Disintegrate include/linux	David Howells	1	-0/+615
	Signed-off-by: David Howells <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Michael Kerrisk <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Acked-by: Dave Jones <[email protected]>