blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2010-05-14	perf report: Report number of events, not samples	Arnaldo Carvalho de Melo	6	-50/+82
	Number of samples is meaningless after we switched to auto-freq, so report the number of events, i.e. not the sum of the different periods, but the number PERF_RECORD_SAMPLE emitted by the kernel. While doing this I noticed that naming "count" to the sum of all the event periods can be confusing, so rename it to .period, just like in struct sample.data, so that we become more consistent. This helps with the next step, that was to record in struct hist_entry the number of sample events for each instance, we need that because we use it to generate the number of events when applying filters to the tree of hist entries like it is being done in the TUI report browser. Suggested-by: Ingo Molnar <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-14	perf hist: Clarify events_stats fields usage	Arnaldo Carvalho de Melo	5	-17/+29
	The events_stats.total field is too generic, rename it to .total_period, and also add a comment explaining that it is the sum of all the .period fields in samples, that is needed because we use auto-freq to avoid sampling artifacts. Ditto for events_stats.lost, that is the sum of all lost_event.lost fields, i.e. the number of events the kernel dropped. Looking at the users, builtin-sched.c can make use of these fields and stop doing it again. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-14	perf hist: Make event__totals per hists	Arnaldo Carvalho de Melo	6	-36/+54
	This is one more thing that started global but are more useful per hist or per session. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-13	perf hist: Fix missing getline declaration	Frederic Weisbecker	1	-0/+1
	hist.c needs to include util.h so that it gets stdio.h inclusion with __GNU_SOURCE defined. Fixes: util/hist.c: In function ‘hist_entry__parse_objdump_line’: util/hist.c:931: erreur: implicit declaration of function ‘getline’ util/hist.c:931: erreur: nested extern declaration of ‘getline’ Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-13	perf hist: Fix hists__browse no-newt case	Frederic Weisbecker	1	-1/+1
	Fix mistake in a parameter type of the no-newt hists__browse() version. Fixes: builtin-report.c: In function ‘__cmd_report’: builtin-report.c:314: erreur: incompatible type for argument 1 of ‘hists__browse’ Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-11	perf report: Librarize the annotation code and use it in the newt browser	Arnaldo Carvalho de Melo	4	-52/+519
	Now we don't anymore use popen to run 'perf annotate' for the selected symbol, instead we collect per address samplings when processing samples in 'perf report' if we're using the newt browser, then we use this data directly to do annotation. Done this way we can actually traverse the objdump_line objects directly, matching the addresses to the collected samples and colouring them appropriately using lower level slang routines. The new ui_browser class will be reused for the main, callchain aware, histogram browser, when it will be made generic and don't assume that the objects are always instances of the objdump_line class maintained using list_heads. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-11	perf ui: Add ui_helpline methods	Arnaldo Carvalho de Melo	1	-19/+50
	Initially this was just to be able to have a printf like method to prepare the formatted string and then pass to newtPushHelpLine, but as we already have for ui_progress, etc, its a step in identifying a restricted, highlevel set of widgets we can then have implementations for multiple widget sets (GTK, etc). Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-11	perf hist: Adopt filter by dso and by thread methods from the newt browser	Arnaldo Carvalho de Melo	4	-81/+88
	Those are really not specific to the newt code, can be used by other UI frontends. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-11	perf: Fix static strings treated like dynamic ones	Frederic Weisbecker	1	-1/+1
	The raw_field_ptr() helper, used to retrieve the address of a field inside a trace event, treats every strings as if they were dynamic ie: having a secondary level of indirection to retrieve their contents. FIELD_IS_STRING doesn't mean FIELD_IS_DYNAMIC, we only need to compute the secondary dereference for the latter case. This fixes perf sched segfaults, bad cmdline report and may be some other bugs. Reported-by: Jason Baron <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Tom Zanussi <[email protected]>
2010-05-10	perf/trace/scripting: don't show script start/stop messages by default	Tom Zanussi	2	-7/+0
	Only print the script start/stop messages in verbose mode - users normally don't care and it just clutters up the output. Cc: Frédéric Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-10	perf hist: Calculate max_sym name len and nr_entries	Arnaldo Carvalho de Melo	4	-10/+27
	Better done when we are adding entries, be it initially of when we're re-sorting the histograms. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-10	perf hist: Introduce hists class and move lots of methods to it	Arnaldo Carvalho de Melo	6	-91/+79
	In cbbc79a we introduced support for multiple events by introducing a new "event_stat_id" struct and then made several perf_session methods receive a point to it instead of a pointer to perf_session, and kept the event_stats and hists rb_tree in perf_session. While working on the new newt based browser, I realised that it would be better to introduce a new class, "hists" (short for "histograms"), renaming the "event_stat_id" struct and the perf_session methods that were really "hists" methods, as they manipulate only struct hists members, not touching anything in the other perf_session members. Other optimizations, such as calculating the maximum lenght of a symbol name present in an hists instance will be possible as we add them, avoiding a re-traversal just for finding that information. The rationale for the name "hists" to replace "event_stat_id" is that we may have multiple sets of hists for the same event_stat id, as, for instance, the 'perf diff' tool has, so event stat id is not what characterizes what this struct and the functions that manipulate it do. Cc: Eric B Munson <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-10	perf session: create_kernel_maps should use ->host_machine	Arnaldo Carvalho de Melo	1	-3/+2
	Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create another machine instance for the host machine, and since 1f626bc we have it out of the machines rb_tree. Fix it by using machine__create_kernel_maps(&self->host_machine) directly. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-10	perf callchains: Use zalloc to allocate objects	Arnaldo Carvalho de Melo	1	-3/+3
	Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-10	perf newt: Use newtAddComponent()	Arnaldo Carvalho de Melo	1	-3/+3
	Instead of newtAddComponents(just-one-entry, NULL), that is not needed if, like in this browser, we're adding just one component at a time. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-10	Merge branch 'perf/test' of ↵	Ingo Molnar	3	-54/+110
	git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
2010-05-09	perf report: Allow limiting the number of entries to print in callchains	Arnaldo Carvalho de Melo	2	-0/+11
	Works by adding a third parameter to the '-g' argument, after the graph type and minimum percentage, for example: [root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2 Will show only the first two symbols where at least 0.5% of the samples took place. All the other symbols that don't fall outside these constraints will be put together in the last entry, prefixed with "[...]" and the total percentage for them. Suggested-by: Arjan van de Ven <[email protected]> Acked-by: Arjan van de Ven <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-09	perf session: Embed the host machine data on perf_session	Arnaldo Carvalho de Melo	6	-31/+29
	We have just one host on a given session, and that is the most common setup right now, so embed a ->host_machine struct machine instance directly in the perf_session class, check if we're looking for it before going to the rb_tree. This also fixes a problem found when we try to process old perf.data files where we didn't have MMAP events for the kernel and modules and thus don't create the kernel maps, do it in event__preprocess_sample if it wasn't already. Reported-by: Ingo Molnar <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Zhang, Yanmin <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-09	perf symbols: Check if a struct machine instance was found	Arnaldo Carvalho de Melo	1	-2/+8
	Which can happen when processing old files that had no fake kernel MMAP, events. That shouldn't result in perf_session__create_kernel_maps not being called, this will be fixed in a followup patch, for now do these checks to avoid segfaulting. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-09	perf symbols: Consider unresolved DSOs in the dso__col_widt calculation	Arnaldo Carvalho de Melo	1	-0/+7
	By using BITS_PER_LONG / 4, that is the number of chars that will be used in such cases as the DSO "name". Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-09	perf hist: Simplify the insertion of new hist_entry instances	Arnaldo Carvalho de Melo	2	-17/+24
	And with that fix at least one bug: The first hit for an entry, the one that calls malloc to create a new instance in __perf_session__add_hist_entry, wasn't adding the count to the per cpumode (PERF_RECORD_MISC_USER, etc) total variable. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-09	perf callchain: Move validate_callchain to callchain lib	Arnaldo Carvalho de Melo	2	-0/+10
	Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-09	perf/live-mode: Handle payload-less events	Tom Zanussi	1	-8/+11
	Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of only an event header and no data. In this case, a 0-length payload will be read, and the 0 return value will be wrongly interpreted as an 'unexpected end of event stream'. This patch allows for proper handling of data-less events by skipping 0-length reads. Signed-off-by: Tom Zanussi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Masami Hiramatsu <[email protected]> LKML-Reference: <1273038527.6383.51.camel@tropicana> Signed-off-by: Frederic Weisbecker <[email protected]>
2010-05-09	perf: Provide a new deterministic events reordering algorithm	Frederic Weisbecker	2	-45/+97
	The current events reordering algorithm is based on a heuristic that gets broken once we deal with a very fast flow of events. Indeed the time period based flushing is not suitable anymore in the following case, assuming we have a flush period of two seconds. CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt1 timestamps \| 0 \| 0 1 \| 1 2 \| 2 3 \| 3 [...] \| [...] 4 seconds later If we spend too much time to read the buffers (case of a lot of events to record in each buffers or when we have a lot of CPU buffers to read), in the next pass the CPU 0 buffer could contain a slice of several seconds of events. We'll read them all and notice we've reached the period to flush. In the above example we flush the first half of the CPU 0 buffer, then we read the CPU 1 buffer where we have events that were on the flush slice and then the reordering fails. It's simple to reproduce with: perf lock record perf bench sched messaging To solve this, we use a new solution that doesn't rely on an heuristical time slice period anymore but on a deterministic basis based on how perf record does its job. perf record saves the buffers through passes. A pass is a tour on every buffers from every CPUs. This is made in order: for each CPU we read the buffers of every counters. So the more buffers we visit, the later will be the timstamps of their events. When perf record finishes a pass it records a PERF_RECORD_FINISHED_ROUND pseudo event. We record the max timestamp t found in the pass n. Assuming these timestamps are monotonic across cpus, we know that if a buffer still has events with timestamps below t, they will be all available and then read in the pass n + 1. Hence when we start to read the pass n + 2, we can safely flush every events with timestamps below t. ============ PASS n ================= CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt2 timestamps 1 \| 2 2 \| 3 - \| 4 <--- max recorded ============ PASS n + 1 ============== CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt2 timestamps 3 \| 5 4 \| 6 5 \| 7 <---- max recorded Flush every events below timestamp 4 ============ PASS n + 2 ============== CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt2 timestamps 6 \| 8 7 \| 9 - \| 10 Flush every events below timestamp 7 etc... It also works on perf.data versions that don't have PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that the events will be only flushed in the end of the perf.data processing. It will then consume more memory and scale less with large perf.data files. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Masami Hiramatsu <[email protected]>
2010-05-09	perf: Introduce a new "round of buffers read" pseudo event	Frederic Weisbecker	1	-1/+2
	In order to provide a more rubust and deterministic reordering algorithm, we need to know when we reach a point where we just did a pass through over every counter buffers to read every thing they had. This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event that only consist in an event header and doesn't need to contain anything. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Masami Hiramatsu <[email protected]>
2010-05-08	Merge branch 'perf' of ↵	Ingo Molnar	1	-1/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-05-07	perf list: Improve the raw hw event descriptor documentation	Arnaldo Carvalho de Melo	1	-1/+2
	It was x86 specific and imcomplete at that, improve the situation by making it clear where the example provided applies and by adding the URLs for the Intel and AMD manuals where this is discussed in depth. Acked-by: Robert Richter <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Robert Richter <[email protected]> Reported-by: Robert Richter <[email protected] LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-07	perf, x86: Improve the PEBS ABI	Peter Zijlstra	1	-9/+16
	Rename perf_event_attr::precise to perf_event_attr::precise_ip and widen it to 2 bits. This new field describes the required precision of the PERF_SAMPLE_IP field: 0 - SAMPLE_IP can have arbitrary skid 1 - SAMPLE_IP must have constant skid 2 - SAMPLE_IP requested to have 0 skid 3 - SAMPLE_IP must have 0 skid And modify the Intel PEBS code accordingly. The PEBS implementation now supports up to precise_ip == 2, where we perform the IP fixup. Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning, this bit should be set for each PERF_SAMPLE_IP field known to match the actual instruction triggering the event. This new scheme allows for a PEBS mode that uses the buffer for more than a single event. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Stephane Eranian <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2010-05-05	perf list: Add explanation about raw hardware event descriptors	Arnaldo Carvalho de Melo	1	-1/+1
	Using explanation given by Ingo Molnar in the oprofile mailing list. Suggested-by: Nick Black <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Nick Black <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-05	perf/record: simplify TRACE_INFO tracepoint check	Tom Zanussi	1	-1/+7
	Fix a couple of inefficiencies and redundancies related to have_tracepoints() and its use when checking whether to write TRACE_INFO. First, there's no need to use get_tracepoints_path() in have_tracepoints() - we really just want the part that checks whether any attributes correspondo to tracepoints. Second, we really don't care about raw_samples per se - tracepoints are always raw_samples. In any case, the have_tracepoints() check should be sufficient to decide whether or not to write TRACE_INFO. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]>, Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> LKML-Reference: <1273030770.6383.6.camel@tropicana> Signed-off-by: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-05	perf report: Make dso__calc_col_width agree with hist_entry__dso_snprintf	Arnaldo Carvalho de Melo	1	-2/+4
	The first was always using the ->long_name, while the later used ->short_name if verbose was not set, resulting in the dso column to be much wider than needed most of the time. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-04	Merge branch 'perf' of ↵	Ingo Molnar	4	-1/+15
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-05-04	perf: Fix performance issue with perf report	Anton Blanchard	2	-0/+9
	On a large machine we spend a lot of time in perf_header__find_attr when running perf report. If we are parsing a file without PERF_SAMPLE_ID then for each sample we call perf_header__find_attr and loop through all counter IDs, never finding a match. As the machine gets larger there are more per cpu counters and we spend an awful lot of time in there. The patch below initialises each sample id to -1ULL and checks for this in perf_header__find_attr. We may need to do something more intelligent eventually (eg a hash lookup from counter id to attr) but this at least fixes the most common usage of perf report. Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Eric B Munson <[email protected]> Acked-by: Eric B Munson <[email protected]> LKML-Reference: <20100504111915.GB14636@kryten> Signed-off-by: Anton Blanchard <[email protected]> -- Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-03	perf: record TRACE_INFO only if using tracepoints and SAMPLE_RAW	Tom Zanussi	3	-1/+6
	The current perf code implicitly assumes SAMPLE_RAW means tracepoints are being used, but doesn't check for that. It happily records the TRACE_INFO even if SAMPLE_RAW is used without tracepoints, but when the perf data is read it won't go any further when it finds TRACE_INFO but no tracepoints, and displays misleading errors. This adds a check for both in perf-record, and won't record TRACE_INFO unless both are true. This at least allows perf report -D to dump raw events, and avoids triggering a misleading error condition in perf trace. It doesn't actually enable the non-tracepoint raw events to be displayed in perf trace, since perf trace currently only deals with tracepoint events. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <1272865861.7932.16.camel@tropicana> Signed-off-by: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-03	Merge branch 'perf/core' of ↵	Ingo Molnar	3	-95/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
2010-05-02	perf: add perf-inject builtin	Tom Zanussi	5	-12/+50
	Currently, perf 'live mode' writes build-ids at the end of the session, which isn't actually useful for processing live mode events. What would be better would be to have the build-ids sent before any of the samples that reference them, which can be done by processing the event stream and retrieving the build-ids on the first hit. Doing that in perf-record itself, however, is off-limits. This patch introduces perf-inject, which does the same job while leaving perf-record untouched. Normal mode perf still records the build-ids at the end of the session as it should, but for live mode, perf-inject can be injected in between the record and report steps e.g.: perf record -o - ./hackbench 10 \| perf inject -v -b \| perf report -v -i - perf-inject reads a perf-record event stream and repipes it to stdout. At any point the processing code can inject other events into the event stream - in this case build-ids (-b option) are read and injected as needed into the event stream. Build-ids are just the first user of perf-inject - potentially anything that needs userspace processing to augment the trace stream with additional information could make use of this facility. Cc: Ingo Molnar <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Frédéric Weisbecker <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-02	perf/live: don't synthesize build ids at the end of a live mode trace	Tom Zanussi	2	-63/+0
	It doesn't really make sense to record the build ids at the end of a live mode session - live mode samples need that information during the trace rather than at the end. Leave event__synthesize_build_id() in place, however; we'll still be using that to synthesize build ids in a more timely fashion in a future patch. Cc: Frédéric Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Steven Rostedt <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-02	perf tools: Don't use code surrounded by __KERNEL__	Arnaldo Carvalho de Melo	6	-32/+104
	We need to refactor code to be explicitely shared by the kernel and at least the tools/ userspace programs, so, till we do that, copy the bare minimum bitmap/bitops code needed by tools/perf. Reported-by: "H. Peter Anvin" <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-01	perf: Fix warning while reading ring buffer headers	Frederic Weisbecker	3	-95/+7
	commit e9e94e3bd862d31777335722e747e97d9821bc1d "perf trace: Ignore "overwrite" field if present in /events/header_page" makes perf trace launching spurious warnings about unexpected tokens read: Warning: Error: expected type 6 but read 4 This change tries to handle the overcommit field in the header_page file whenever this field is present or not. The problem is that if this field is not present, we try to find it and give up in the middle of the line when we realize we are actually dealing with another field, which is the "data" one. And this failure abandons the file pointer in the middle of the "data" description line: field: u64 timestamp; offset:0; size:8; signed:0; field: local_t commit; offset:8; size:8; signed:1; field: char data; offset:16; size:4080; signed:1; ^^^ Here What happens next is that we want to read this line to parse the data field, but we fail because the pointer is not in the beginning of the line. We could probably fix that by rewinding the pointer. But in fact we don't care much about these headers that only concern the ftrace ring-buffer. We don't use them from perf. Just skip this part of perf.data, but don't remove it from recording to stay compatible with olders perf.data Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]>
2010-04-29	perf symbols: Add machine helper routines	Arnaldo Carvalho de Melo	3	-24/+85
	Created when writing the first 'perf test' regression testing routine. Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-27	perf machines: Make the machines class adopt the dsos__fprintf methods	Arnaldo Carvalho de Melo	3	-11/+29
	Now those methods don't operate on a global list of dsos, but on lists of machines, so make this clear by renaming the functions. Cc: Avi Kivity <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zhang, Yanmin <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-27	perf machine: Adopt some map_groups functions	Arnaldo Carvalho de Melo	7	-78/+73
	Those functions operated on members now grouped in 'struct machine', so move those methods to this new class. The changes made to 'perf probe' shows that using this abstraction inserting probes on guests almost got supported for free. Cc: Avi Kivity <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zhang, Yanmin <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-27	perf machine: Pass buffer size to machine__mmap_name	Arnaldo Carvalho de Melo	4	-11/+11
	Don't blindly assume that the size of the buffer is enough, use snprintf. Cc: Avi Kivity <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Zhang, Yanmin <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-27	perf tools: Rename "kernel_info" to "machine"	Arnaldo Carvalho de Melo	10	-253/+249
	struct kernel_info and kerninfo__ are too vague, what they really describe are machines, virtual ones or hosts. There are more changes to introduce helpers to shorten function calls and to make more clear what is really being done, but I left that for subsequent patches. Cc: Avi Kivity <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Zhang, Yanmin <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-27	Merge branch 'perf' of ↵	Ingo Molnar	5	-24/+36
	git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core
2010-04-26	perf probe: Add --max-probes option	Masami Hiramatsu	4	-16/+22
	Add --max-probes option to change the maximum limit of findable probe points per event, since inlined function can be expanded into thousands of probe points. Default value is 128. Signed-off-by: Masami Hiramatsu <[email protected]> Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-26	perf probe: Fix to exit callback soon after finding too many probe points	Masami Hiramatsu	1	-0/+4
	Fix to exit callback soon after finding too many probe points. Don't try to continue searching because it already failed. Signed-off-by: Masami Hiramatsu <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-26	perf probe: Fix to use symtab only if no debuginfo	Masami Hiramatsu	1	-8/+9
	Fix perf probe to use symtab only if there is no debuginfo, because debuginfo has more information than symtab. If we can't find a function in debuginfo, we never find it in symtab. Signed-off-by: Masami Hiramatsu <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-26	perf tools: Initialize dso->node member in dso__new	Masami Hiramatsu	1	-0/+1
	If dso->node member is not initialized, it causes a segmentation fault when adding to other lists. It should be initilized in dso__new(). Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> LKML-Reference: : <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-24	perf: Generalize perf lock's sample event reordering to the session layer	Frederic Weisbecker	2	-1/+188
	The sample events recorded by perf record are not time ordered because we have one buffer per cpu for each event (even demultiplexed per task/per cpu for task bound events). But when we read trace events we want them to be ordered by time because many state machines are involved. There are currently two ways perf tools deal with that: - use -M to multiplex every buffers (perf sched, perf kmem) But this creates a lot of contention in SMP machines on record time. - use a post-processing time reordering (perf timechart, perf lock) The reordering used by timechart is simple but doesn't scale well with huge flow of events, in terms of performance and memory use (unusable with perf lock for example). Perf lock has its own samples reordering that flushes its memory use in a regular basis and that uses a sorting based on the previous event queued (a new event to be queued is close to the previous one most of the time). This patch proposes to export perf lock's samples reordering facility to the session layer that reads the events. So if a tool wants to get ordered sample events, it needs to set its struct perf_event_ops::ordered_samples to true and that's it. This prepares tracing based perf tools to get rid of the need to use buffers multiplexing (-M) or to implement their own reordering. Also lower the flush period to 2 as it's sufficient already. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Tom Zanussi <[email protected]>