Age | Commit message | Author | Files | Lines
2013-11-06 | perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes | Yan, Zheng | 1 | -10/+51
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is completely the same as SandyBridge-EP. Signed-off-by: Yan, Zheng <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: "Yan Zheng" <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | perf: Factor out strncpy() in perf_event_mmap_event() | Oleg Nesterov | 1 | -16/+16
While this is really minor, strncpy() does the unnecessary zero-padding till the end of tmp[16], and it is called every time we are going to use the string literal. Turn these strncpy()'s into a single strlcpy() under the new label; this saves 72 bytes. Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
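The padding difference is easy to observe in userspace. The following is a minimal sketch, not the kernel code: bounded_copy() and bytes_written() are hypothetical helpers, and bounded_copy() only stands in for strlcpy(), which is not available in every C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strlcpy(): copies at most size-1 bytes,
 * always NUL-terminates, and leaves the rest of dst untouched. */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);
    size_t n;

    assert(size > 0);
    n = len < size - 1 ? len : size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return len; /* length of src, as strlcpy() reports */
}

/* Counts how many bytes of buf[0..size) no longer hold the fill byte. */
static size_t bytes_written(const char *buf, size_t size, char fill)
{
    size_t count = 0;

    for (size_t i = 0; i < size; i++)
        if (buf[i] != fill)
            count++;
    return count;
}
```

Filling a 16-byte buffer with a marker byte and then copying a short literal into it shows that strncpy() rewrites all 16 bytes, while the bounded copy touches only the string and its terminator.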
2013-11-06 | tools/perf: Add required memory barriers | Peter Zijlstra | 3 | -16/+49
To match patch bf378d341e48 ("perf: Fix perf ring buffer memory ordering") change userspace to also adhere to the ordering outlined. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
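The ordering a userspace consumer has to respect can be sketched with C11 atomics. This is an illustration under stated assumptions, not the code from the patch: struct ring and ring_consume() are hypothetical, the layout is not that of the real perf mmap page, and acquire/release operations stand in for the explicit barriers the commit adds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64u /* power of two, illustrative */

/* Hypothetical ring; not the layout of the real perf mmap page. */
struct ring {
    _Atomic uint64_t head; /* advanced by the producer */
    _Atomic uint64_t tail; /* advanced by the consumer */
    unsigned char data[RING_SIZE];
};

/* Consumer side: copy out the bytes between tail and head. */
static size_t ring_consume(struct ring *rb, unsigned char *out, size_t cap)
{
    /* Acquire: the data reads below cannot move before this load. */
    uint64_t head = atomic_load_explicit(&rb->head, memory_order_acquire);
    uint64_t tail = atomic_load_explicit(&rb->tail, memory_order_relaxed);
    size_t n = 0;

    assert(head - tail <= RING_SIZE);
    while (tail != head && n < cap)
        out[n++] = rb->data[tail++ & (RING_SIZE - 1)];

    /* Release: the data reads above finish before the producer can
     * observe the new tail and reuse that space. */
    atomic_store_explicit(&rb->tail, tail, memory_order_release);
    return n;
}
```

The two ordering points mirror the two halves of the protocol: read head before reading data, and finish reading data before publishing tail.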
2013-11-06 | perf: Fix arch_perf_out_copy_user default | Peter Zijlstra | 6 | -16/+33
The arch_perf_output_copy_user() default of __copy_from_user_inatomic() returns bytes not copied, while all other argument functions given to DEFINE_OUTPUT_COPY() return bytes copied. Since copy_from_user_nmi() is the odd duck out by returning bytes copied where all other *copy_{to,from}* functions return bytes not copied, change it over and amend DEFINE_OUTPUT_COPY() to expect bytes not copied. Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied while expecting its worker functions to return bytes copied. Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: [email protected] Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
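The "bytes not copied" convention can be illustrated with a small userspace sketch. Both functions below are hypothetical; the limit parameter merely simulates a copy that stops early, the way a faulting user copy would.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical worker: copies up to 'limit' bytes and, like the
 * *copy_{to,from}* helpers, returns the number of bytes NOT copied. */
static size_t copy_with_limit(void *dst, const void *src, size_t len,
                              size_t limit)
{
    size_t done = len < limit ? len : limit;

    memcpy(dst, src, done);
    return len - done;
}

/* Caller written against that convention: 0 means complete success,
 * and the amount actually written is derived as len - left. */
static size_t output_copy(void *dst, const void *src, size_t len,
                          size_t limit, size_t *written)
{
    size_t left = copy_with_limit(dst, src, len, limit);

    assert(left <= len);
    *written = len - left;
    return left;
}
```

Mixing the two conventions is exactly the kind of bug the commit describes: a caller expecting "bytes copied" would read a return value of 0 as total failure.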
2013-11-06 | perf: Update a stale comment | Peter Zijlstra | 1 | -2/+2
Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | perf: Optimize perf_output_begin() -- address calculation | Peter Zijlstra | 1 | -7/+7
Rewrite the handle address calculation code to be clearer. Saves 8 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | perf: Optimize perf_output_begin() -- lost_event case | Peter Zijlstra | 1 | -5/+8
Avoid touching the lost_event and sample_data cachelines twice. It's not like we end up doing less work, but it might help to keep all accesses to these cachelines in one place. Due to code shuffle, this loses 4 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | perf: Optimize perf_output_begin() | Peter Zijlstra | 1 | -8/+9
There's no point in re-doing the memory-barrier when we fail the cmpxchg(). Also placing it after the space reservation loop makes it clearer it only separates the userpage->tail read from the data stores. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | perf: Add unlikely() to the ring-buffer code | Peter Zijlstra | 1 | -8/+8
Add unlikely() annotations to 'slow' paths: When having a sampling event but no output buffer, you have bigger issues -- also the bail is still faster than actually doing the work. When having a sampling event but a control page only buffer, you have bigger issues -- again the bail is still faster than actually doing work. Optimize for the case where you're not losing events -- again, not doing the work is still faster, but make sure that when you have to actually do work it's as fast as possible. The typical watermark is 1/2 the buffer size, so most events will not take this path. Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
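For readers unfamiliar with the annotation, here is a minimal userspace sketch of the pattern. The macros have the same shape as the kernel's and rely on the GCC/Clang builtin __builtin_expect; struct buffer and reserve() are hypothetical and far simpler than perf_output_begin().

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the kernel macros; __builtin_expect is a GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical buffer bookkeeping, for illustration only. */
struct buffer {
    size_t size;
    size_t used;
    unsigned long lost;
};

static int reserve(struct buffer *rb, size_t len)
{
    if (unlikely(!rb)) /* no output buffer: bail out early */
        return -1;

    assert(rb->used <= rb->size);
    if (unlikely(rb->size - rb->used < len)) { /* losing events is the slow path */
        rb->lost++;
        return -1;
    }

    rb->used += len; /* common case stays on the fall-through path */
    return 0;
}
```

The annotations do not change behaviour; they only hint to the compiler which branch to lay out as the straight-line path.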
2013-11-06 | perf: Simplify the ring-buffer code | Peter Zijlstra | 1 | -33/+4
By using CIRC_SPACE() we can obviate the need for perf_output_space(). Shrinks the size of perf_output_begin() by 17 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michael Neuling <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: [email protected] Cc: Vince Weaver <[email protected]> Cc: Victor Kaplansky <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Anton Blanchard <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
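As a reminder of what CIRC_SPACE() computes, the sketch below repeats the two macros from include/linux/circ_buf.h in userspace. The ring_space() wrapper is hypothetical and exists only to check the power-of-two precondition the macros rely on.

```c
#include <assert.h>

/* Same definitions as include/linux/circ_buf.h; size must be a power of two. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

/* Hypothetical wrapper: free space left for the producer. */
static unsigned long ring_space(unsigned long head, unsigned long tail,
                                unsigned long size)
{
    assert(size && (size & (size - 1)) == 0);
    return CIRC_SPACE(head, tail, size);
}
```

With an 8-slot ring, an empty buffer (head == tail) reports 7 free slots, because one slot is always kept unused to distinguish full from empty.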
2013-11-06 | arm64: KVM: vgic: byteswap GICv2 access on world switch if BE | Marc Zyngier | 1 | -0/+13
Ensure that accesses to the GICH_* registers are byteswapped when the kernel is compiled as big-endian. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
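The idea can be shown in plain C: a register with a fixed little-endian layout needs a byte reversal only when the CPU runs big-endian. This is a sketch under that assumption; swab32() is written out by hand and le32_to_host() is a hypothetical helper taking the endianness as a parameter so both cases can be exercised.

```c
#include <assert.h>
#include <stdint.h>

/* Unconditional 32-bit byte reversal. */
static uint32_t swab32(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v & 0x00ff0000u) >> 8)  |
           ((v & 0xff000000u) >> 24);
}

/* Hypothetical helper: convert a little-endian register value to CPU
 * byte order; a real kernel decides this at build time. */
static uint32_t le32_to_host(uint32_t raw, int host_is_big_endian)
{
    assert(host_is_big_endian == 0 || host_is_big_endian == 1);
    return host_is_big_endian ? swab32(raw) : raw;
}
```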
2013-11-06 | arm64: KVM: initialize HYP mode following the kernel endianness | Marc Zyngier | 1 | -1/+4
Force SCTLR_EL2.EE to 1 if the kernel is compiled as BE. Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2013-11-06 | x86/cpu: Increase max CPU count to 8192 | Josh Boyer | 1 | -2/+2
The MAXSMP option is intended to enable silly large numbers of CPUs for testing purposes. The current value of 4096 isn't very silly any longer as there are actual SGI machines that approach 6096 CPUs when taking HT into account. Increase the value to a nice round 8192 to account for this and allow for short term future increases. Signed-off-by: Josh Boyer <[email protected]> Cc: [email protected] Cc: Russ Anderson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Tweaked it so that MAXSMP simply sets the maximum of the normal range. ] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | x86/cpu: Allow higher NR_CPUS values | Josh Boyer | 1 | -2/+4
The current range for SMP configs is 2 - 512 CPUs, or a full 4096 in the case of MAXSMP. There are machines that have 1024 CPUs in them today and configuring a kernel for that means you are forced to set MAXSMP. This adds additional unnecessary overhead. While that overhead might be considered tiny for large machines, it isn't necessarily so if you are building a kernel that runs across a wide variety of machines. To cover the range of more common machines today, we allow NR_CPUS to be up to 4096 when CPUMASK_OFFSTACK is enabled. Signed-off-by: Josh Boyer <[email protected]> Cc: [email protected] Cc: Russ Anderson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | x86/cpu: Always print SMP information in /proc/cpuinfo | HATAYAMA Daisuke | 1 | -9/+6
Currently show_cpuinfo_core() displays cpu core information only if the total number of threads across all cores is 2 or larger. However, this condition doesn't care about the number of sockets. For example, this condition doesn't hold on systems with two logical cpus consisting of two sockets and a single core on each socket - yet the topology information would be interesting to see in that case as well. I don't know whether or not there are processors in the real world by which such configurations are possible, but at least in virtual machine environments such a configuration can occur, typically when no explicit SMP information is provided in advance. For example, on qemu/KVM, SMP information is specified via the -smp command-line option; more specifically, its syntax is: -smp n[,cores=cores][,threads=threads][,sockets=sockets][,maxcpus=maxcpus] If this is not specified, qemu presents a configuration with n sockets, 1 core and 1 thread to the guest machine, and on that guest SMP information is not displayed in /proc/cpuinfo. I saw this situation in a VMware guest environment, too. To fix this issue, this patch simply removes the condition because this information is useful even if there's only 1 thread. Signed-off-by: HATAYAMA Daisuke <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | sched: Move completion code from core.c to completion.c | Peter Zijlstra | 4 | -286/+301
Completions already have their own header file: linux/completion.h Move the implementation out of kernel/sched/core.c and into its own file: kernel/sched/completion.c. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | sched: Move wait code from core.c to wait.c | Peter Zijlstra | 2 | -105/+105
For some reason only the wait part of the wait api lives in kernel/sched/wait.c and the wake part still lives in kernel/sched/core.c; amend this. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | sched: Move wait.c into kernel/sched/ | Peter Zijlstra | 3 | -1/+2
Suggested-by: Ingo Molnar <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | Merge tag 'v3.12' into x86/cpu, to refresh the branch before queueing up more changes | Ingo Molnar | 120 | -864/+1142
Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core | Ingo Molnar | 45 | -497/+591
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
* Check maximum frequency rate for record/top, emitting better error messages, from Jiri Olsa.
* Disable live kvm command if timerfd is not supported, from David Ahern.
* Add usage to 'perf list', from David Ahern.
* Fix detection of non-core features, from David Ahern.
* Consolidate __hists__add_*entry(), cleanup from Namhyung Kim.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2013-11-06 | ARC: [SMP] Fix build failures for large NR_CPUS | Vineet Gupta | 2 | -3/+21
ST.as only takes S9 (255) for offset. This was going out of range when accessing a task_struct field with 4k NR_CPUS (due to 128b of cpumask itself in there). Work around this by using an intermediate register to do the address scaling. There is some duplication of the fix between ctx_sw.c and ctx_sw_asm.S; however, given that the C version will go away soon, I'm not bothering to factor out the common code. Reported-by: Noam Camus <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: [SMP] enlarge possible NR_CPUS | Noam Camus | 1 | -2/+2
Signed-off-by: Noam Camus <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: [SMP] TLB flush | Vineet Gupta | 4 | -3/+99
- Add mm_cpumask setting (aggregating only, unlike some other arches) used to restrict the TLB flush cross-calling
- Cross-calling versions of TLB flush routines (thanks to Noam)
Signed-off-by: Noam Camus <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: [SMP] ASID allocation | Vineet Gupta | 3 | -23/+37
- Track a per-CPU ASID counter
- mm-per-cpu ASID (multiple threads, or mm migrated around)
Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | arc: export symbol for pm_power_off in reset.c | Chen Gang | 1 | -0/+1
The symbol needs to be exported, otherwise the build fails. The related errors with allmodconfig:
  MODPOST 2994 modules
  ERROR: "pm_power_off" [drivers/mfd/retu-mfd.ko] undefined!
  ERROR: "pm_power_off" [drivers/char/ipmi/ipmi_poweroff.ko] undefined!
Signed-off-by: Chen Gang <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | arc: export symbol for save_stack_trace() in stacktrace.c | Chen Gang | 1 | -0/+1
Export the symbol as other architectures do, otherwise the build fails with allmodconfig. The related errors:
  MODPOST 2994 modules
  ERROR: "save_stack_trace" [kernel/backtracetest.ko] undefined!
  ERROR: "save_stack_trace" [drivers/md/persistent-data/dm-persistent-data.ko] undefined!
Signed-off-by: Chen Gang <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | arc: remove '__init' for get_hw_config_num_irq() | Chen Gang | 2 | -2/+2
get_hw_config_num_irq() may be called by the normal (non-__init) function iss_model_init_smp(), which is installed as the 'init_smp' function pointer and may in turn be called by first_lines_of_secondary(), which also needs to be a normal function. The related warning (with allmodconfig):
  MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x5814): Section mismatch in reference from the function iss_model_init_smp() to the function .init.text:get_hw_config_num_irq()
  The function iss_model_init_smp() references the function __init get_hw_config_num_irq(). This is often because iss_model_init_smp lacks a __init annotation or the annotation of get_hw_config_num_irq is wrong.
Signed-off-by: Chen Gang <[email protected]>
2013-11-06 | arc: remove '__init' for first_lines_of_secondary() | Chen Gang | 2 | -2/+2
first_lines_of_secondary() is an '__init' function, but it may be called via cpu_up() -> _cpu_up() -> __cpu_up(), and cpu_up() is a normal exported function. So remove '__init'. The related warning (with allmodconfig):
  MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x315c): Section mismatch in reference from the function __cpu_up() to the function .init.text:first_lines_of_secondary()
  The function __cpu_up() references the function __init first_lines_of_secondary(). This is often because __cpu_up lacks a __init annotation or the annotation of first_lines_of_secondary is wrong.
Signed-off-by: Chen Gang <[email protected]>
2013-11-06 | arc: remove '__init' for setup_processor() and arc_init_IRQ() | Chen Gang | 2 | -2/+2
Their definitions lack '__init', but their declarations have it. The normal function start_kernel_secondary() may call setup_processor(), which calls arc_init_IRQ(), so '__init' needs to be removed from both. The related warning (with allmodconfig):
  MODPOST vmlinux.o
  WARNING: vmlinux.o(.text+0x3084): Section mismatch in reference from the function start_kernel_secondary() to the function .init.text:setup_processor()
  The function start_kernel_secondary() references the function __init setup_processor(). This is often because start_kernel_secondary lacks a __init annotation or the annotation of setup_processor is wrong.
Signed-off-by: Chen Gang <[email protected]>
2013-11-06 | arc: kgdb: add default implementation for kgdb_roundup_cpus() | Chen Gang | 1 | -0/+12
arc supports kgdb but needs an update: add the function kgdb_roundup_cpus(), otherwise the build fails. For now, add the simple generic version, as other architectures do (e.g. tile, mips ...). The related error (with allmodconfig):
  kernel/built-in.o: In function `kgdb_cpu_enter':
  kernel/debug/debug_core.c:580: undefined reference to `kgdb_roundup_cpus'
Signed-off-by: Chen Gang <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: Fix bogus gcc warning and micro-optimise TLB iteration loop | Vineet Gupta | 1 | -2/+2
------------------>8----------------------
arch/arc/mm/tlb.c: In function ‘do_tlb_overlap_fault’:
arch/arc/mm/tlb.c:688:13: warning: array subscript is above array bounds [-Warray-bounds]
  (pd0[n] & PAGE_MASK)) {
  ^
------------------>8----------------------
While at it, remove the useless last iteration of the outer loop when reading a TLB SET for duplicate entries. Suggested-by: Mischa Jonker <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: Add support for irqflags tracing and lockdep | Vineet Gupta | 4 | -1/+42
Lockdep required a small fix to the stacktrace API, which was incorrectly unwinding out of __switch_to for the current call frame. Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: Reset the value of Interrupt Priority Register | Vineet Gupta | 1 | -3/+7
In case bootloader has changed the priority of one/more IRQ lines Reported-by: Noam Camus <[email protected]> Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: Reduce #ifdef'ery for unaligned access emulation | Vineet Gupta | 3 | -7/+3
Emulation not enabled is treated as if the fixup failed, so no need for special #ifdef checks. Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: Change calling convention of do_page_fault() | Vineet Gupta | 3 | -8/+7
switch the args (address, pt_regs) to match with all the other "C" exception handlers. This removes the awkwardness in EV_ProtV for page fault vs. unaligned access. Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: cacheflush optim - PTAG can be loop invariant if V-P is const | Vineet Gupta | 1 | -3/+11
Line op needs vaddr (indexing) and paddr (tag match). For page sized flushes (V-P const), each line op will need a different index, but the tag bits will remain constant, hence paddr can be set up once outside the loop. This improves select LMBench numbers for Aliasing dcache where we have more "preventive" cache flushing.

Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host      OS            Mhz  null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.11-rc7- Linux 3.11.0-   80 4.66 8.88 69.7 112. 268. 8.60 28.0 3489 13.K 27.K  # Non alias ARC700
3.11-rc7- Linux 3.11.0-   80 4.64 8.51 68.6 98.5 271. 8.58 28.1 4160 15.K 32.K  # Aliasing
3.11-rc7- Linux 3.11.0-   80 4.64 8.51 69.8 99.4 270. 8.73 27.5 3880 15.K 31.K  # PTAG loop Inv

Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: cacheflush refactor #3: Unify the {d,i}cache flush leaf helpers | Vineet Gupta | 1 | -84/+55
With Line length being constant now, we can fold the 2 helpers into 1. This allows applying any optimizations (forthcoming) to single place. Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: cacheflush refactor #2: I and D caches lines to have same size | Vineet Gupta | 2 | -22/+16
Having them be different seems an obscure configuration. Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: cacheflush refactor #1: push aux reg ascertaining into leaf routine | Vineet Gupta | 1 | -10/+6
ARC dcache supports 3 ops - Inv, Flush, Flush-n-Inv. The programming model however provides 2 commands FLUSH, INV. INV will either discard or flush-n-discard (based on DT_CTRL bit) The leaf helper __dc_line_loop() used to take the AUX register (corresponding to the 2 commands). Now we push that to within the helper, paving way for code consolidations to follow. Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: use __weak instead of __attribute__((weak)) | Vineet Gupta | 2 | -2/+2
Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | ARC: Annotate some functions as static | Vineet Gupta | 1 | -6/+5
Signed-off-by: Vineet Gupta <[email protected]>
2013-11-06 | arc: Replace __get_cpu_var uses | Christoph Lameter | 2 | -4/+4
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is address calculation via the form &__get_cpu_var(x). This calculates the address for the instance of the percpu variable of the current processor based on an offset.

Other use cases are for storing and retrieving data from the current processor's percpu area. __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.

__get_cpu_var() is defined as:

    #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

__get_cpu_var() always only does an address determination. However, store and retrieve operations could use a segment prefix (or global register on other platforms) to avoid the address calculation.

this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use optimized assembly code to read and write per cpu variables.

This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr() or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided and fewer registers are used when code is generated.

At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.

The patchset includes passes over all arches as well. Once these operations are used throughout, specialized macros can be defined in non-x86 arches as well in order to optimize per cpu access, e.g. by using a global register that may be set to the per cpu base.

Transformations done to __get_cpu_var()

1. Determine the address of the percpu instance of the current processor.

    DEFINE_PER_CPU(int, y);
    int *x = &__get_cpu_var(y);

   Converts to

    int *x = this_cpu_ptr(&y);

2. Same as #1 but this time an array structure is involved.

    DEFINE_PER_CPU(int, y[20]);
    int *x = __get_cpu_var(y);

   Converts to

    int *x = this_cpu_ptr(y);

3. Retrieve the content of the current processor's instance of a per cpu variable.

    DEFINE_PER_CPU(int, y);
    int x = __get_cpu_var(y);

   Converts to

    int x = __this_cpu_read(y);

4. Retrieve the content of a percpu struct

    DEFINE_PER_CPU(struct mystruct, y);
    struct mystruct x = __get_cpu_var(y);

   Converts to

    memcpy(&x, this_cpu_ptr(&y), sizeof(x));

5. Assignment to a per cpu variable

    DEFINE_PER_CPU(int, y);
    __get_cpu_var(y) = x;

   Converts to

    this_cpu_write(y, x);

6. Increment/Decrement etc of a per cpu variable

    DEFINE_PER_CPU(int, y);
    __get_cpu_var(y)++;

   Converts to

    this_cpu_inc(y);

Acked-by: Vineet Gupta <[email protected]> Signed-off-by: Christoph Lameter <[email protected]>
2013-11-05 | perf tools: Finish the removal of 'self' arguments | Arnaldo Carvalho de Melo | 21 | -246/+242
They convey no information, perhaps I was bitten by some snake at some point, complete the detox by naming the last of those arguments more sensibly. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-05 | perf tools: Check maximum frequency rate for record/top | Jiri Olsa | 4 | -28/+74
Add a check for the maximum allowed frequency rate defined in the following file: /proc/sys/kernel/perf_event_max_sample_rate

When the maximum value is exceeded, we fail and display a detailed error message with advice:

  $ perf record -F 3000 ls
  Maximum frequency rate (2000) reached.
  Please use -F freq option with lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.

In case the user does not specify the frequency and the default value exceeds the maximum, we display a warning and set the frequency value to the current maximum:

  $ perf record ls
  Lowering default frequency rate to 2000.
  Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.

The same messages are used for 'perf top'.

Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
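The decision described above can be sketched as two small functions. This is an illustration only, not the perf implementation: read_max_sample_rate() and check_freq() are hypothetical names, and the return conventions are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Reads an integer limit from the given file; returns -1 on any failure. */
static int read_max_sample_rate(const char *path)
{
    FILE *f = fopen(path, "r");
    int rate = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &rate) != 1)
        rate = -1;
    fclose(f);
    return rate;
}

/* Returns the frequency to use, or -1 when an explicit user request
 * exceeds the maximum (the caller would print the error message). */
static int check_freq(int freq, int user_specified, int max_rate)
{
    assert(max_rate > 0);
    if (freq <= max_rate)
        return freq;
    if (user_specified)
        return -1;   /* explicit -F above the limit: fail */
    return max_rate; /* default above the limit: lower it, with a warning */
}
```

A caller would pass the value read from /proc/sys/kernel/perf_event_max_sample_rate as max_rate and pick the error or warning text based on the result.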
2013-11-05 | perf fs: Add procfs support | Jiri Olsa | 3 | -2/+19
Adding procfs support into fs class. The interface function: const char *procfs__mountpoint(void); provides existing mountpoint path for procfs. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Fixup namespace ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-05 | perf fs: Rename NAME_find_mountpoint() to NAME__mountpoint() | Arnaldo Carvalho de Melo | 5 | -21/+16
Shorten it: "finding" it is an implementation detail; what callers want is the pathname, not to ask for it to _always_ do the lookup. And the existing implementation already caches it, i.e. it doesn't "find" it on every call. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
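The caching behaviour that motivates the rename can be sketched as follows. All names here (example__mountpoint(), find_mountpoint(), the lookups counter) are hypothetical, and the lookup is faked; the point is only that the accessor returns a path and performs the search at most once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int lookups; /* counts how often the slow path runs */
static char cached_path[64];
static int cached_valid;

/* Hypothetical slow lookup; real code would probe known mount points. */
static const char *find_mountpoint(void)
{
    lookups++;
    return "/proc";
}

/* Returns the mount point path, doing the lookup only on first use. */
static const char *example__mountpoint(void)
{
    if (!cached_valid) {
        const char *path = find_mountpoint();

        assert(strlen(path) < sizeof(cached_path));
        strcpy(cached_path, path);
        cached_valid = 1;
    }
    return cached_path;
}
```

From the caller's side there is no "find" step at all, which is the argument for the shorter name.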
2013-11-05 | arm64: compat: Clear the IT state independent of the 32-bit ARM or Thumb-2 mode | T.J. Purtell | 1 | -4/+5
The ARM architecture reference specifies that the IT state bits in the PSR must be all zeros in ARM mode or behavior is unspecified. If an ARM function is registered as a signal handler, and that signal is delivered inside a block of instructions following an IT instruction, some of the instructions at the beginning of the signal handler may be skipped if the IT state bits of the Program Status Register are not cleared by the kernel. Signed-off-by: T.J. Purtell <[email protected]> [[email protected]: code comment and commit log updated] Signed-off-by: Catalin Marinas <[email protected]>
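The bit manipulation involved is small. The sketch below is an assumption-laden illustration rather than the kernel code: signal_entry_psr() is a hypothetical function, and the constants follow the AArch32 PSR layout, with the T bit at bit 5 and the IT state in bits 26:25 and 15:10.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_T_BIT   0x00000020u /* Thumb state */
#define PSR_IT_MASK 0x0600fc00u /* IT[1:0] in bits 26:25, IT[7:2] in bits 15:10 */

/* Hypothetical helper: compute the PSR used to enter a signal handler. */
static uint32_t signal_entry_psr(uint32_t psr, int handler_is_thumb)
{
    assert(handler_is_thumb == 0 || handler_is_thumb == 1);

    if (handler_is_thumb)
        psr |= PSR_T_BIT;
    else
        psr &= ~PSR_T_BIT;

    /* Clear the IT state for ARM and Thumb handlers alike. */
    psr &= ~PSR_IT_MASK;
    return psr;
}
```

Clearing the mask unconditionally is what keeps stale IT state from suppressing the first instructions of an ARM-mode handler.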
2013-11-05 | perf tools: Factor sysfs code into generic fs object | Jiri Olsa | 9 | -72/+119
Moving sysfs code into generic fs object and preparing it to carry procfs support. This should be merged with tools/lib/lk/debugfs.c at some point in the future. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Added fs__ namespace qualifier to some more functions ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-05 | perf list: Add usage | David Ahern | 1 | -3/+14
Currently 'perf list' is not very helpful if you forget the syntax:

  $ perf list -h
  List of pre-defined events (to be used in -e):

After:

  $ perf list -h
  usage: perf list [hw|sw|cache|tracepoint|pmu|event_glob]

Signed-off-by: David Ahern <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-11-05 | perf list: Remove a level of indentation | David Ahern | 1 | -36/+37
With a return after the if check an indentation level can be removed. Indentation shift only; no functional changes. Signed-off-by: David Ahern <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>