aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)AuthorFilesLines
2010-04-02x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boardsSuresh Siddha1-0/+2
Jan Grossmann reported kernel boot panic while booting SMP kernel on his system with a single core cpu. SMP kernels call enable_IR_x2apic() from native_smp_prepare_cpus() and on platforms where the kernel doesn't find SMP configuration we ended up again calling enable_IR_x2apic() from the APIC_init_uniprocessor() call in the smp_sanity_check(). Thus leading to kernel panic. Don't call enable_IR_x2apic() and default_setup_apic_routing() from APIC_init_uniprocessor() in CONFIG_SMP case. NOTE: this kind of non-idempotent and assymetric initialization sequence is rather fragile and unclean, we'll clean that up in v2.6.35. This is the minimal fix for v2.6.34. Reported-by: [email protected] Signed-off-by: Suresh Siddha <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> # [v2.6.32.x, v2.6.33.x] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: Add Nehalem programming quirk to WestmerePeter Zijlstra1-0/+2
According to the Xeon-5600 errata the Westmere suffers the same PMU programming bug as the original Nehalem did. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: Fix __initconst vs constPeter Zijlstra4-11/+11
All variables that have __initconst should also be const. Suggested-by: Stephen Rothwell <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: Fix up the ANY flag stuffPeter Zijlstra5-51/+74
Stephane noticed that the ANY flag was in generic arch code, and Cyrill reported that it broke the P4 code. Solve this by merging x86_pmu::raw_event into x86_pmu::hw_config and provide intel_pmu and amd_pmu specific versions of this callback. The intel_pmu one deals with the ANY flag, the amd_pmu adds the few extra event bits AMD64 has. Reported-by: Stephane Eranian <[email protected]> Reported-by: Cyrill Gorcunov <[email protected]> Acked-by: Robert Richter <[email protected]> Acked-by: Cyrill Gorcunov <[email protected]> Acked-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <1269968113.5258.442.camel@laptop> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: implement ARCH_PERFMON_EVENTSEL bit masksRobert Richter4-56/+20
ARCH_PERFMON_EVENTSEL bit masks are often used in the kernel. This patch adds macros for the bit masks and removes local defines. The function intel_pmu_raw_event() becomes x86_pmu_raw_event() which is generic for x86 models and same also for p6. Duplicate code is removed. Signed-off-by: Robert Richter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: Undo some some *_counter* -> *_event* renamesRobert Richter5-61/+61
The big rename: cdd6c48 perf: Do the big rename: Performance Counters -> Performance Events accidentally renamed some members of stucts that were named after registers in the spec. To avoid confusion this patch reverts some changes. The related specs are MSR descriptions in AMD's BKDGs and the ARCHITECTURAL PERFORMANCE MONITORING section in the Intel 64 and IA-32 Architectures Software Developer's Manuals. This patch does: $ sed -i -e 's:num_events:num_counters:g' \ arch/x86/include/asm/perf_event.h \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c \ arch/x86/oprofile/op_model_ppro.c $ sed -i -e 's:event_bits:cntval_bits:g' -e 's:event_mask:cntval_mask:g' \ arch/x86/kernel/cpu/perf_event_amd.c \ arch/x86/kernel/cpu/perf_event.c \ arch/x86/kernel/cpu/perf_event_intel.c \ arch/x86/kernel/cpu/perf_event_p6.c \ arch/x86/kernel/cpu/perf_event_p4.c Signed-off-by: Robert Richter <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02Merge branch 'perf/urgent' into perf/coreIngo Molnar12-63/+169
Conflicts: arch/x86/kernel/cpu/perf_event.c Merge reason: Resolve the conflict, pick up fixes Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernelsTorok Edwin2-5/+44
When profiling a 32-bit process on a 64-bit kernel, callgraph tracing stopped after the first function, because it has seen a garbage memory address (tried to interpret the frame pointer, and return address as a 64-bit pointer). Fix this by using a struct stack_frame with 32-bit pointers when the TIF_IA32 flag is set. Note that TIF_IA32 flag must be used, and not is_compat_task(), because the latter is only set when the 32-bit process is executing a syscall, which may not always be the case (when tracing page fault events for example). Signed-off-by: Török Edwin <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02perf, x86: Fix AMD hotplug & constraint initializationPeter Zijlstra2-36/+52
Commit 3f6da39 ("perf: Rework and fix the arch CPU-hotplug hooks") moved the amd northbridge allocation from CPUS_ONLINE to CPUS_PREPARE_UP however amd_nb_id() doesn't work yet on prepare so it would simply bail basically reverting to a state where we do not properly track node wide constraints - causing weird perf results. Fix up the AMD NorthBridge initialization code by allocating from CPU_UP_PREPARE and installing it from CPU_STARTING once we have the proper nb_id. It also properly deals with the allocation failing. Signed-off-by: Peter Zijlstra <[email protected]> [ robustify using amd_has_nb() ] Signed-off-by: Stephane Eranian <[email protected]> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02x86: Move notify_cpu_starting() callback to a later stagePeter Zijlstra1-2/+2
Because we need to have cpu identification things done by the time we run CPU_STARTING notifiers. ( This init ordering will be relied on by the next fix. ) Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <1269353485.5109.48.camel@twins> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-02Merge branch 'perf/urgent' of ↵Ingo Molnar2-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
2010-04-01ibft, x86: Change reserve_ibft_region() to find_ibft_region()Yinghai Lu1-2/+12
This allows arch code could decide the way to reserve the ibft. And we should reserve ibft as early as possible, instead of BOOTMEM stage, in case the table is in RAM range and is not reserved by BIOS (this will often be the case.) Move to just after find_smp_config(). Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore. -v2: fix typo about ibft pointed by Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> LKML-Reference: <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Jones <[email protected]> Cc: Konrad Rzeszutek Wilk <[email protected]> CC: Jan Beulich <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-04-01x86, hpet: Fix bug in RTC emulationAlok Kataria1-0/+1
We think there exists a bug in the HPET code that emulates the RTC. In the normal case, when the RTC frequency is set, the rtc driver tells the hpet code about it here: int hpet_set_periodic_freq(unsigned long freq) { uint64_t clc; if (!is_hpet_enabled()) return 0; if (freq <= DEFAULT_RTC_INT_FREQ) hpet_pie_limit = DEFAULT_RTC_INT_FREQ / freq; else { clc = (uint64_t) hpet_clockevent.mult * NSEC_PER_SEC; do_div(clc, freq); clc >>= hpet_clockevent.shift; hpet_pie_delta = (unsigned long) clc; } return 1; } If freq is set to 64Hz (DEFAULT_RTC_INT_FREQ) or lower, then hpet_pie_limit (a static) is set to non-zero. Then, on every one-shot HPET interrupt, hpet_rtc_timer_reinit is called to compute the next timeout. Well, that function has this logic: if (!(hpet_rtc_flags & RTC_PIE) || hpet_pie_limit) delta = hpet_default_delta; else delta = hpet_pie_delta; Since hpet_pie_limit is not 0, hpet_default_delta is used. That corresponds to 64Hz. Now, if you set a different rtc frequency, you'll take the else path through hpet_set_periodic_freq, but unfortunately no one resets hpet_pie_limit back to 0. Boom....now you are stuck with 64Hz RTC interrupts forever. The patch below just resets the hpet_pie_limit value when requested freq is greater than DEFAULT_RTC_INT_FREQ, which we think fixes this problem. Signed-off-by: Alok N Kataria <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Daniel Hecht <[email protected]> Cc: Venkatesh Pallipadi <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-04-01x86, hpet: Erratum workaround for read after write of HPET comparatorPallipadi, Venkatesh1-1/+7
On Wed, Feb 24, 2010 at 03:37:04PM -0800, Justin Piszcz wrote: > Hello, > > Again, on the Intel DP55KG board: > > # uname -a > Linux host 2.6.33 #1 SMP Wed Feb 24 18:31:00 EST 2010 x86_64 GNU/Linux > > [ 1.237600] ------------[ cut here ]------------ > [ 1.237890] WARNING: at arch/x86/kernel/hpet.c:404 hpet_next_event+0x70/0x80() > [ 1.238221] Hardware name: > [ 1.238504] hpet: compare register read back failed. > [ 1.238793] Modules linked in: > [ 1.239315] Pid: 0, comm: swapper Not tainted 2.6.33 #1 > [ 1.239605] Call Trace: > [ 1.239886] <IRQ> [<ffffffff81056c13>] ? warn_slowpath_common+0x73/0xb0 > [ 1.240409] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.240699] [<ffffffff81056cb0>] ? warn_slowpath_fmt+0x40/0x50 > [ 1.240992] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241281] [<ffffffff81041ad0>] ? hpet_next_event+0x70/0x80 > [ 1.241573] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241859] [<ffffffff81078e32>] ? tick_handle_oneshot_broadcast+0xe2/0x100 > [ 1.246533] [<ffffffff8102a67a>] ? timer_interrupt+0x1a/0x30 > [ 1.246826] [<ffffffff81085499>] ? handle_IRQ_event+0x39/0xd0 > [ 1.247118] [<ffffffff81087368>] ? handle_edge_irq+0xb8/0x160 > [ 1.247407] [<ffffffff81029f55>] ? handle_irq+0x15/0x20 > [ 1.247689] [<ffffffff810294a2>] ? do_IRQ+0x62/0xe0 > [ 1.247976] [<ffffffff8146be53>] ? ret_from_intr+0x0/0xa > [ 1.248262] <EOI> [<ffffffff8102f277>] ? mwait_idle+0x57/0x80 > [ 1.248796] [<ffffffff8102645c>] ? cpu_idle+0x5c/0xb0 > [ 1.249080] ---[ end trace db7f668fb6fef4e1 ]--- > > Is this something Intel has to fix or is it a bug in the kernel? This is a chipset erratum. Thomas: You mentioned we can retain this check only for known-buggy and hpet debug kind of options. But here is the simple workaround patch for this particular erratum. Some chipsets have a erratum due to which read immediately following a write of HPET comparator returns old comparator value instead of most recently written value. Erratum 15 in "Intel I/O Controller Hub 9 (ICH9) Family Specification Update" (http://www.intel.com/assets/pdf/specupdate/316973.pdf) Workaround for the errata is to read the comparator twice if the first one fails. Signed-off-by: Venkatesh Pallipadi <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]> Cc: Venkatesh Pallipadi <[email protected]> Cc: <[email protected]>
2010-04-01x86: Handle overlapping mptablesAndi Kleen1-2/+2
We found a system where the MP table MPC and MPF structures overlap. That doesn't really matter because the mptable is not used anyways with ACPI, but it leads to a panic in the early allocator due to the overlapping reservations in 2.6.33. Earlier kernels handled this without problems. Simply change these reservations to reserve_early_overlap_ok to avoid the panic. Reported-by: Thomas Renninger <[email protected]> Tested-by: Thomas Renninger <[email protected]> Signed-off-by: Andi Kleen <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]> Cc: <[email protected]>
2010-04-01x86,kgdb: Always initialize the hw breakpoint attributeJason Wessel1-1/+1
It is required to call hw_breakpoint_init() on an attr before using it in any other calls. This fixes the problem where kgdb will sometimes fail to initialize on x86_64. Signed-off-by: Jason Wessel <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: 2.6.33 <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]>
2010-04-01perf: Use hot regs with software sched switch/migrate eventsFrederic Weisbecker1-2/+0
Scheduler's task migration events don't work because they always pass NULL regs perf_sw_event(). The event hence gets filtered in perf_swevent_add(). Scheduler's context switches events use task_pt_regs() to get the context when the event occured which is a wrong thing to do as this won't give us the place in the kernel where we went to sleep but the place where we left userspace. The result is even more wrong if we switch from a kernel thread. Use the hot regs snapshot for both events as they belong to the non-interrupt/exception based events family. Unlike page faults or so that provide the regs matching the exact origin of the event, we need to save the current context. This makes the task migration event working and fix the context switch callchains and origin ip. Example: perf record -a -e cs Before: 10.91% ksoftirqd/0 0 [k] 0000000000000000 | --- (nil) perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_swevent_ctx_event do_perf_sw_event __perf_sw_event perf_event_task_sched_out schedule run_ksoftirqd kthread kernel_thread_helper After: 23.77% hald-addon-stor [kernel.kallsyms] [k] schedule | --- schedule | |--60.00%-- schedule_timeout | wait_for_common | wait_for_completion | blk_execute_rq | scsi_execute | scsi_execute_req | sr_test_unit_ready | | | |--66.67%-- sr_media_change | | media_changed | | cdrom_media_changed | | sr_block_media_changed | | check_disk_change | | cdrom_open v2: Always build perf_arch_fetch_caller_regs() now that software events need that too. They don't need it from modules, unlike trace events, so we keep the EXPORT_SYMBOL in trace_event_perf.c Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: David Miller <[email protected]>
2010-03-31x86: Make e820_remove_range to handle all covered caseYinghai Lu1-4/+20
Rusty found on lguest with trim_bios_range, max_pfn is not right anymore, and looks e820_remove_range does not work right. [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] LGUEST: 0000000000000000 - 0000000004000000 (usable) [ 0.000000] Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS! [ 0.000000] DMI not present or invalid. [ 0.000000] last_pfn = 0x3fa0 max_arch_pfn = 0x100000 [ 0.000000] init_memory_mapping: 0000000000000000-0000000003fa0000 root cause is: the e820_remove_range doesn't handle the all covered case. e820_remove_range(BIOS_START, BIOS_END - BIOS_START, ...) produces a bogus range as a result. Make it match e820_update_range() by handling that case too. Reported-by: Rusty Russell <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> Tested-by: Rusty Russell <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo52-15/+40
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <[email protected]> Guess-its-ok-by: Christoph Lameter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Lee Schermerhorn <[email protected]>
2010-03-29x86: Make sure free_init_pages() frees pages on page boundaryYinghai Lu3-6/+11
When CONFIG_NO_BOOTMEM=y, it could use memory more effiently, or in a more compact fashion. Example: Allocated new RAMDISK: 00ec2000 - 0248ce57 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56 The new RAMDISK's end is not page aligned. Last page could be shared with other users. When free_init_pages are called for initrd or .init, the page could be freed and we could corrupt other data. code segment in free_init_pages(): | for (; addr < end; addr += PAGE_SIZE) { | ClearPageReserved(virt_to_page(addr)); | init_page_count(virt_to_page(addr)); | memset((void *)(addr & ~(PAGE_SIZE-1)), | POISON_FREE_INITMEM, PAGE_SIZE); | free_page(addr); | totalram_pages++; | } last half page could be used as one whole free page. So page align the boundaries. -v2: make the original initramdisk to be aligned, according to Johannes, otherwise we have the chance to lose one page. we still need to keep initrd_end not aligned, otherwise it could confuse decompressor. -v3: change to WARN_ON instead, suggested by Johannes. -v4: use PAGE_ALIGN, suggested by Johannes. We may fix that macro name later to PAGE_ALIGN_UP, and PAGE_ALIGN_DOWN Add comments about assuming ramdisk start is aligned in relocate_initrd(), change to re get ramdisk_image instead of save it to make diff smaller. Add warning for wrong range, suggested by Johannes. -v6: remove one WARN() We need to align beginning in free_init_pages() do not copy more than ramdisk_size, noticed by Johannes Reported-by: Stanislaw Gruszka <[email protected]> Tested-by: Stanislaw Gruszka <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: David Miller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Linus Torvalds <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-29x86: Make smp_locks end with page alignmentYinghai Lu1-1/+1
Fix: ------------[ cut here ]------------ WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa() free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace: [<40232e9f>] warn_slowpath_common+0x65/0x7c [<4021c9f0>] ? free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40232eea>] warn_slowpath_fmt+0x24/0x27 [<4021c9f0>] free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40d3f4bd>] alternative_instructions+0xf6/0x100 [<40d3fe4f>] check_bugs+0xbd/0xbf [<40d398a7>] start_kernel+0x2d5/0x2e4 [<40d390ce>] i386_start_kernel+0xce/0xd5 ---[ end trace 4eaa2a86a8e2da22 ]--- Comments in vmlinux.lds.S already said: | /* | * smp_locks might be freed after init | * start/end must be page aligned | */ Signed-off-by: Yinghai Lu <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: David Miller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Linus Torvalds <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds4-9/+55
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, amd: Restrict usage of c1e_idle() x86: Fix placement of FIX_OHCI1394_BASE x86: Handle legacy PIC interrupts on all the cpu's
2010-03-26perf, x86: Add Nehelem PMU programming errata workaroundPeter Zijlstra4-10/+45
Implement the workaround for Intel Errata AAK100 and AAP53. Also, remove the Core-i7 name for Nehalem events since there are also Westmere based i7 chips. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> LKML-Reference: <1269608924.12097.147.camel@laptop> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26x86, ptrace: Fix block-stepPeter Zijlstra4-6/+48
Implement ptrace-block-step using TIF_BLOCKSTEP which will set DEBUGCTLMSR_BTF when set for a task while preserving any other DEBUGCTLMSR bits. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26x86, perf, bts, mm: Delete the never used BTS-ptrace codePeter Zijlstra13-2317/+6
Support for the PMU's BTS features has been upstreamed in v2.6.32, but we still have the old and disabled ptrace-BTS, as Linus noticed it not so long ago. It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without regard for other uses (perf) and doesn't provide the flexibility needed for perf either. Its users are ptrace-block-step and ptrace-bts, since ptrace-bts was never used and ptrace-block-step can be implemented using a much simpler approach. So axe all 3000 lines of it. That includes the *locked_memory*() APIs in mm/mlock.c as well. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Markus Metzger <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Andrew Morton <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26perf, x86: Clean up debugctlmsr bit definitionsPeter Zijlstra2-21/+9
Move all debugctlmsr thingies into msr-index.h Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26x86, perf: Add raw events support for the P4 PMUCyrill Gorcunov1-322/+424
The adding of raw event support lead to complete code refactoring. I hope is became more readable then it was. The list of changes: 1) The 64bit config field is enough to hold all information we need to track event details. To achieve it we used *own* enum for events selection in ESCR register and map this key into proper value at moment of event enabling. For the same reason we use 12LSB bits in CCCR register -- to track which exactly cache trace event was requested. And we cear this bits at real 'write' moment. 2) There is no per-cpu area reserved for P4 PMU anymore. We don't need it. All is held by config. 3) Now we may use any available counter, ie we try to grab any possible counter. v2: - Lin Ming reported the lack of ESCR selector in CCCR for cache events v3: - Don't loose cache event codes at config unpacking procedure, we may need it one day so no obscure hack behind our back, better to clear reserved bits explicitly when needed (thanks Ming for pointing out) - Lin Ming fixed misplaced opcodes in cache events Signed-off-by: Cyrill Gorcunov <[email protected]> Tested-by: Lin Ming <[email protected]> Signed-off-by: Lin Ming <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Robert Richter <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> [ v4: did a few whitespace fixlets ] Signed-off-by: Ingo Molnar <[email protected]>
2010-03-22Merge commit 'v2.6.34-rc2' into perf/coreIngo Molnar19-314/+46
Merge reason: Pick up latest perf fixes from upstream. Signed-off-by: Ingo Molnar <[email protected]>
2010-03-22x86 / perf: Fix suspend to RAM on HP nx6325Rafael J. Wysocki1-3/+5
Commit 3f6da3905398826d85731247e7fbcf53400c18bd (perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to RAM on my HP nx6325 (and most likely on other AMD-based boxes too) by allowing amd_pmu_cpu_offline() to be executed for CPUs that are going offline as part of the suspend process. The problem is that cpuhw->amd_nb may be NULL already, so the function should make sure it's not NULL before accessing the object pointed to by it. Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-03-19x86, amd: Restrict usage of c1e_idle()Andreas Herrmann1-8/+24
Currently c1e_idle returns true for all CPUs greater than or equal to family 0xf model 0x40. This covers too many CPUs. Meanwhile a respective erratum for the underlying problem was filed (#400). This patch adds the logic to check whether erratum #400 applies to a given CPU. Especially for CPUs where SMI/HW triggered C1e is not supported, c1e_idle() doesn't need to be used. We can check this by looking at the respective OSVW bit for erratum #400. Cc: <[email protected]> # .32.x .33.x Signed-off-by: Andreas Herrmann <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-03-19x86, tboot: Add support for S3 memory integrity protectionShane Wang1-9/+11
This patch adds support for S3 memory integrity protection within an Intel(R) TXT launched kernel, for all kernel and userspace memory. All RAM used by the kernel and userspace, as indicated by memory ranges of type E820_RAM and E820_RESERVED_KERN in the e820 table, will be integrity protected. The MAINTAINERS file is also updated to reflect the maintainers of the TXT-related code. All MACing is done in tboot, based on a complexity analysis and tradeoff. v3: Compared with v2, this patch adds a check of array size in tboot.c, and a note to specify which c/s of tboot supports this kind of MACing in intel_txt.txt. Signed-off-by: Shane Wang <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Joseph Cihula <[email protected]> Acked-by: Pavel Machek <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-03-18Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds6-155/+176
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits) perf: Fix unexported generic perf_arch_fetch_caller_regs perf record: Don't try to find buildids in a zero sized file perf: export perf_trace_regs and perf_arch_fetch_caller_regs perf, x86: Fix hw_perf_enable() event assignment perf, ppc: Fix compile error due to new cpu notifiers perf: Make the install relative to DESTDIR if specified kprobes: Calculate the index correctly when freeing the out-of-line execution slot perf tools: Fix sparse CPU numbering related bugs perf_event: Fix oops triggered by cpu offline/online perf: Drop the obsolete profile naming for trace events perf: Take a hot regs snapshot for trace events perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot perf/x86-64: Use frame pointer to walk on irq and process stacks lockdep: Move lock events under lockdep recursion protection perf report: Print the map table just after samples for which no map was found perf report: Add multiple event support perf session: Change perf_session post processing functions to take histogram tree perf session: Add storage for seperating event types in report perf session: Change add_hist_entry to take the tree root instead of session perf record: Add ID and to recorded event data when recording multiple events ...
2010-03-18x86, perf: Fix few cosmetic dabs for P4 pmu (comments and constantify)Cyrill Gorcunov1-1/+1
- A few ESCR have escaped fixing at previous attempt. - p4_escr_map is read only, make it const. Nothing serious. Signed-off-by: Cyrill Gorcunov <[email protected]> Cc: Lin Ming <[email protected]> LKML-Reference: <20100318211256.GH5062@lenovo> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18perf_events: Fix resource leak in x86 __hw_perf_event_init()Stephane Eranian1-1/+4
If reserve_pmc_hardware() succeeds but reserve_ds_buffers() fails, then we need to release_pmc_hardware. It won't be done by the destroy() callback because we return before setting it in case of error. Signed-off-by: Stephane Eranian <[email protected]> Cc: <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> -- arch/x86/kernel/cpu/perf_event.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
2010-03-18perf, x86: Add cache events for the Pentium-4 PMULin Ming1-6/+147
Move the HT bit setting code from p4_pmu_event_map to p4_hw_config. So the cache events can get HT bit set correctly. Tested on my P4 desktop, below 6 cache events work: L1-dcache-load-misses LLC-load-misses dTLB-load-misses dTLB-store-misses iTLB-loads iTLB-load-misses Signed-off-by: Lin Ming <[email protected]> Reviewed-by: Cyrill Gorcunov <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18perf, x86: Add a key to simplify template lookup in Pentium-4 PMULin Ming1-52/+34
Currently, we use opcode(Event and Event-Selector) + emask to look up template in p4_templates. But cache events (L1-dcache-load-misses, LLC-load-misses, etc) use the same event(P4_REPLAY_EVENT) to do the counting, ie, they have the same opcode and emask. So we can not use current lookup mechanism to find the template for cache events. This patch introduces a "key", which is the index into p4_templates. The low 12 bits of CCCR are reserved, so we can hide the "key" in the low 12 bits of hwc->config. We extract the key from hwc->config and then quickly find the template. Signed-off-by: Lin Ming <[email protected]> Reviewed-by: Cyrill Gorcunov <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18x86, perf: Use apic_write unconditionallyCyrill Gorcunov2-6/+0
Since apic_write() maps to a plain noop in the !CONFIG_X86_LOCAL_APIC case we're safe to remove this conditional compilation and clean up the code a bit. Signed-off-by: Cyrill Gorcunov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17x86, UV: Delete unneeded boot messagesJack Steiner1-3/+0
SGI:UV: Delete extra boot messages that describe the system topology. These messages are no longer useful. Signed-off-by: Jack Steiner <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf/core, x86: Remove duplicate perf_event_mask variableRobert Richter1-6/+3
The same information is stored also in x86_pmu.intel_ctrl. This patch removes perf_event_mask and instead uses x86_pmu.intel_ctrl directly. Signed-off-by: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf/core, x86: Remove cpu_hw_events.interruptsRobert Richter1-1/+0
This member in the struct is not used anymore and can be removed. Signed-off-by: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf/core, x86: Reduce number of CONFIG_X86_LOCAL_APIC macrosRobert Richter1-6/+9
The function reserve_pmc_hardware() and release_pmc_hardware() were hard to read. This patch improves readability of the code by removing most of the CONFIG_X86_LOCAL_APIC macros. Signed-off-by: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf: Fix unexported generic perf_arch_fetch_caller_regsFrederic Weisbecker1-1/+2
perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf, x86: Report error code that returned from x86_pmu.hw_config()Robert Richter1-2/+3
If x86_pmu.hw_config() fails a fixed error code (-EOPNOTSUPP) is returned even if a different error was reported. This patch fixes this. Signed-off-by: Robert Richter <[email protected]> Acked-by: Cyrill Gorcunov <[email protected]> Acked-by: Lin Ming <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-16perf: Fix unexported generic perf_arch_fetch_caller_regsFrederic Weisbecker1-1/+2
perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-16x86: Handle legacy PIC interrupts on all the cpu'sSuresh Siddha3-1/+31
Ingo Molnar reported that with the recent changes of not statically blocking IRQ0_VECTOR..IRQ15_VECTOR's on all the cpu's, broke an AMD platform (with Nvidia chipset) boot when "noapic" boot option is used. On this platform, legacy PIC interrupts are getting delivered to all the cpu's instead of just the boot cpu. Thus not initializing the vector to irq mapping for the legacy irq's resulted in not handling certain interrupts causing boot hang. Fix this by initializing the vector to irq mapping on all the logical cpu's, if the legacy IRQ is handled by the legacy PIC. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Suresh Siddha <[email protected]> [ -v2: io-apic-enabled improvement ] Acked-by: Yinghai Lu <[email protected]> Cc: Eric W. Biederman <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-15perf, x86: Enable not tagged retired instruction counting on P4sCyrill Gorcunov1-5/+3
This should turn on instruction counting on P4s, which was missing in the first version of the new PMU driver. It's inaccurate for now, we still need dependant event to tag mops before we can count them precisely. The result is that the number of instruction may be lifted up. Signed-off-by: Cyrill Gorcunov <[email protected]> Signed-off-by: Lin Ming <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-14Merge branches 'battery-2.6.34', 'bugzilla-10805', 'bugzilla-14668', ↵Len Brown97-3797/+5743
'bugzilla-531916-power-state', 'ht-warn-2.6.34', 'pnp', 'processor-rename', 'sony-2.6.34', 'suse-bugzilla-531547', 'tz-check', 'video' and 'misc-2.6.34' into release
2010-03-14ACPI: processor: driver doesn't need to evaluate _PDCAlex Chiang1-0/+3
Now that the early _PDC evaluation path knows how to correctly evaluate _PDC on only physically present processors, there's no need for the processor driver to evaluate it later when it loads. To cover the hotplug case, push _PDC evaluation down into the hotplug paths. Cc: [email protected] Cc: Tony Luck <[email protected]> Acked-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Alex Chiang <[email protected]> Signed-off-by: Len Brown <[email protected]>
2010-03-14ACPI: delete the "acpi=ht" boot optionLen Brown1-16/+3
acpi=ht was important in 2003 -- before ACPI was universally deployed and enabled by default in the major Linux distributions. At that time, there were a fair number of people who or chose to, or needed to, run with acpi=off, yet also wanted access to Hyper-threading. Today we find that many invocations of "acpi=ht" are accidental, and thus is it possible that it is doing more harm than good. In 2.6.34, we warn on invocation of acpi=ht. In 2.6.35, we delete the boot option. Signed-off-by: Len Brown <[email protected]>
2010-03-14ACPI: plan to delete "acpi=ht" boot optionLen Brown1-1/+3
Signed-off-by: Len Brown <[email protected]>