aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel
AgeCommit message (Collapse)AuthorFilesLines
2010-04-01x86, hpet: Fix bug in RTC emulationAlok Kataria1-0/+1
We think there exists a bug in the HPET code that emulates the RTC. In the normal case, when the RTC frequency is set, the rtc driver tells the hpet code about it here: int hpet_set_periodic_freq(unsigned long freq) { uint64_t clc; if (!is_hpet_enabled()) return 0; if (freq <= DEFAULT_RTC_INT_FREQ) hpet_pie_limit = DEFAULT_RTC_INT_FREQ / freq; else { clc = (uint64_t) hpet_clockevent.mult * NSEC_PER_SEC; do_div(clc, freq); clc >>= hpet_clockevent.shift; hpet_pie_delta = (unsigned long) clc; } return 1; } If freq is set to 64Hz (DEFAULT_RTC_INT_FREQ) or lower, then hpet_pie_limit (a static) is set to non-zero. Then, on every one-shot HPET interrupt, hpet_rtc_timer_reinit is called to compute the next timeout. Well, that function has this logic: if (!(hpet_rtc_flags & RTC_PIE) || hpet_pie_limit) delta = hpet_default_delta; else delta = hpet_pie_delta; Since hpet_pie_limit is not 0, hpet_default_delta is used. That corresponds to 64Hz. Now, if you set a different rtc frequency, you'll take the else path through hpet_set_periodic_freq, but unfortunately no one resets hpet_pie_limit back to 0. Boom....now you are stuck with 64Hz RTC interrupts forever. The patch below just resets the hpet_pie_limit value when requested freq is greater than DEFAULT_RTC_INT_FREQ, which we think fixes this problem. Signed-off-by: Alok N Kataria <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Daniel Hecht <[email protected]> Cc: Venkatesh Pallipadi <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-04-01x86, hpet: Erratum workaround for read after write of HPET comparatorPallipadi, Venkatesh1-1/+7
On Wed, Feb 24, 2010 at 03:37:04PM -0800, Justin Piszcz wrote: > Hello, > > Again, on the Intel DP55KG board: > > # uname -a > Linux host 2.6.33 #1 SMP Wed Feb 24 18:31:00 EST 2010 x86_64 GNU/Linux > > [ 1.237600] ------------[ cut here ]------------ > [ 1.237890] WARNING: at arch/x86/kernel/hpet.c:404 hpet_next_event+0x70/0x80() > [ 1.238221] Hardware name: > [ 1.238504] hpet: compare register read back failed. > [ 1.238793] Modules linked in: > [ 1.239315] Pid: 0, comm: swapper Not tainted 2.6.33 #1 > [ 1.239605] Call Trace: > [ 1.239886] <IRQ> [<ffffffff81056c13>] ? warn_slowpath_common+0x73/0xb0 > [ 1.240409] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.240699] [<ffffffff81056cb0>] ? warn_slowpath_fmt+0x40/0x50 > [ 1.240992] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241281] [<ffffffff81041ad0>] ? hpet_next_event+0x70/0x80 > [ 1.241573] [<ffffffff81079608>] ? tick_dev_program_event+0x38/0xc0 > [ 1.241859] [<ffffffff81078e32>] ? tick_handle_oneshot_broadcast+0xe2/0x100 > [ 1.246533] [<ffffffff8102a67a>] ? timer_interrupt+0x1a/0x30 > [ 1.246826] [<ffffffff81085499>] ? handle_IRQ_event+0x39/0xd0 > [ 1.247118] [<ffffffff81087368>] ? handle_edge_irq+0xb8/0x160 > [ 1.247407] [<ffffffff81029f55>] ? handle_irq+0x15/0x20 > [ 1.247689] [<ffffffff810294a2>] ? do_IRQ+0x62/0xe0 > [ 1.247976] [<ffffffff8146be53>] ? ret_from_intr+0x0/0xa > [ 1.248262] <EOI> [<ffffffff8102f277>] ? mwait_idle+0x57/0x80 > [ 1.248796] [<ffffffff8102645c>] ? cpu_idle+0x5c/0xb0 > [ 1.249080] ---[ end trace db7f668fb6fef4e1 ]--- > > Is this something Intel has to fix or is it a bug in the kernel? This is a chipset erratum. Thomas: You mentioned we can retain this check only for known-buggy and hpet debug kind of options. But here is the simple workaround patch for this particular erratum. Some chipsets have a erratum due to which read immediately following a write of HPET comparator returns old comparator value instead of most recently written value. Erratum 15 in "Intel I/O Controller Hub 9 (ICH9) Family Specification Update" (http://www.intel.com/assets/pdf/specupdate/316973.pdf) Workaround for the errata is to read the comparator twice if the first one fails. Signed-off-by: Venkatesh Pallipadi <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]> Cc: Venkatesh Pallipadi <[email protected]> Cc: <[email protected]>
2010-04-01x86: Handle overlapping mptablesAndi Kleen1-2/+2
We found a system where the MP table MPC and MPF structures overlap. That doesn't really matter because the mptable is not used anyways with ACPI, but it leads to a panic in the early allocator due to the overlapping reservations in 2.6.33. Earlier kernels handled this without problems. Simply change these reservations to reserve_early_overlap_ok to avoid the panic. Reported-by: Thomas Renninger <[email protected]> Tested-by: Thomas Renninger <[email protected]> Signed-off-by: Andi Kleen <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]> Cc: <[email protected]>
2010-04-01x86,kgdb: Always initialize the hw breakpoint attributeJason Wessel1-1/+1
It is required to call hw_breakpoint_init() on an attr before using it in any other calls. This fixes the problem where kgdb will sometimes fail to initialize on x86_64. Signed-off-by: Jason Wessel <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: 2.6.33 <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]>
2010-04-01perf: Use hot regs with software sched switch/migrate eventsFrederic Weisbecker1-2/+0
Scheduler's task migration events don't work because they always pass NULL regs perf_sw_event(). The event hence gets filtered in perf_swevent_add(). Scheduler's context switches events use task_pt_regs() to get the context when the event occured which is a wrong thing to do as this won't give us the place in the kernel where we went to sleep but the place where we left userspace. The result is even more wrong if we switch from a kernel thread. Use the hot regs snapshot for both events as they belong to the non-interrupt/exception based events family. Unlike page faults or so that provide the regs matching the exact origin of the event, we need to save the current context. This makes the task migration event working and fix the context switch callchains and origin ip. Example: perf record -a -e cs Before: 10.91% ksoftirqd/0 0 [k] 0000000000000000 | --- (nil) perf_callchain perf_prepare_sample __perf_event_overflow perf_swevent_overflow perf_swevent_add perf_swevent_ctx_event do_perf_sw_event __perf_sw_event perf_event_task_sched_out schedule run_ksoftirqd kthread kernel_thread_helper After: 23.77% hald-addon-stor [kernel.kallsyms] [k] schedule | --- schedule | |--60.00%-- schedule_timeout | wait_for_common | wait_for_completion | blk_execute_rq | scsi_execute | scsi_execute_req | sr_test_unit_ready | | | |--66.67%-- sr_media_change | | media_changed | | cdrom_media_changed | | sr_block_media_changed | | check_disk_change | | cdrom_open v2: Always build perf_arch_fetch_caller_regs() now that software events need that too. They don't need it from modules, unlike trace events, so we keep the EXPORT_SYMBOL in trace_event_perf.c Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: David Miller <[email protected]>
2010-03-31x86: Make e820_remove_range to handle all covered caseYinghai Lu1-4/+20
Rusty found on lguest with trim_bios_range, max_pfn is not right anymore, and looks e820_remove_range does not work right. [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] LGUEST: 0000000000000000 - 0000000004000000 (usable) [ 0.000000] Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS! [ 0.000000] DMI not present or invalid. [ 0.000000] last_pfn = 0x3fa0 max_arch_pfn = 0x100000 [ 0.000000] init_memory_mapping: 0000000000000000-0000000003fa0000 root cause is: the e820_remove_range doesn't handle the all covered case. e820_remove_range(BIOS_START, BIOS_END - BIOS_START, ...) produces a bogus range as a result. Make it match e820_update_range() by handling that case too. Reported-by: Rusty Russell <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> Tested-by: Rusty Russell <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-03-30include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵Tejun Heo52-15/+40
implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_*.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). * x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <[email protected]> Guess-its-ok-by: Christoph Lameter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Lee Schermerhorn <[email protected]>
2010-03-29x86: Make sure free_init_pages() frees pages on page boundaryYinghai Lu3-6/+11
When CONFIG_NO_BOOTMEM=y, it could use memory more effiently, or in a more compact fashion. Example: Allocated new RAMDISK: 00ec2000 - 0248ce57 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56 The new RAMDISK's end is not page aligned. Last page could be shared with other users. When free_init_pages are called for initrd or .init, the page could be freed and we could corrupt other data. code segment in free_init_pages(): | for (; addr < end; addr += PAGE_SIZE) { | ClearPageReserved(virt_to_page(addr)); | init_page_count(virt_to_page(addr)); | memset((void *)(addr & ~(PAGE_SIZE-1)), | POISON_FREE_INITMEM, PAGE_SIZE); | free_page(addr); | totalram_pages++; | } last half page could be used as one whole free page. So page align the boundaries. -v2: make the original initramdisk to be aligned, according to Johannes, otherwise we have the chance to lose one page. we still need to keep initrd_end not aligned, otherwise it could confuse decompressor. -v3: change to WARN_ON instead, suggested by Johannes. -v4: use PAGE_ALIGN, suggested by Johannes. We may fix that macro name later to PAGE_ALIGN_UP, and PAGE_ALIGN_DOWN Add comments about assuming ramdisk start is aligned in relocate_initrd(), change to re get ramdisk_image instead of save it to make diff smaller. Add warning for wrong range, suggested by Johannes. -v6: remove one WARN() We need to align beginning in free_init_pages() do not copy more than ramdisk_size, noticed by Johannes Reported-by: Stanislaw Gruszka <[email protected]> Tested-by: Stanislaw Gruszka <[email protected]> Signed-off-by: Yinghai Lu <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: David Miller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Linus Torvalds <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-29x86: Make smp_locks end with page alignmentYinghai Lu1-1/+1
Fix: ------------[ cut here ]------------ WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa() free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace: [<40232e9f>] warn_slowpath_common+0x65/0x7c [<4021c9f0>] ? free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40232eea>] warn_slowpath_fmt+0x24/0x27 [<4021c9f0>] free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40d3f4bd>] alternative_instructions+0xf6/0x100 [<40d3fe4f>] check_bugs+0xbd/0xbf [<40d398a7>] start_kernel+0x2d5/0x2e4 [<40d390ce>] i386_start_kernel+0xce/0xd5 ---[ end trace 4eaa2a86a8e2da22 ]--- Comments in vmlinux.lds.S already said: | /* | * smp_locks might be freed after init | * start/end must be page aligned | */ Signed-off-by: Yinghai Lu <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: David Miller <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Linus Torvalds <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds4-9/+55
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, amd: Restrict usage of c1e_idle() x86: Fix placement of FIX_OHCI1394_BASE x86: Handle legacy PIC interrupts on all the cpu's
2010-03-26perf, x86: Add Nehelem PMU programming errata workaroundPeter Zijlstra4-10/+45
Implement the workaround for Intel Errata AAK100 and AAP53. Also, remove the Core-i7 name for Nehalem events since there are also Westmere based i7 chips. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> LKML-Reference: <1269608924.12097.147.camel@laptop> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26x86, ptrace: Fix block-stepPeter Zijlstra4-6/+48
Implement ptrace-block-step using TIF_BLOCKSTEP which will set DEBUGCTLMSR_BTF when set for a task while preserving any other DEBUGCTLMSR bits. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26x86, perf, bts, mm: Delete the never used BTS-ptrace codePeter Zijlstra13-2317/+6
Support for the PMU's BTS features has been upstreamed in v2.6.32, but we still have the old and disabled ptrace-BTS, as Linus noticed it not so long ago. It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without regard for other uses (perf) and doesn't provide the flexibility needed for perf either. Its users are ptrace-block-step and ptrace-bts, since ptrace-bts was never used and ptrace-block-step can be implemented using a much simpler approach. So axe all 3000 lines of it. That includes the *locked_memory*() APIs in mm/mlock.c as well. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Markus Metzger <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Andrew Morton <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26perf, x86: Clean up debugctlmsr bit definitionsPeter Zijlstra2-21/+9
Move all debugctlmsr thingies into msr-index.h Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-26x86, perf: Add raw events support for the P4 PMUCyrill Gorcunov1-322/+424
The adding of raw event support lead to complete code refactoring. I hope is became more readable then it was. The list of changes: 1) The 64bit config field is enough to hold all information we need to track event details. To achieve it we used *own* enum for events selection in ESCR register and map this key into proper value at moment of event enabling. For the same reason we use 12LSB bits in CCCR register -- to track which exactly cache trace event was requested. And we cear this bits at real 'write' moment. 2) There is no per-cpu area reserved for P4 PMU anymore. We don't need it. All is held by config. 3) Now we may use any available counter, ie we try to grab any possible counter. v2: - Lin Ming reported the lack of ESCR selector in CCCR for cache events v3: - Don't loose cache event codes at config unpacking procedure, we may need it one day so no obscure hack behind our back, better to clear reserved bits explicitly when needed (thanks Ming for pointing out) - Lin Ming fixed misplaced opcodes in cache events Signed-off-by: Cyrill Gorcunov <[email protected]> Tested-by: Lin Ming <[email protected]> Signed-off-by: Lin Ming <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Robert Richter <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> [ v4: did a few whitespace fixlets ] Signed-off-by: Ingo Molnar <[email protected]>
2010-03-22Merge commit 'v2.6.34-rc2' into perf/coreIngo Molnar19-314/+46
Merge reason: Pick up latest perf fixes from upstream. Signed-off-by: Ingo Molnar <[email protected]>
2010-03-22x86 / perf: Fix suspend to RAM on HP nx6325Rafael J. Wysocki1-3/+5
Commit 3f6da3905398826d85731247e7fbcf53400c18bd (perf: Rework and fix the arch CPU-hotplug hooks) broke suspend to RAM on my HP nx6325 (and most likely on other AMD-based boxes too) by allowing amd_pmu_cpu_offline() to be executed for CPUs that are going offline as part of the suspend process. The problem is that cpuhw->amd_nb may be NULL already, so the function should make sure it's not NULL before accessing the object pointed to by it. Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-03-19x86, amd: Restrict usage of c1e_idle()Andreas Herrmann1-8/+24
Currently c1e_idle returns true for all CPUs greater than or equal to family 0xf model 0x40. This covers too many CPUs. Meanwhile a respective erratum for the underlying problem was filed (#400). This patch adds the logic to check whether erratum #400 applies to a given CPU. Especially for CPUs where SMI/HW triggered C1e is not supported, c1e_idle() doesn't need to be used. We can check this by looking at the respective OSVW bit for erratum #400. Cc: <[email protected]> # .32.x .33.x Signed-off-by: Andreas Herrmann <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-03-19x86, tboot: Add support for S3 memory integrity protectionShane Wang1-9/+11
This patch adds support for S3 memory integrity protection within an Intel(R) TXT launched kernel, for all kernel and userspace memory. All RAM used by the kernel and userspace, as indicated by memory ranges of type E820_RAM and E820_RESERVED_KERN in the e820 table, will be integrity protected. The MAINTAINERS file is also updated to reflect the maintainers of the TXT-related code. All MACing is done in tboot, based on a complexity analysis and tradeoff. v3: Compared with v2, this patch adds a check of array size in tboot.c, and a note to specify which c/s of tboot supports this kind of MACing in intel_txt.txt. Signed-off-by: Shane Wang <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Joseph Cihula <[email protected]> Acked-by: Pavel Machek <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2010-03-18Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds6-155/+176
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits) perf: Fix unexported generic perf_arch_fetch_caller_regs perf record: Don't try to find buildids in a zero sized file perf: export perf_trace_regs and perf_arch_fetch_caller_regs perf, x86: Fix hw_perf_enable() event assignment perf, ppc: Fix compile error due to new cpu notifiers perf: Make the install relative to DESTDIR if specified kprobes: Calculate the index correctly when freeing the out-of-line execution slot perf tools: Fix sparse CPU numbering related bugs perf_event: Fix oops triggered by cpu offline/online perf: Drop the obsolete profile naming for trace events perf: Take a hot regs snapshot for trace events perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot perf/x86-64: Use frame pointer to walk on irq and process stacks lockdep: Move lock events under lockdep recursion protection perf report: Print the map table just after samples for which no map was found perf report: Add multiple event support perf session: Change perf_session post processing functions to take histogram tree perf session: Add storage for seperating event types in report perf session: Change add_hist_entry to take the tree root instead of session perf record: Add ID and to recorded event data when recording multiple events ...
2010-03-18x86, perf: Fix few cosmetic dabs for P4 pmu (comments and constantify)Cyrill Gorcunov1-1/+1
- A few ESCR have escaped fixing at previous attempt. - p4_escr_map is read only, make it const. Nothing serious. Signed-off-by: Cyrill Gorcunov <[email protected]> Cc: Lin Ming <[email protected]> LKML-Reference: <20100318211256.GH5062@lenovo> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18perf_events: Fix resource leak in x86 __hw_perf_event_init()Stephane Eranian1-1/+4
If reserve_pmc_hardware() succeeds but reserve_ds_buffers() fails, then we need to release_pmc_hardware. It won't be done by the destroy() callback because we return before setting it in case of error. Signed-off-by: Stephane Eranian <[email protected]> Cc: <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> -- arch/x86/kernel/cpu/perf_event.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
2010-03-18perf, x86: Add cache events for the Pentium-4 PMULin Ming1-6/+147
Move the HT bit setting code from p4_pmu_event_map to p4_hw_config. So the cache events can get HT bit set correctly. Tested on my P4 desktop, below 6 cache events work: L1-dcache-load-misses LLC-load-misses dTLB-load-misses dTLB-store-misses iTLB-loads iTLB-load-misses Signed-off-by: Lin Ming <[email protected]> Reviewed-by: Cyrill Gorcunov <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18perf, x86: Add a key to simplify template lookup in Pentium-4 PMULin Ming1-52/+34
Currently, we use opcode(Event and Event-Selector) + emask to look up template in p4_templates. But cache events (L1-dcache-load-misses, LLC-load-misses, etc) use the same event(P4_REPLAY_EVENT) to do the counting, ie, they have the same opcode and emask. So we can not use current lookup mechanism to find the template for cache events. This patch introduces a "key", which is the index into p4_templates. The low 12 bits of CCCR are reserved, so we can hide the "key" in the low 12 bits of hwc->config. We extract the key from hwc->config and then quickly find the template. Signed-off-by: Lin Ming <[email protected]> Reviewed-by: Cyrill Gorcunov <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18x86, perf: Use apic_write unconditionallyCyrill Gorcunov2-6/+0
Since apic_write() maps to a plain noop in the !CONFIG_X86_LOCAL_APIC case we're safe to remove this conditional compilation and clean up the code a bit. Signed-off-by: Cyrill Gorcunov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17x86, UV: Delete unneeded boot messagesJack Steiner1-3/+0
SGI:UV: Delete extra boot messages that describe the system topology. These messages are no longer useful. Signed-off-by: Jack Steiner <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf/core, x86: Remove duplicate perf_event_mask variableRobert Richter1-6/+3
The same information is stored also in x86_pmu.intel_ctrl. This patch removes perf_event_mask and instead uses x86_pmu.intel_ctrl directly. Signed-off-by: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf/core, x86: Remove cpu_hw_events.interruptsRobert Richter1-1/+0
This member in the struct is not used anymore and can be removed. Signed-off-by: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf/core, x86: Reduce number of CONFIG_X86_LOCAL_APIC macrosRobert Richter1-6/+9
The function reserve_pmc_hardware() and release_pmc_hardware() were hard to read. This patch improves readability of the code by removing most of the CONFIG_X86_LOCAL_APIC macros. Signed-off-by: Robert Richter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf: Fix unexported generic perf_arch_fetch_caller_regsFrederic Weisbecker1-1/+2
perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-17perf, x86: Report error code that returned from x86_pmu.hw_config()Robert Richter1-2/+3
If x86_pmu.hw_config() fails a fixed error code (-EOPNOTSUPP) is returned even if a different error was reported. This patch fixes this. Signed-off-by: Robert Richter <[email protected]> Acked-by: Cyrill Gorcunov <[email protected]> Acked-by: Lin Ming <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-16perf: Fix unexported generic perf_arch_fetch_caller_regsFrederic Weisbecker1-1/+2
perf_arch_fetch_caller_regs() is exported for the overriden x86 version, but not for the generic weak version. As a general rule, weak functions should not have their symbol exported in the same file they are defined. So let's export it on trace_event_perf.c as it is used by trace events only. This fixes: ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined! ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined! -v2: And also only build it if trace events are enabled. -v3: Fix changelog mistake Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-16x86: Handle legacy PIC interrupts on all the cpu'sSuresh Siddha3-1/+31
Ingo Molnar reported that with the recent changes of not statically blocking IRQ0_VECTOR..IRQ15_VECTOR's on all the cpu's, broke an AMD platform (with Nvidia chipset) boot when "noapic" boot option is used. On this platform, legacy PIC interrupts are getting delivered to all the cpu's instead of just the boot cpu. Thus not initializing the vector to irq mapping for the legacy irq's resulted in not handling certain interrupts causing boot hang. Fix this by initializing the vector to irq mapping on all the logical cpu's, if the legacy IRQ is handled by the legacy PIC. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Suresh Siddha <[email protected]> [ -v2: io-apic-enabled improvement ] Acked-by: Yinghai Lu <[email protected]> Cc: Eric W. Biederman <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-15perf, x86: Enable not tagged retired instruction counting on P4sCyrill Gorcunov1-5/+3
This should turn on instruction counting on P4s, which was missing in the first version of the new PMU driver. It's inaccurate for now, we still need dependant event to tag mops before we can count them precisely. The result is that the number of instruction may be lifted up. Signed-off-by: Cyrill Gorcunov <[email protected]> Signed-off-by: Lin Ming <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-14Merge branches 'battery-2.6.34', 'bugzilla-10805', 'bugzilla-14668', ↵Len Brown97-3797/+5743
'bugzilla-531916-power-state', 'ht-warn-2.6.34', 'pnp', 'processor-rename', 'sony-2.6.34', 'suse-bugzilla-531547', 'tz-check', 'video' and 'misc-2.6.34' into release
2010-03-14ACPI: processor: driver doesn't need to evaluate _PDCAlex Chiang1-0/+3
Now that the early _PDC evaluation path knows how to correctly evaluate _PDC on only physically present processors, there's no need for the processor driver to evaluate it later when it loads. To cover the hotplug case, push _PDC evaluation down into the hotplug paths. Cc: [email protected] Cc: Tony Luck <[email protected]> Acked-by: Venkatesh Pallipadi <[email protected]> Signed-off-by: Alex Chiang <[email protected]> Signed-off-by: Len Brown <[email protected]>
2010-03-14ACPI: delete the "acpi=ht" boot optionLen Brown1-16/+3
acpi=ht was important in 2003 -- before ACPI was universally deployed and enabled by default in the major Linux distributions. At that time, there were a fair number of people who or chose to, or needed to, run with acpi=off, yet also wanted access to Hyper-threading. Today we find that many invocations of "acpi=ht" are accidental, and thus is it possible that it is doing more harm than good. In 2.6.34, we warn on invocation of acpi=ht. In 2.6.35, we delete the boot option. Signed-off-by: Len Brown <[email protected]>
2010-03-14ACPI: plan to delete "acpi=ht" boot optionLen Brown1-1/+3
Signed-off-by: Len Brown <[email protected]>
2010-03-14ACPI: remove "acpi=ht" DMI blacklistLen Brown1-93/+0
SuSE added these entries when deploying ACPI in Linux-2.4. I pulled them into Linux-2.6 on 2003-08-09. Over the last 6+ years, several entries have proven to be unnecessary and deleted, while no new entries have been added. Matthew suggests that they now have negative value, and I agree. Based-on-patch-by: Matthew Garrett <[email protected]> Signed-off-by: Len Brown <[email protected]>
2010-03-14x86/mce: Fix build bug with CONFIG_PROVE_LOCKING=y && CONFIG_X86_MCE_INTEL=yIngo Molnar1-2/+2
Commit f56e8a076 "x86/mce: Fix RCU lockdep splats" introduced the following build bug: arch/x86/kernel/cpu/mcheck/mce.c: In function 'mce_log': arch/x86/kernel/cpu/mcheck/mce.c:166: error: 'mce_read_mutex' undeclared (first use in this function) arch/x86/kernel/cpu/mcheck/mce.c:166: error: (Each undeclared identifier is reported only once arch/x86/kernel/cpu/mcheck/mce.c:166: error: for each function it appears in.) Move the in-the-middle-of-file lock variable up to the variable definition section, the top of the .c file. Cc: Paul E. McKenney <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-13Merge branch 'sched-fixes-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix pick_next_highest_task_rt() for cgroups sched: Cleanup: remove unused variable in try_to_wake_up() x86: Fix sched_clock_cpu for systems with unsynchronized TSC
2010-03-13Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds5-7/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, k8 nb: Fix boot crash: enable k8_northbridges unconditionally on AMD systems x86, UV: Fix target_cpus() in x2apic_uv_x.c x86: Reduce per cpu warning boot up messages x86: Reduce per cpu MCA boot up messages x86_64, cpa: Don't work hard in preserving kernel 2M mappings when using 4K already
2010-03-13Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2-4/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: locking: Make sparse work with inline spinlocks and rwlocks x86/mce: Fix RCU lockdep splats rcu: Increase RCU CPU stall timeouts if PROVE_RCU ftrace: Replace read_barrier_depends() with rcu_dereference_raw() rcu: Suppress RCU lockdep warnings during early boot rcu, ftrace: Fix RCU lockdep splat in ftrace_perf_buf_prepare() rcu: Suppress __mpol_dup() false positive from RCU lockdep rcu: Make rcu_read_lock_sched_held() handle !PREEMPT rcu: Add control variables to lockdep_rcu_dereference() diagnostics rcu, cgroup: Relax the check in task_subsys_state() as early boot is now handled by lockdep-RCU rcu: Use wrapper function instead of exporting tasklist_lock sched, rcu: Fix rcu_dereference() for RCU-lockdep rcu: Make task_subsys_state() RCU-lockdep checks handle boot-time use rcu: Fix holdoff for accelerated GPs for last non-dynticked CPU x86/gart: Unexport gart_iommu_aperture Fix trivial conflicts in kernel/trace/ftrace.c
2010-03-13Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds6-46/+62
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Provide generic perf_sample_data initialization MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer perf trace: Don't use pager if scripting perf trace/scripting: Remove extraneous header read perf, ARM: Modify kuser rmb() call to compile for Thumb-2 x86/stacktrace: Don't dereference bad frame pointers perf archive: Don't try to collect files without a build-id perf_events, x86: Fixup fixed counter constraints perf, x86: Restrict the ANY flag perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE perf, x86: add some IBS macros to perf_event.h perf, x86: make IBS macros available in perf_event.h hw-breakpoints: Remove stub unthrottle callback x86/hw-breakpoints: Remove the name field perf: Remove pointless breakpoint union perf lock: Drop the buffers multiplexing dependency perf lock: Fix and add misc documentally things percpu: Add __percpu sparse annotations to hw_breakpoint
2010-03-13x86, perf: Unmask LVTPC only if we have APIC supportedCyrill Gorcunov1-0/+2
Ingo reported: | | There's a build failure on -tip with the P4 driver, on UP 32-bit, if | PERF_EVENTS is enabled but UP_APIC is disabled: | | arch/x86/built-in.o: In function `p4_pmu_handle_irq': | perf_event.c:(.text+0xa756): undefined reference to `apic' | perf_event.c:(.text+0xa76e): undefined reference to `apic' | So we have to unmask LVTPC only if we're configured to have one. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Cyrill Gorcunov <[email protected]> CC: Lin Ming <[email protected]> CC: Peter Zijlstra <[email protected]> LKML-Reference: <20100313081116.GA5179@lenovo> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-13x86, k8 nb: Fix boot crash: enable k8_northbridges unconditionally on AMD ↵Borislav Petkov2-1/+15
systems de957628ce7c84764ff41331111036b3ae5bad0f changed setting of the x86_init.iommu.iommu_init function ptr only when GART IOMMU is found. One side effect of it is that num_k8_northbridges is not initialized anymore if not explicitly called. This resulted in uninitialized pointers in <arch/x86/kernel/cpu/intel_cacheinfo.c:amd_calc_l3_indices()>, for example, which uses the num_k8_northbridges thing through node_to_k8_nb_misc(). Fix that through an initcall that runs right after the PCI subsystem and does all the scanning. Then, remove initialization in gart_iommu_init() which is a rootfs_initcall and we're running before that. What is more, since num_k8_northbridges is being used in other places beside GART IOMMU, include it whenever we add AMD CPU support. The previous dependency chain in kconfig contained K8_NB depends on AGP_AMD64|GART_IOMMU which was clearly incorrect. The more natural way in terms of hardware dependency should be AGP_AMD64|GART_IOMMU depends on K8_NB depends on CPU_SUP_AMD && PCI. Make it so Number One! Signed-off-by: Borislav Petkov <[email protected]> Cc: FUJITA Tomonori <[email protected]> Cc: Joerg Roedel <[email protected]> LKML-Reference: <20100312144303.GA29262@aftab> Signed-off-by: Ingo Molnar <[email protected]> Tested-by: Joerg Roedel <[email protected]>
2010-03-12Merge branch 'for-linus' of ↵Linus Torvalds7-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (56 commits) doc: fix typo in comment explaining rb_tree usage Remove fs/ntfs/ChangeLog doc: fix console doc typo doc: cpuset: Update the cpuset flag file Fix of spelling in arch/sparc/kernel/leon_kernel.c no longer needed Remove drivers/parport/ChangeLog Remove drivers/char/ChangeLog doc: typo - Table 1-2 should refer to "status", not "statm" tree-wide: fix typos "ass?o[sc]iac?te" -> "associate" in comments No need to patch AMD-provided drivers/gpu/drm/radeon/atombios.h devres/irq: Fix devm_irq_match comment Remove reference to kthread_create_on_cpu tree-wide: Assorted spelling fixes tree-wide: fix 'lenght' typo in comments and code drm/kms: fix spelling in error message doc: capitalization and other minor fixes in pnp doc devres: typo fix s/dev/devm/ Remove redundant trailing semicolons from macros fix typo "definetly" -> "definitely" in comment tree-wide: s/widht/width/g typo in comments ... Fix trivial conflict in Documentation/laptops/00-INDEX
2010-03-12Add generic sys_olduname()Christoph Hellwig1-49/+0
Add generic implementations of the old and really old uname system calls. Note that sh only implements sys_olduname but not sys_oldolduname, but I'm not going to bother with another ifdef for that special case. m32r implemented an old uname but never wired it up, so kill it, too. Signed-off-by: Christoph Hellwig <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Hirokazu Takata <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Al Viro <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: James Morris <[email protected]> Cc: Andreas Schwab <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-03-12improve sys_newuname() for compat architecturesChristoph Hellwig1-12/+0
On an architecture that supports 32-bit compat we need to override the reported machine in uname with the 32-bit value. Instead of doing this separately in every architecture introduce a COMPAT_UTS_MACHINE define in <asm/compat.h> and apply it directly in sys_newuname(). Signed-off-by: Christoph Hellwig <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Hirokazu Takata <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Al Viro <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: James Morris <[email protected]> Cc: Andreas Schwab <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2010-03-12Add generic sys_ipc wrapperChristoph Hellwig1-85/+0
Add a generic implementation of the ipc demultiplexer syscall. Except for s390 and sparc64 all implementations of the sys_ipc are nearly identical. There are slight differences in the types of the parameters, where mips and powerpc as the only 64-bit architectures with sys_ipc use unsigned long for the "third" argument as it gets casted to a pointer later, while it traditionally is an "int" like most other paramters. frv goes even further and uses unsigned long for all parameters execept for "ptr" which is a pointer type everywhere. The change from int to unsigned long for "third" and back to "int" for the others on frv should be fine due to the in-register calling conventions for syscalls (we already had a similar issue with the generic sys_ptrace), but I'd prefer to have the arch maintainers looks over this in details. Except for that h8300, m68k and m68knommu lack an impplementation of the semtimedop sub call which this patch adds, and various architectures have gets used - at least on i386 it seems superflous as the compat code on x86-64 and ia64 doesn't even bother to implement it. [[email protected]: add sys_ipc to sys_ni.c] Signed-off-by: Christoph Hellwig <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Hirokazu Takata <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Reviewed-by: H. Peter Anvin <[email protected]> Cc: Al Viro <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: "Luck, Tony" <[email protected]> Cc: James Morris <[email protected]> Cc: Andreas Schwab <[email protected]> Acked-by: Jesper Nilsson <[email protected]> Acked-by: Russell King <[email protected]> Acked-by: David Howells <[email protected]> Acked-by: Kyle McMartin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>