aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2011-02-04x86-32: Make sure the stack is set up before we use itH. Peter Anvin4-24/+17
Since checkin ebba638ae723d8a8fc2f7abce5ec18b688b791d7 we call verify_cpu even in 32-bit mode. Unfortunately, calling a function means using the stack, and the stack pointer was not initialized in the 32-bit setup code! This code initializes the stack pointer, and simplifies the interface slightly since it is easier to rely on just a pointer value rather than a descriptor; we need to have different values for the segment register anyway. This retains start_stack as a virtual address, even though a physical address would be more convenient for 32 bits; the 64-bit code wants the other way around... Reported-by: Matthieu Castet <[email protected]> LKML-Reference: <[email protected]> Tested-by: Kees Cook <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2011-02-03memory hotplug: Define memory_block_size_bytes for x86_64 with CONFIG_X86_UVNathan Fontenot1-0/+14
Define a version of memory_block_size_bytes for x86_64 when CONFIG_X86_UV is set. Signed-off-by: Robin Holt <[email protected]> Signed-off-by: Jack Steiner <[email protected]> Signed-off-by: Nathan Fontenot <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2011-02-03x86, mm: avoid possible bogus tlb entries by clearing prev mm_cpumask after ↵Suresh Siddha1-2/+3
switching mm Clearing the cpu in prev's mm_cpumask early will avoid the flush tlb IPI's while the cr3 is still pointing to the prev mm. And this window can lead to the possibility of bogus TLB fills resulting in strange failures. One such problematic scenario is mentioned below. T1. CPU-1 is context switching from mm1 to mm2 context and got a NMI etc between the point of clearing the cpu from the mm_cpumask(mm1) and before reloading the cr3 with the new mm2. T2. CPU-2 is tearing down a specific vma for mm1 and will proceed with flushing the TLB for mm1. It doesn't send the flush TLB to CPU-1 as it doesn't see that cpu listed in the mm_cpumask(mm1). T3. After the TLB flush is complete, CPU-2 goes ahead and frees the page-table pages associated with the removed vma mapping. T4. CPU-2 now allocates those freed page-table pages for something else. T5. As the CR3 and TLB caches for mm1 is still active on CPU-1, CPU-1 can potentially speculate and walk through the page-table caches and can insert new TLB entries. As the page-table pages are already freed and being used on CPU-2, this page walk can potentially insert a bogus global TLB entry depending on the (random) contents of the page that is being used on CPU-2. T6. This bogus TLB entry being global will be active across future CR3 changes and can result in weird memory corruption etc. To avoid this issue, for the prev mm that is handing over the cpu to another mm, clear the cpu from the mm_cpumask(prev) after the cr3 is changed. Marking it for -stable, though we haven't seen any reported failure that can be attributed to this. Signed-off-by: Suresh Siddha <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: [email protected] [v2.6.32+] Signed-off-by: Linus Torvalds <[email protected]>
2011-02-03Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds1-2/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix reading in perf_event_read() watchdog: Don't change watchdog state on read of sysctl watchdog: Fix sysctl consistency watchdog: Fix broken nowatchdog logic perf: Fix Pentium4 raw event validation perf: Fix alloc_callchain_buffers()
2011-02-03x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platformsSuresh Siddha1-1/+9
Markus Kohn ran into a hard hang regression on an acer aspire 1310, when acpi is enabled. git bisect showed the following commit as the bad one that introduced the boot regression. commit d0af9eed5aa91b6b7b5049cae69e5ea956fd85c3 Author: Suresh Siddha <[email protected]> Date: Wed Aug 19 18:05:36 2009 -0700 x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init Because of the UP configuration of that platform, native_smp_prepare_cpus() bailed out (in smp_sanity_check()) before doing the set_mtrr_aps_delayed_init() Further down the boot path, native_smp_cpus_done() will call the delayed MTRR initialization for the AP's (mtrr_aps_init()) with mtrr_aps_delayed_init not set. This resulted in the boot processor reprogramming its MTRR's to the values seen during the start of the OS boot. While this is not needed ideally, this shouldn't have caused any side-effects. This is because the reprogramming of MTRR's (set_mtrr_state() that gets called via set_mtrr()) will check if the live register contents are different from what is being asked to write and will do the actual write only if they are different. BP's mtrr state is read during the start of the OS boot and typically nothing would have changed when we ask to reprogram it on BP again because of the above scenario on an UP platform. So on a normal UP platform no reprogramming of BP MTRR MSR's happens and all is well. However, on this platform, bios seems to be modifying the fixed mtrr range registers between the start of OS boot and when we double check the live registers for reprogramming BP MTRR registers. And as the live registers are modified, we end up reprogramming the MTRR's to the state seen during the start of the OS boot. During ACPI initialization, something in the bios (probably smi handler?) don't like this fact and results in a hard lockup. We didn't see this boot hang issue on this platform before the commit d0af9eed5aa91b6b7b5049cae69e5ea956fd85c3, because only the AP's (if any) will program its MTRR's to the value that BP had at the start of the OS boot. Fix this issue by checking mtrr_aps_delayed_init before continuing further in the mtrr_aps_init(). Now, only AP's (if any) will program its MTRR's to the BP values during boot. Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393 [ By the way, this behavior of the bios modifying MTRR's after the start of the OS boot is not common and the kernel is not prepared to handle this situation well. Irrespective of this issue, during suspend/resume, linux kernel will try to reprogram the BP's MTRR values to the values seen during the start of the OS boot. So suspend/resume might be already broken on this platform for all linux kernel versions. ] Reported-and-bisected-by: Markus Kohn <[email protected]> Tested-by: Markus Kohn <[email protected]> Signed-off-by: Suresh Siddha <[email protected]> Cc: Thomas Renninger <[email protected]> Cc: Rafael Wysocki <[email protected]> Cc: Venkatesh Pallipadi <[email protected]> Cc: [email protected] # [v2.6.32+] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-02x86, nx: Don't force pages RW when setting NX bitsMatthieu CASTET1-8/+0
Xen want page table pages read only. But the initial page table (from head_*.S) live in .data or .bss. That was broken by 64edc8ed5ffae999d8d413ba006850e9e34166cb. There is absolutely no reason to force these pages RW after they have already been marked RO. Signed-off-by: Matthieu CASTET <[email protected]> Tested-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2011-02-02x86: Add clock_adjtime for x86Richard Cochran4-1/+6
This patch adds the clock_adjtime system call to the x86 architecture. Signed-off-by: Richard Cochran <[email protected]> Acked-by: John Stultz <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2011-02-02Merge commit 'v2.6.38-rc3' into perf/coreIngo Molnar26-132/+98
Merge reason: Pick up latest fixes. Signed-off-by: Ingo Molnar <[email protected]>
2011-01-31x86: Rename incorrectly named parameter of numa_cpu_node()Tejun Heo1-1/+1
numa_cpu_node() prototype in numa_32.h has wrongly named parameter @apicid when it actually takes the CPU number. Change it to @cpu. Reported-by: Yinghai Lu <[email protected]> Signed-off-by: Tejun Heo <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-31Merge branch 'tip/rtmutex' of ↵Thomas Gleixner8-47/+13
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into core/locking *git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace tip/rtmutex: rtmutex: Simplify PI algorithm and make highest prio task get lock
2011-01-28x86: Fix build failure on X86_UP_APICTejun Heo2-3/+1
Commit 4c321ff8 (x86: Replace cpu_2_logical_apicid[] with early percpu variable) and following changes introduced and used x86_cpu_to_logical_apicid percpu variable. It was declared and defined inside CONFIG_SMP && CONFIG_X86_32 but if CONFIG_X86_UP_APIC is set UP configuration makes use of it and build fails. Fix it by declaring and defining it inside CONFIG_X86_LOCAL_APIC && CONFIG_X86_32. Signed-off-by: Tejun Heo <[email protected]> Reported-by: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Unify NUMA initialization between 32 and 64bitTejun Heo6-84/+77
Now that everything else is unified, NUMA initialization can be unified too. * numa_init_array() and init_cpu_to_node() are moved from numa_64 to numa. * numa_32::initmem_init() is updated to call numa_init_array() and setup_arch() to call init_cpu_to_node() on 32bit too. * x86_cpu_to_node_map is now initialized to NUMA_NO_NODE on 32bit too. This is safe now as numa_init_array() will initialize it early during boot. This makes NUMA mapping fully initialized before setup_per_cpu_areas() on 32bit too and thus makes the first percpu chunk which contains all the static variables and some of dynamic area allocated with NUMA affinity correctly considered. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reported-by: Eric Dumazet <[email protected]> Reviewed-by: Pekka Enberg <[email protected]>
2011-01-28x86: Unify node_to_cpumask_map handling between 32 and 64bitTejun Heo7-115/+87
x86_32 has been managing node_to_cpumask_map explicitly from map_cpu_to_node() and friends in a rather ugly way. With previous changes, it's now possible to share the code with 64bit. * When CONFIG_NUMA_EMU is disabled, numa_add/remove_cpu() are implemented in numa.c and shared by 32 and 64bit. CONFIG_NUMA_EMU versions still live in numa_64.c. NUMA_EMU's dependency on 64bit is planned to be removed and the above should go away together. * identify_cpu() now calls numa_add_cpu() for 32bit too. This makes the explicit mask management from map_cpu_to_node() unnecessary. * The whole x86_32 specific map_cpu_to_node() chunk is no longer necessary. Dropped. Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: David Rientjes <[email protected]> Cc: Shaohui Zheng <[email protected]>
2011-01-28x86: Unify CPU -> NUMA node mapping between 32 and 64bitTejun Heo11-104/+85
Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for CPU -> NUMA node mapping. Replace it with early_percpu variable x86_cpu_to_node_map and share the mapping code with 64bit. * USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too. * x86_cpu_to_node_map and numa_set/clear_node() are moved from numa_64 to numa. For now, on 32bit, x86_cpu_to_node_map is initialized with 0 instead of NUMA_NO_NODE. This is to avoid introducing unexpected behavior change and will be updated once init path is unified. * srat_detect_node() is now enabled for x86_32 too. It calls numa_set_node() and initializes the mapping making explicit cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: David Rientjes <[email protected]>
2011-01-28x86: Unify cpu/apicid <-> NUMA node mapping between 32 and 64bitTejun Heo15-56/+101
The mapping between cpu/apicid and node is done via apicid_to_node[] on 64bit and apicid_2_node[] + apic->x86_32_numa_cpu_node() on 32bit. This difference makes it difficult to further unify 32 and 64bit NUMA handling. This patch unifies it by replacing both apicid_to_node[] and apicid_2_node[] with __apicid_to_node[] array, which is accessed by two accessors - set_apicid_to_node() and numa_cpu_node(). On 64bit, numa_cpu_node() always consults __apicid_to_node[] directly while 32bit goes through apic->numa_cpu_node() method to allow apic implementations to override it. srat_detect_node() for amd cpus contains workaround for broken NUMA configuration which assumes relationship between APIC ID, HT node ID and NUMA topology. Leave it to access __apicid_to_node[] directly as mapping through CPU might result in undesirable behavior change. The comment is reformatted and updated to note the ugliness. Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: David Rientjes <[email protected]>
2011-01-28x86: Replace apic->apicid_to_node() with ->x86_32_numa_cpu_node()Tejun Heo13-41/+37
apic->apicid_to_node() is 32bit specific apic operation which determines NUMA node for a CPU. Depending on the APIC implementation, it can be easier to determine NUMA node from either physical or logical apicid. Currently, ->apicid_to_node() takes @logical_apicid and calls hard_smp_processor_id() if the physical apicid is needed. This prevents NUMA mapping from being queried from a different CPU, which in turn makes it impossible to initialize NUMA mapping before SMP bringup. This patch replaces apic->apicid_to_node() with ->x86_32_numa_cpu_node() which takes @cpu, from which both logical and physical apicids can easily be determined. While at it, drop duplicate implementations from bigsmp_32 and summit_32, and use the default one. Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Implement x86_32_early_logical_apicid() for numaq_32Tejun Heo1-2/+8
Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Implement x86_32_early_logical_apicid() for summit_32Tejun Heo1-5/+12
Factor out logical apic id calculation from summit_init_apic_ldr() and use it for the x86_32_early_logical_apicid() callback. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Implement x86_32_early_logical_apicid() for bigsmp_32Tejun Heo1-1/+7
Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Implement the default x86_32_early_logical_apicid()Tejun Heo1-1/+6
Implement x86_32_early_logical_apicid() for the default apic flat routing. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Add apic->x86_32_early_logical_apicid()Tejun Heo8-2/+45
On x86_32, the mapping between cpu and logical apic ID differs depending on the specific apic implementation in use. The mapping is initialized while bringing up CPUs; however, this makes early inits ignore memory topology. Add a x86_32 specific apic->x86_32_early_logical_apicid() which is called early during boot to query the mapping. The mapping is later verified against the result of init_apic_ldr(). The method is allowed to return BAD_APICID if it can't be determined early. noop variant which always returns BAD_APICID is implemented and added to all x86_32 apic implementations. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Kill apic->cpu_to_logical_apicid()Tejun Heo11-70/+11
After the previous patch, apic->cpu_to_logical_apicid() is no longer used. Kill it. For apic types with custom cpu_to_logical_apicid() which is also used for other purposes, remove the function and modify its users to do the mapping directly. #ifdef's on CONFIG_SMP in es7000_32 and summit_32 are ignored during conversion as they are not used for UP kernels. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Always use x86_cpu_to_logical_apicid for cpu -> logical apic idTejun Heo3-8/+15
Currently, cpu -> logical apic id translation is done by apic->cpu_to_logical_apicid() callback which may or may not use x86_cpu_to_logical_apicid. This is unnecessary as it should always equal logical_smp_processor_id() which is known early during CPU bring up. Initialize x86_cpu_to_logical_apicid after apic->init_apic_ldr() in setup_local_APIC() and always use x86_cpu_to_logical_apicid for cpu -> logical apic id mapping. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Replace cpu_2_logical_apicid[] with early percpu variableTejun Heo8-13/+27
Unlike x86_64, on x86_32, the mapping from cpu to logical apicid may vary depending on apic in use. cpu_2_logical_apicid[] array is used for this mapping. Replace it with early percpu variable x86_cpu_to_logical_apicid to make it better aligned with other mappings. Signed-off-by: Tejun Heo <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Make default_send_IPI_mask_sequence/allbutself_logical() 32bit onlyTejun Heo2-6/+6
Both functions are used only in 32bit. Put them inside CONFIG_X86_32. This is to prepare for logical apicid handling update. - Cyrill Gorcunov spotted that I forgot to move declarations in ipi.h under CONFIG_X86_32. Fixed. Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Reviewed-by: Cyrill Gorcunov <[email protected]> Acked-by: Yinghai Lu <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Drop x86_32 MAX_APICIDTejun Heo3-5/+3
Commit 56d91f13 (x86, acpi: Add MAX_LOCAL_APIC for 32bit) added MAX_LOCAL_APIC for x86_32 but didn't replace MAX_APICID users with it. Convert MAX_APICID users to MAX_LOCAL_APIC and drop MAX_APICID. Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Acked-by: Yinghai Lu <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28x86: Kill unused static boot_cpu_logical_apicid in smpboot.cTejun Heo1-5/+1
Signed-off-by: Tejun Heo <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Acked-by: Yinghai Lu <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-28Merge branch 'stable/bug-fixes-rc2' of ↵Linus Torvalds2-13/+13
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug-fixes-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/setup: Route halt operations to safe_halt pvop. xen/e820: Guard against E820_RAM not having page-aligned size or start. xen/p2m: Mark INVALID_P2M_ENTRY the mfn_list past max_pfn.
2011-01-28Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds3-35/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: percpu, x86: Fix percpu_xchg_op() x86: Remove left over system_64.h x86-64: Don't use pointer to out-of-scope variable in dump_trace()
2011-01-27x86, perf: Change two init functions to staticYinghai Lu1-2/+2
init_hw_perf_events() is called via early_initcall now. x86_pmu_event_init is x86_pmu member function. So we can change them to static. Signed-off-by: Yinghai Lu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-27perf: Fix Pentium4 raw event validationStephane Eranian1-2/+10
This patch fixes some issues with raw event validation on Pentium 4 (Netburst) based processors. As I was testing libpfm4 Netburst support, I ran into two problems in the p4_validate_raw_event() function: - the shared field must be checked ONLY when HT is on - the binding to ESCR register was missing The second item was causing raw events to not be encoded correctly compared to generic PMU events. With this patch, I can now pass Netburst events to libpfm4 examples and get meaningful results: $ task -e global_power_events:running:u noploop 1 noploop for 1 seconds 3,206,304,898 global_power_events:running Signed-off-by: Stephane Eranian <[email protected]> Acked-by: Cyrill Gorcunov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-27xen/setup: Route halt operations to safe_halt pvop.Stefano Stabellini1-0/+1
With this patch, the cpuidle driver does not load and does not issue the mwait operations. Instead the hypervisor is doing them (b/c we call the safe_halt pvops call). This fixes quite a lot of bootup issues wherein the user had to force interrupts for the continuation of the bootup. Details are discussed in: http://lists.xensource.com/archives/html/xen-devel/2011-01/msg00535.html [v2: Wrote the commit description] Reported-by: Daniel De Graaf <[email protected]> Tested-by: Daniel De Graaf <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2011-01-27xen/e820: Guard against E820_RAM not having page-aligned size or start.Stefano Stabellini1-1/+6
Under Dell Inspiron 1525, and Intel SandyBridge SDP's the BIOS e820 RAM is not page-aligned: [ 0.000000] Xen: 0000000000100000 - 00000000df66d800 (usable) We were not handling that and ended up setting up a pagetable that included up to df66e000 with the disastrous effect that when memset(NODE_DATA(nodeid), 0, sizeof(pg_data_t)); tried to clear the page it would crash at the 2K mark. Initially reported by Michael Young @ http://lists.xensource.com/archives/html/xen-devel/2011-01/msg00108.html The fix is to page-align the size and also take into consideration the start of the E820 (in case that is not page-aligned either). This fixes the bootup failure on those affected machines. This patch is a rework of the Micheal A Young initial patch and considers the case if the start is not page-aligned. Reported-by: Michael A Young <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: Michael A Young <[email protected]>
2011-01-27xen/p2m: Mark INVALID_P2M_ENTRY the mfn_list past max_pfn.Stefan Bader1-12/+6
In case the mfn_list does not have enough entries to fill a p2m page we do not want the entries from max_pfn up to the boundary to be filled with unknown values. Hence set them to INVALID_P2M_ENTRY. Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2011-01-27rwsem: Move duplicate function prototypes to linux/rwsem.hThomas Gleixner1-9/+0
All architecture specific rwsem headers carry the same function prototypes. Just x86 adds asmregparm, which is an empty define on all other architectures. S390 has a stale rwsem_downgrade_write() prototype. Remove the duplicates and add the prototypes to linux/rwsem.h Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Richard Henderson <[email protected]> Acked-by: Tony Luck <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Acked-by: David Miller <[email protected]> Cc: Chris Zankel <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2011-01-27rwsem: Unify the duplicate rwsem_is_locked() inlinesThomas Gleixner1-5/+0
Instead of having the same implementation in each architecture, move it to linux/rwsem.h and remove the duplicates. It's unlikely that an arch will ever implement something different, but we can deal with that when it happens. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Matt Turner <[email protected]> Acked-by: Tony Luck <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Acked-by: David Miller <[email protected]> Cc: Chris Zankel <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2011-01-27rwsem: Move duplicate init macros and functions to linux/rwsem.hThomas Gleixner1-25/+0
The rwsem initializers and related macros and functions are mostly the same. Some of them lack the lockdep initializer, but having it in place does not matter for architectures which do not support lockdep. powerpc, sparc, x86: No functional change sh, s390: Removes the duplicate init_rwsem (inline and #define) alpha, ia64, xtensa: Use the lockdep capable init function in lib/rwsem.c which is just uninlining the init function for the LOCKDEP=n case Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Matt Turner <[email protected]> Acked-by: Tony Luck <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Acked-by: David Miller <[email protected]> Cc: Chris Zankel <[email protected]> LKML-Reference: <[email protected]>
2011-01-27rwsem: Move duplicate struct rwsem declaration to linux/rwsem.hThomas Gleixner1-12/+0
The difference between these declarations is the data type of the count member and the lack of lockdep in some architectures/ long is equivivalent to signed long and the #ifdef guarded dep_map member does not hurt anyone. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Matt Turner <[email protected]> Acked-by: Tony Luck <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Acked-by: David Miller <[email protected]> Cc: Chris Zankel <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2011-01-27x86: Cleanup rwsem_count_t typedefThomas Gleixner1-15/+10
Remove the typedef which has no real reason to be there. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Matt Turner <[email protected]> Cc: Tony Luck <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Cc: David Miller <[email protected]> Cc: Chris Zankel <[email protected]> LKML-Reference: <[email protected]>
2011-01-27rwsem: Cleanup includesThomas Gleixner1-6/+0
All rwsem implementations include the same headers. Include them from include/linux/rwsem.h Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Matt Turner <[email protected]> Acked-by: Tony Luck <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Acked-by: David Miller <[email protected]> Cc: Chris Zankel <[email protected]> LKML-Reference: <[email protected]>
2011-01-26x86: Don't copy per_cpu cpuinfo for BSP two timesYinghai Lu1-3/+3
smp_store_cpu_info(0) will do that. Signed-off-by: Yinghai Lu <[email protected]> Cc: Suresh Siddha <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Borislav Petkov <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-26x86: Move llc_shared_map out of cpu_infoYinghai Lu5-39/+21
cpu_info is already with per_cpu, We can take llc_shared_map out of cpu_info, and declare it as per_cpu variable directly. So later referencing could be simple and directly instead of diving to find cpu_info at first. Also could make smp_store_cpu_info() much simple to avoid to do save and restore trick. Signed-off-by: Yinghai Lu <[email protected]> Cc: Hans Rosenfeld <[email protected]> Cc: Alok N Kataria <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Hans J. Koch <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Andreas Herrmann <[email protected]> Cc: Robert Richter <[email protected]> Cc: Suresh Siddha <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-26x86, amd: Extend AMD northbridge caching code to support "Link Control" devicesHans Rosenfeld2-2/+10
"Link Control" devices (NB function 4) will be used by L3 cache partitioning on family 0x15. Signed-off-by: Hans Rosenfeld <[email protected]> Cc: <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-26x86, amd: Enable L3 cache index disable on family 0x15Hans Rosenfeld1-0/+3
AMD family 0x15 CPUs support L3 cache index disable, so enable it on them. Signed-off-by: Hans Rosenfeld <[email protected]> Cc: <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-26x86, amd: Normalize compute unit IDs on multi-node processorsAndreas Herrmann2-2/+7
On multi-node CPUs we don't need the socket wide compute unit ID but the node-wide compute unit ID. Thus we need to normalize the value. This is similar to what we do with cpu_core_id. A compute unit is then identified by physical_package_id, node_id, and compute_unit_id. Signed-off-by: Andreas Herrmann <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-26percpu, x86: Fix percpu_xchg_op()Eric Dumazet1-12/+12
These recent percpu commits: 2485b6464cf8: x86,percpu: Move out of place 64 bit ops into X86_64 section 8270137a0d50: cpuops: Use cmpxchg for xchg to avoid lock semantics Caused this 'perf top' crash: Kernel panic - not syncing: Fatal exception in interrupt Pid: 0, comm: swapper Tainted: G D 2.6.38-rc2-00181-gef71723 #413 Call Trace: <IRQ> [<ffffffff810465b5>] ? panic ? kmsg_dump ? kmsg_dump ? oops_end ? no_context ? __bad_area_nosemaphore ? perf_output_begin ? bad_area_nosemaphore ? do_page_fault ? __task_pid_nr_ns ? perf_event_tid ? __perf_event_header__init_id ? validate_chain ? perf_output_sample ? trace_hardirqs_off ? page_fault ? irq_work_run ? update_process_times ? tick_sched_timer ? tick_sched_timer ? __run_hrtimer ? hrtimer_interrupt ? account_system_vtime ? smp_apic_timer_interrupt ? apic_timer_interrupt ... Looking at assembly code, I found: list = this_cpu_xchg(irq_work_list, NULL); gives this wrong code : (gcc-4.1.2 cross compiler) ffffffff810bc45e: mov %gs:0xead0,%rax cmpxchg %rax,%gs:0xead0 jne ffffffff810bc45e <irq_work_run+0x3e> test %rax,%rax je ffffffff810bc4aa <irq_work_run+0x8a> Tell gcc we dirty eax/rax register in percpu_xchg_op() Compiler must use another register to store pxo_new__ We also dont need to reload percpu value after a jump, since a 'failed' cmpxchg already updated eax/rax Wrong generated code was : xor %rax,%rax /* load 0 into %rax */ 1: mov %gs:0xead0,%rax cmpxchg %rax,%gs:0xead0 jne 1b test %rax,%rax After patch : xor %rdx,%rdx /* load 0 into %rdx */ mov %gs:0xead0,%rax 1: cmpxchg %rdx,%gs:0xead0 jne 1b: test %rax,%rax Signed-off-by: Eric Dumazet <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Tejun Heo <[email protected]> LKML-Reference: <1295973114.3588.312.camel@edumazet-laptop> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-26x86: Remove left over system_64.hYinghai Lu1-22/+0
Left-over from the x86 merge ... Signed-off-by: Yinghai Lu <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-01-25x86-64, mem: Convert memmove() to assembly file and fix return value bugFenghua Yu3-192/+198
memmove_64.c only implements memmove() function which is completely written in inline assembly code. Therefore it doesn't make sense to keep the assembly code in .c file. Currently memmove() doesn't store return value to rax. This may cause issue if caller uses the return value. The patch fixes this issue. Signed-off-by: Fenghua Yu <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2011-01-26thp: fix PARAVIRT x86 32bit noPAEAndrea Arcangeli1-3/+2
This fixes TRANSPARENT_HUGEPAGE=y with PARAVIRT=y and HIGHMEM64=n. The #ifdef that this patch removes was erratically introduced to fix a build error for noPAE (where pmd.pmd doesn't exist). So then the kernel built but it failed at runtime because set_pmd_at was a noop. This will correct it by enabling set_pmd_at for noPAE mode too. Signed-off-by: Andrea Arcangeli <[email protected]> Reported-by: werner <[email protected]> Reported-by: Minchan Kim <[email protected]> Tested-by: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-01-25percpu: align percpu readmostly subsection to cachelineTejun Heo1-2/+2
Currently percpu readmostly subsection may share cachelines with other percpu subsections which may result in unnecessary cacheline bounce and performance degradation. This patch adds @cacheline parameter to PERCPU() and PERCPU_VADDR() linker macros, makes each arch linker scripts specify its cacheline size and use it to align percpu subsections. This is based on Shaohua's x86 only patch. Signed-off-by: Tejun Heo <[email protected]> Cc: Shaohua Li <[email protected]>