aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/apic/apic.c
AgeCommit message (Collapse)AuthorFilesLines
2017-01-05x86/irq, trace: Add __irq_entry annotation to x86's platform IRQ handlersDaniel Bristot de Oliveira1-4/+4
This patch adds the __irq_entry annotation to the default x86 platform IRQ handlers. ftrace's function_graph tracer uses the __irq_entry annotation to notify the entry and return of IRQ handlers. For example, before the patch: 354549.667252 | 3) d..1 | default_idle_call() { 354549.667252 | 3) d..1 | arch_cpu_idle() { 354549.667253 | 3) d..1 | default_idle() { 354549.696886 | 3) d..1 | smp_trace_reschedule_interrupt() { 354549.696886 | 3) d..1 | irq_enter() { 354549.696886 | 3) d..1 | rcu_irq_enter() { After the patch: 366416.254476 | 3) d..1 | arch_cpu_idle() { 366416.254476 | 3) d..1 | default_idle() { 366416.261566 | 3) d..1 ==========> | 366416.261566 | 3) d..1 | smp_trace_reschedule_interrupt() { 366416.261566 | 3) d..1 | irq_enter() { 366416.261566 | 3) d..1 | rcu_irq_enter() { KASAN also uses this annotation. The smp_apic_timer_interrupt() was already annotated. Signed-off-by: Daniel Bristot de Oliveira <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Cc: Aaron Lu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Baoquan He <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Claudio Fontana <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Dou Liyang <[email protected]> Cc: Gu Zheng <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nicolai Stange <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/059fdf437c2f0c09b13c18c8fe4e69999d3ffe69.1483528431.git.bristot@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2016-12-13x86/smpboot: Make logical package management more robustThomas Gleixner1-15/+0
The logical package management has several issues: - The APIC ids provided by ACPI are not required to be the same as the initial APIC id which can be retrieved by CPUID. The APIC ids provided by ACPI are those which are written by the BIOS into the APIC. The initial id is set by hardware and can not be changed. The hardware provided ids contain the real hardware package information. Especially AMD sets the effective APIC id different from the hardware id as they need to reserve space for the IOAPIC ids starting at id 0. As a consequence those machines trigger the currently active firmware bug printouts in dmesg, These are obviously wrong. - Virtual machines have their own interesting of enumerating APICs and packages which are not reliably covered by the current implementation. The sizing of the mapping array has been tweaked to be generously large to handle systems which provide a wrong core count when HT is disabled so the whole magic which checks for space in the physical hotplug case is not needed anymore. Simplify the whole machinery and do the mapping when the CPU starts and the CPUID derived physical package information is available. This solves the observed problems on AMD machines and works for the virtualization issues as well. Remove the extra call from XEN cpu bringup code as it is not longer required. Fixes: d49597fd3bc7 ("x86/cpu: Deal with broken firmware (VMWare/XEN)") Reported-and-tested-by: Borislav Petkov <[email protected]> Tested-by: Boris Ostrovsky <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: M. Vefa Bicakci <[email protected]> Cc: xen-devel <[email protected]> Cc: Charles (Chas) Williams <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Alok Kataria <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos Signed-off-by: Thomas Gleixner <[email protected]>
2016-12-12Merge branch 'x86-idle-for-linus' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 idle updates from Ingo Molnar: "There were two bigger changes in this development cycle: - remove idle notifiers: 32 files changed, 74 insertions(+), 803 deletions(-) These notifiers were of questionable value and the main usecase, the i7300 driver, was essentially unmaintained and can be removed, plus modern power management concepts don't need the callback - so use this golden opportunity and get rid of this opaque and fragile callback from a latency sensitive code path. (Len Brown, Thomas Gleixner) - improve the AMD Erratum 400 workaround that used high overhead MSR polling in the idle loop (Borisla Petkov, Thomas Gleixner)" * 'x86-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Remove empty idle.h header x86/amd: Simplify AMD E400 aware idle routine x86/amd: Check for the C1E bug post ACPI subsystem init x86/bugs: Separate AMD E400 erratum and C1E bug x86/cpufeature: Provide helper to set bugs bits x86/idle: Remove enter_idle(), exit_idle() x86: Remove x86_test_and_clear_bit_percpu() x86/idle: Remove is_idle flag x86/idle: Remove idle_notifier i7300_idle: Remove this driver
2016-12-09x86: Remove empty idle.h headerThomas Gleixner1-1/+0
One include less is always a good thing(tm). Good riddance. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-12-09x86/amd: Simplify AMD E400 aware idle routineBorislav Petkov1-0/+2
Reorganize the E400 detection now that we have everything in place: switch the CPUs to broadcast mode after the LAPIC has been initialized and remove the facilities that were used previously on the idle path. Unfortunately static_cpu_has_bug() cannpt be used in the E400 idle routine because alternatives have been applied when the actual detection happens, so the static switching does not take effect and the test will stay false. Use boot_cpu_has_bug() instead which is definitely an improvement over the RDMSR and the cpumask handling. Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-11-09x86/apic: Prevent tracing on apic_msr_write_eoi()Wanpeng Li1-0/+1
The following RCU lockdep warning led to adding irq_enter()/irq_exit() into smp_reschedule_interrupt(): RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 RCU used illegally from extended quiescent state! no locks held by swapper/1/0. do_trace_write_msr native_write_msr native_apic_msr_eoi_write smp_reschedule_interrupt reschedule_interrupt As Peterz pointed out: | So now we're making a very frequent interrupt slower because of debug | code. | | The thing is, many many smp_reschedule_interrupt() invocations don't | actually execute anything much at all and are only sent to tickle the | return to user path (which does the actual preemption). | | Having to do the whole irq_enter/irq_exit dance just for this unlikely | debug case totally blows. Use the wrmsr_notrace() variant in native_apic_msr_write_eoi, annotate the kvm variant with notrace and add a native_apic_eoi callback to the apic structure so KVM guests are covered as well. This allows to revert the irq_enter/irq_exit dance in smp_reschedule_interrupt(). Suggested-by: Peter Zijlstra <[email protected]> Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Cc: [email protected] Cc: Mike Galbraith <[email protected]> Cc: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-10-08x86/apic: Prevent pointless warning messagesThomas Gleixner1-3/+5
Markus reported that he sees new warnings: APIC: NR_CPUS/possible_cpus limit of 4 reached. Processor 4/0x84 ignored. APIC: NR_CPUS/possible_cpus limit of 4 reached. Processor 5/0x85 ignored. This comes from the recent persistant cpuid - nodeid changes. The code which emits the warning has been called prior to these changes only for enabled processors. Now it's called for disabled processors as well to get the possible cpu accounting correct. So if the kernel is compiled for the number of actual available/enabled CPUs and the BIOS reports disabled CPUs as well then the above warnings are printed. That's a pointless exercise as it only makes sense if there are more CPUs enabled than the kernel supports. Nake the warning conditional on enabled processors so we are back to the state before these changes. Fixes: 8f54969dc8d6 ("x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping") Reported-and-tested-by: Markus Trippelsdorf <[email protected]> Cc: One Thousand Gnomes <[email protected]> Cc: Dou Liyang <[email protected]> Cc: [email protected] Cc: Gu Zheng <[email protected]> Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610071549330.19804@nanos Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-26x86/apic: Fix silent & fatal merge conflict in __generic_processor_info()Thomas Gleixner1-2/+0
Fix up the silent merge conflict between commit c291b0151585 in x86/urgent and commit f7c28833c2520 in x86/apic which both remove num_processors++ from the original location and then add it at two different locations. As a result num_processors is incremented twice which can cut the number of available cpus in half. Remove the one which is added by commit c291b0151585. In hindsight I should have merged x86/urgent into x86/apic _before_ adding the nodeid bits, but in hindsight we are always smarter. Reported-and-tested-by: Borislav Petkov <[email protected]> Reported-by: Mike Galbraith <[email protected]> Fixes: 1e1b37273cf7 ("Merge branch 'x86/urgent' into x86/apic") Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1609261350090.5483@nanos Cc: Dou Liyang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-26Merge branch 'x86/urgent' into x86/apicThomas Gleixner1-0/+5
Bring in the upstream modifications so we can fixup the silent merge conflict which is introduced by this merge. Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-21x86/acpi: Introduce persistent storage for cpuid <-> apicid mappingGu Zheng1-3/+57
The whole patch-set aims at making cpuid <-> nodeid mapping persistent. So that, when node online/offline happens, cache based on cpuid <-> nodeid mapping such as wq_numa_possible_cpumask will not cause any problem. It contains 4 steps: 1. Enable apic registeration flow to handle both enabled and disabled cpus. 2. Introduce a new array storing all possible cpuid <-> apicid mapping. 3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid. 4. Establish all possible cpuid <-> nodeid mapping. This patch finishes step 2. In this patch, we introduce a new static array named cpuid_to_apicid[], which is large enough to store info for all possible cpus. And then, we modify the cpuid calculation. In generic_processor_info(), it simply finds the next unused cpuid. And it is also why the cpuid <-> nodeid mapping changes with node hotplug. After this patch, we find the next unused cpuid, map it to an apicid, and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid mapping will be persistent. And finally we will use this array to make cpuid <-> nodeid persistent. cpuid <-> apicid mapping is established at local apic registeration time. But non-present or disabled cpus are ignored. In this patch, we establish all possible cpuid <-> apicid mapping when registering local apic. Signed-off-by: Gu Zheng <[email protected]> Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Zhu Guihua <[email protected]> Signed-off-by: Dou Liyang <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-21x86/acpi: Enable acpi to register all possible cpus at boot timeGu Zheng1-4/+15
cpuid <-> nodeid mapping is firstly established at boot time. And workqueue caches the mapping in wq_numa_possible_cpumask in wq_numa_init() at boot time. When doing node online/offline, cpuid <-> nodeid mapping is established/destroyed, which means, cpuid <-> nodeid mapping will change if node hotplug happens. But workqueue does not update wq_numa_possible_cpumask. So here is the problem: Assume we have the following cpuid <-> nodeid in the beginning: Node | CPU ------------------------ node 0 | 0-14, 60-74 node 1 | 15-29, 75-89 node 2 | 30-44, 90-104 node 3 | 45-59, 105-119 and we hot-remove node2 and node3, it becomes: Node | CPU ------------------------ node 0 | 0-14, 60-74 node 1 | 15-29, 75-89 and we hot-add node4 and node5, it becomes: Node | CPU ------------------------ node 0 | 0-14, 60-74 node 1 | 15-29, 75-89 node 4 | 30-59 node 5 | 90-119 But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and the like. When a pool workqueue is initialized, if its cpumask belongs to a node, its pool->node will be mapped to that node. And memory used by this workqueue will also be allocated on that node. static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs){ ... /* if cpumask is contained inside a NUMA node, we belong to that node */ if (wq_numa_enabled) { for_each_node(node) { if (cpumask_subset(pool->attrs->cpumask, wq_numa_possible_cpumask[node])) { pool->node = node; break; } } } Since wq_numa_possible_cpumask is not updated, it could be mapped to an offline node, which will lead to memory allocation failure: SLUB: Unable to allocate memory on node 2 (gfp=0x80d0) cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0 node 0: slabs: 6172, objs: 259224, free: 245741 node 1: slabs: 3261, objs: 136962, free: 127656 It happens here: create_worker(struct worker_pool *pool) |--> worker = alloc_worker(pool->node); static struct worker *alloc_worker(int node) { struct worker *worker; worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, node); --> Here, useing the wrong node. ...... return worker; } [Solution] There are four mappings in the kernel: 1. nodeid (logical node id) <-> pxm 2. apicid (physical cpu id) <-> nodeid 3. cpuid (logical cpu id) <-> apicid 4. cpuid (logical cpu id) <-> nodeid 1. pxm (proximity domain) is provided by ACPI firmware in SRAT, and nodeid <-> pxm mapping is setup at boot time. This mapping is persistent, won't change. 2. apicid <-> nodeid mapping is setup using info in 1. The mapping is setup at boot time and CPU hotadd time, and cleared at CPU hotremove time. This mapping is also persistent. 3. cpuid <-> apicid mapping is setup at boot time and CPU hotadd time. cpuid is allocated, lower ids first, and released at CPU hotremove time, reused for other hotadded CPUs. So this mapping is not persistent. 4. cpuid <-> nodeid mapping is also setup at boot time and CPU hotadd time, and cleared at CPU hotremove time. As a result of 3, this mapping is not persistent. To fix this problem, we establish cpuid <-> nodeid mapping for all the possible cpus at boot time, and make it persistent. And according to init_cpu_to_node(), cpuid <-> nodeid mapping is based on apicid <-> nodeid mapping and cpuid <-> apicid mapping. So the key point is obtaining all cpus' apicid. apicid can be obtained by _MAT (Multiple APIC Table Entry) method or found in MADT (Multiple APIC Description Table). So we finish the job in the following steps: 1. Enable apic registeration flow to handle both enabled and disabled cpus. This is done by introducing an extra parameter to generic_processor_info to let the caller control if disabled cpus are ignored. 2. Introduce a new array storing all possible cpuid <-> apicid mapping. And also modify the way cpuid is calculated. Establish all possible cpuid <-> apicid mapping when registering local apic. Store the mapping in this array. 3. Enable _MAT and MADT relative apis to return non-present or disabled cpus' apicid. This is also done by introducing an extra parameter to these apis to let the caller control if disabled cpus are ignored. 4. Establish all possible cpuid <-> nodeid mapping. This is done via an additional acpi namespace walk for processors. This patch finished step 1. Signed-off-by: Gu Zheng <[email protected]> Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Zhu Guihua <[email protected]> Signed-off-by: Dou Liyang <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-20x86/apic: Get rid of apic_version[] arrayDenys Vlasenko1-10/+7
The array has a size of MAX_LOCAL_APIC, which can be as large as 32k, so it can consume up to 128k. The array has been there forever and was never used for anything useful other than a version mismatch check which was introduced in 2009. There is no reason to store the version in an array. The kernel is not prepared to handle different APIC versions anyway, so the real important part is to detect a version mismatch and warn about it, which can be done with a single variable as well. [ tglx: Massaged changelog ] Signed-off-by: Denys Vlasenko <[email protected]> CC: Andy Lutomirski <[email protected]> CC: Borislav Petkov <[email protected]> CC: Brian Gerst <[email protected]> CC: Mike Travis <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-08x86/apic: Fix num_processors value in case of failureDou Liyang1-1/+3
If the topology package map check of the APIC ID and the CPU is a failure, we don't generate the processor info for that APIC ID yet we increase disabled_cpus by one - which is buggy. Only increase num_processors once we are sure we don't fail. Signed-off-by: Dou Liyang <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-24x86/apic: Update comment about disabling processor focusWei Jiangang1-1/+0
Fix references to discarded end_level_ioapic_irq(). Signed-off-by: Wei Jiangang <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-24x86/apic: Do not init irq remapping if ioapic is disabledWanpeng Li1-0/+3
native_smp_prepare_cpus -> default_setup_apic_routing -> enable_IR_x2apic -> irq_remapping_prepare -> intel_prepare_irq_remapping -> intel_setup_irq_remapping So IR table is setup even if "noapic" boot parameter is added. As a result we crash later when the interrupt affinity is set due to a half initialized remapping infrastructure. Prevent remap initialization when IOAPIC is disabled. Signed-off-by: Wanpeng Li <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Joerg Roedel <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-08-15x86/apic, ACPI: Remove the repeated lapic address override entry parsingBaoquan He1-1/+1
The ACPI MADT has a 32-bit field providing lapic address at which each processor can access its lapic information. MADT also contains an optional entry to provide a 64-bit address to override the 32-bit one. However the current code does the lapic address override entry parsing twice. One is in early_acpi_boot_init() because AMD NUMA need get boot_cpu_id earlier. The other is in acpi_boot_init() which parses all MADT entries. So in this patch we remove the repeated code in the 2nd part. Meanwhile print lapic override entry information like other MADT entry, this will be added to boot log. This patch is not supposed to change any runtime behavior, other than improving kernel messages. Signed-off-by: Baoquan He <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-10Merge branch 'linus' into timers/urgent, to pick up fixesIngo Molnar1-2/+2
Signed-off-by: Ingo Molnar <[email protected]>
2016-08-10x86/timers/apic: Inform TSC deadline clockevent device about recalibrationNicolai Stange1-0/+24
This patch eliminates a source of imprecise APIC timer interrupts, which imprecision may result in double interrupts or even late interrupts. The TSC deadline clockevent devices' configuration and registration happens before the TSC frequency calibration is refined in tsc_refine_calibration_work(). This results in the TSC clocksource and the TSC deadline clockevent devices being configured with slightly different frequencies: the former gets the refined one and the latter are configured with the inaccurate frequency detected earlier by means of the "Fast TSC calibration using PIT". Within the APIC code, introduce the notifier function lapic_update_tsc_freq() which reconfigures all per-CPU TSC deadline clockevent devices with the current tsc_khz. Call it from the TSC code after TSC calibration refinement has happened. Signed-off-by: Nicolai Stange <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christopher S. Hall <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Viresh Kumar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Pushed #ifdef CONFIG_X86_LOCAL_APIC into header, improved changelog. ] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-10x86/timers/apic: Fix imprecise timer interrupts by eliminating TSC ↵Nicolai Stange1-2/+2
clockevents frequency roundoff error I noticed the following bug/misbehavior on certain Intel systems: with a single task running on a NOHZ CPU on an Intel Haswell, I recognized that I did not only get the one expected local_timer APIC interrupt, but two per second at minimum. (!) Further tracing showed that the first one precedes the programmed deadline by up to ~50us and hence, it did nothing except for reprogramming the TSC deadline clockevent device to trigger shortly thereafter again. The reason for this is imprecise calibration, the timeout we program into the APIC results in 'too short' timer interrupts. The core (hr)timer code notices this (because it has a precise ktime source and sees the short interrupt) and fixes it up by programming an additional very short interrupt period. This is obviously suboptimal. The reason for the imprecise calibration is twofold, and this patch fixes the first reason: In setup_APIC_timer(), the registered clockevent device's frequency is calculated by first dividing tsc_khz by TSC_DIVISOR and multiplying it with 1000 afterwards: (tsc_khz / TSC_DIVISOR) * 1000 The multiplication with 1000 is done for converting from kHz to Hz and the division by TSC_DIVISOR is carried out in order to make sure that the final result fits into an u32. However, with the order given in this calculation, the roundoff error introduced by the division gets magnified by a factor of 1000 by the following multiplication. To fix it, reversing the order of the division and the multiplication a la: (tsc_khz * 1000) / TSC_DIVISOR ... reduces the roundoff error already. Furthermore, if TSC_DIVISOR divides 1000, associativity holds: (tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR) and thus, the roundoff error even vanishes and the whole operation can be carried out within 32 bits. The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for TSC_DIVISOR still allows for TSC frequencies up to 2^32 / 10^9ns * 8 = 34.4GHz which is way larger than anything to expect in the next years. Thus we also replace the current TSC_DIVISOR value of 32 by 8. Reverse the order of the divison and the multiplication in the calculation of the registered clockevent device's frequency. Signed-off-by: Nicolai Stange <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christopher S. Hall <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Viresh Kumar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Improved changelog. ] Signed-off-by: Ingo Molnar <[email protected]>
2016-08-04tree-wide: replace config_enabled() with IS_ENABLED()Masahiro Yamada1-1/+1
The use of config_enabled() against config options is ambiguous. In practical terms, config_enabled() is equivalent to IS_BUILTIN(), but the author might have used it for the meaning of IS_ENABLED(). Using IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc. makes the intention clearer. This commit replaces config_enabled() with IS_ENABLED() where possible. This commit is only touching bool config options. I noticed two cases where config_enabled() is used against a tristate option: - config_enabled(CONFIG_HWMON) [ drivers/net/wireless/ath/ath10k/thermal.c ] - config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE) [ drivers/gpu/drm/gma500/opregion.c ] I did not touch them because they should be converted to IS_BUILTIN() in order to keep the logic, but I was not sure it was the authors' intention. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Stas Sergeev <[email protected]> Cc: Matt Redfearn <[email protected]> Cc: Joshua Kinard <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Markos Chandras <[email protected]> Cc: "Dmitry V. Levin" <[email protected]> Cc: yu-cheng yu <[email protected]> Cc: James Hogan <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Cc: Will Drewry <[email protected]> Cc: Nikolay Martynov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Leonid Yegoshin <[email protected]> Cc: Rafal Milecki <[email protected]> Cc: James Cowgill <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Alex Smith <[email protected]> Cc: Adam Buchbinder <[email protected]> Cc: Qais Yousef <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Mikko Rapeli <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Brian Norris <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: "Luis R. Rodriguez" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Paul Burton <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Tony Wu <[email protected]> Cc: Huaitong Han <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Jason Cooper <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Andrea Gelmini <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Rabin Vincent <[email protected]> Cc: "Maciej W. Rozycki" <[email protected]> Cc: David Daney <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-01Merge branch 'x86-headers-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header cleanups from Ingo Molnar: "This tree is a cleanup of the x86 tree reducing spurious uses of module.h - which should improve build performance a bit" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, crypto: Restore MODULE_LICENSE() to glue_helper.c so it loads x86/apic: Remove duplicated include from probe_64.c x86/ce4100: Remove duplicated include from ce4100.c x86/headers: Include spinlock_types.h in x8664_ksyms_64.c for missing spinlock_t x86/platform: Delete extraneous MODULE_* tags fromm ts5500 x86: Audit and remove any remaining unnecessary uses of module.h x86/kvm: Audit and remove any unnecessary uses of module.h x86/xen: Audit and remove any unnecessary uses of module.h x86/platform: Audit and remove any unnecessary uses of module.h x86/lib: Audit and remove any unnecessary uses of module.h x86/kernel: Audit and remove any unnecessary uses of module.h x86/mm: Audit and remove any unnecessary uses of module.h x86: Don't use module.h just for AUTHOR / LICENSE tags
2016-07-27Merge tag 'for-linus-4.8-rc0-tag' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.8-rc0: - ACPI support for guests on ARM platforms. - Generic steal time support for arm and x86. - Support cases where kernel cpu is not Xen VCPU number (e.g., if in-guest kexec is used). - Use the system workqueue instead of a custom workqueue in various places" * tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits) xen: add static initialization of steal_clock op to xen_time_ops xen/pvhvm: run xen_vcpu_setup() for the boot CPU xen/evtchn: use xen_vcpu_id mapping xen/events: fifo: use xen_vcpu_id mapping xen/events: use xen_vcpu_id mapping in events_base x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op xen: introduce xen_vcpu_id mapping x86/acpi: store ACPI ids from MADT for future usage x86/xen: update cpuid.h from Xen-4.7 xen/evtchn: add IOCTL_EVTCHN_RESTRICT xen-blkback: really don't leak mode property xen-blkback: constify instance of "struct attribute_group" xen-blkfront: prefer xenbus_scanf() over xenbus_gather() xen-blkback: prefer xenbus_scanf() over xenbus_gather() xen: support runqueue steal time on xen arm/xen: add support for vm_assist hypercall xen: update xen headers xen-pciback: drop superfluous variables xen-pciback: short-circuit read path used for merging write values ...
2016-07-25x86/acpi: store ACPI ids from MADT for future usageVitaly Kuznetsov1-0/+2
Currently we don't save ACPI ids (unlike LAPIC ids which go to x86_cpu_to_apicid) from MADT and we may need this information later. Particularly, ACPI ids is the only existent way for a PVHVM Xen guest to figure out Xen's idea of its vCPUs ids before these CPUs boot and in some cases these ids diverge from Linux's cpu ids. Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: David Vrabel <[email protected]>
2016-07-14x86/kernel: Audit and remove any unnecessary uses of module.hPaul Gortmaker1-1/+1
Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance for the presence of either and replace as needed. Build testing revealed some implicit header usage that was fixed up accordingly. Note that some bool/obj-y instances remain since module.h is the header for some exception table entry stuff, and for things like __init_or_module (code that is tossed when MODULES=n). Signed-off-by: Paul Gortmaker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-06-10x86/apic: Fix misspelled APICClaudio Fontana1-3/+3
Signed-off-by: Claudio Fontana <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-04-13x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usageBorislav Petkov1-10/+10
Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-04-13x86/cpufeature: Replace cpu_has_tsc with boot_cpu_has() usageBorislav Petkov1-5/+5
Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Sailer <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-31x86/cpufeature: Remove cpu_has_x2apicBorislav Petkov1-1/+1
Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Tony Luck <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-17Merge branch 'x86/cleanups' into x86/urgentIngo Molnar1-1/+1
Pull in some merge window leftovers. Signed-off-by: Ingo Molnar <[email protected]>
2016-02-29x86/topology: Create logical package idThomas Gleixner1-0/+14
For per package oriented services we must be able to rely on the number of CPU packages to be within bounds. Create a tracking facility, which - calculates the number of possible packages depending on nr_cpu_ids after boot - makes sure that the package id is within the number of possible packages. If the apic id is outside we map it to a logical package id if there is enough space available. Provide interfaces for drivers to query the mapping and do translations from physcial to logical ids. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Harish Chegondi <[email protected]> Cc: Jacob Pan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luis R. Rodriguez <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Toshi Kani <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-02-24x86: Fix misspellings in commentsAdam Buchbinder1-1/+1
Signed-off-by: Adam Buchbinder <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-12-19x86/apic: Introduce apic_extnmi command line parameterHidehiro Kawai1-2/+32
This patch introduces a command line parameter apic_extnmi: apic_extnmi=( bsp|all|none ) The default value is "bsp" and this is the current behavior: only the Boot-Strapping Processor receives an external NMI. "all" allows external NMIs to be broadcast to all CPUs. This would raise the success rate of panic on NMI when BSP hangs in NMI context or the external NMI is swallowed by other NMI handlers on the BSP. If you specify "none", no CPUs receive external NMIs. This is useful for the dump capture kernel so that it cannot be shot down by accidentally pressing the external NMI button (on platforms which have it) while saving a crash dump. Signed-off-by: Hidehiro Kawai <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Bandan Das <[email protected]> Cc: Baoquan He <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Maciej W. Rozycki" <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ricardo Ribalda Delgado <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: x86-ml <[email protected]> Link: http://lkml.kernel.org/r/20151210014632.25437.43778.stgit@softrs Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2015-11-24x86/apic: Fix the saving and restoring of lapic vectors during suspend/resumeJuergen Gross1-1/+10
Saving and restoring lapic vectors in lapic_suspend() and lapic_resume() is not consistent: the thmr vector saving is guarded by a different config option than the restore part. The cmci vector isn't handled at all. Those inconsistencies are not very critical, as the missing cmci vector will be set via mce resume handling, the wrong config option used for restoring the thmr vector can't be configured differently than the one which should be used. Nevertheless correct the thmr vector restore and add cmci vector handling. Signed-off-by: Juergen Gross <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Minor code edits. ] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-30x86/apic: Deinline various functionsDenys Vlasenko1-4/+4
__x2apic_disable: 178 bytes, 3 calls __x2apic_enable: 117 bytes, 3 calls __smp_spurious_interrupt: 110 bytes, 2 calls __smp_error_interrupt: 208 bytes, 2 calls Reduces code size by about 850 bytes. Signed-off-by: Denys Vlasenko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-09-14x86/apic: Serialize LVTT and TSC_DEADLINE writesShaohua Li1-0/+7
The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not guaranteed that the write to LVTT has reached the APIC before the TSC_DEADLINE MSR is written. In such a case the write to the MSR is ignored and as a consequence the local timer interrupt never fires. The SDM decribes this issue for xAPIC and x2APIC modes. The serialization methods recommended by the SDM differ. xAPIC: "1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b. 2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter. 3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2. 4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline." x2APIC: "To allow for efficient access to the APIC registers in x2APIC mode, the serializing semantics of WRMSR are relaxed when writing to the APIC registers. Thus, system software should not use 'WRMSR to APIC registers in x2APIC mode' as a serializing instruction. Read and write accesses to the APIC registers will occur in program order. A WRMSR to an APIC register may complete before all preceding stores are globally visible; software can prevent this by inserting a serializing instruction, an SFENCE, or an MFENCE before the WRMSR." The xAPIC method is to just wait for the memory mapped write to hit the LVTT by checking whether the MSR write has reached the hardware. There is no reason why a proper MFENCE after the memory mapped write would not do the same. Andi Kleen confirmed that MFENCE is sufficient for the xAPIC case as well. Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE support. [ tglx: Massaged the changelog ] Signed-off-by: Shaohua Li <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Andi Kleen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: [email protected] #v3.7+ Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-09-01Merge branch 'x86-apic-for-linus' of ↵Linus Torvalds1-40/+44
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "This udpate contains: - rework the irq vector array to store a pointer to the irq descriptor instead of the irq number to avoid a lookup of the irq descriptor in the irq entry path - lguest interrupt handling cleanups - conversion of the local apic timer to the new clockevent callbacks - preparatory changes for the irq argument removal of interrupt flow handlers" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Do not dereference irq descriptor before checking it tools/lguest: Clean up include dir tools/lguest: Fix redefinition of struct virtio_pci_cfg_cap x86/irq: Store irq descriptor in vector array genirq: Provide irq_desc_has_action x86/irq: Get rid of an indentation level x86/irq: Rename VECTOR_UNDEFINED to VECTOR_UNUSED x86/irq: Replace numeric constant x86/irq: Protect smp_cleanup_move x86/lguest: Do not setup unused irq vectors x86/lguest: Clean up lguest_setup_irq x86/apic: Drop local_irq_save/restore in timer callbacks x86/apic: Migrate apic timer to new set_state interface x86/irq: Use access helper irq_data_get_affinity_mask() x86/irq: Use accessor irq_data_get_irq_handler_data() x86/irq: Use accessor irq_data_get_node()
2015-09-01Merge branch 'x86-asm-for-linus' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 asm changes from Ingo Molnar: "The biggest changes in this cycle were: - Revamp, simplify (and in some cases fix) Time Stamp Counter (TSC) primitives. (Andy Lutomirski) - Add new, comprehensible entry and exit handlers written in C. (Andy Lutomirski) - vm86 mode cleanups and fixes. (Brian Gerst) - 32-bit compat code cleanups. (Brian Gerst) The amount of simplification in low level assembly code is already palpable: arch/x86/entry/entry_32.S | 130 +---- arch/x86/entry/entry_64.S | 197 ++----- but more simplifications are planned. There's also the usual laudry mix of low level changes - see the changelog for details" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (83 commits) x86/asm: Drop repeated macro of X86_EFLAGS_AC definition x86/asm/msr: Make wrmsrl() a function x86/asm/delay: Introduce an MWAITX-based delay with a configurable timer x86/asm: Add MONITORX/MWAITX instruction support x86/traps: Weaken context tracking entry assertions x86/asm/tsc: Add rdtscll() merge helper selftests/x86: Add syscall_nt selftest selftests/x86: Disable sigreturn_64 x86/vdso: Emit a GNU hash x86/entry: Remove do_notify_resume(), syscall_trace_leave(), and their TIF masks x86/entry/32: Migrate to C exit path x86/entry/32: Remove 32-bit syscall audit optimizations x86/vm86: Rename vm86->v86flags and v86mask x86/vm86: Rename vm86->vm86_info to user_vm86 x86/vm86: Clean up vm86.h includes x86/vm86: Move the vm86 IRQ definitions to vm86.h x86/vm86: Use the normal pt_regs area for vm86 x86/vm86: Eliminate 'struct kernel_vm86_struct' x86/vm86: Move fields from 'struct kernel_vm86_struct' to 'struct vm86' x86/vm86: Move vm86 fields out of 'thread_struct' ...
2015-08-22x86/apic: Fix fallout from x2apic cleanupThomas Gleixner1-7/+7
In the recent x2apic cleanup I got two things really wrong: 1) The safety check in __disable_x2apic which allows the function to be called unconditionally is backwards. The check is there to prevent access to the apic MSR in case that the machine has no apic. Though right now it returns if the machine has an apic and therefor the disabling of x2apic is never invoked. 2) x2apic_disable() sets x2apic_mode to 0 after registering the local apic. That's wrong, because register_lapic_address() checks x2apic mode and therefor takes the wrong code path. This results in boot failures on machines with x2apic preenabled by BIOS and can also lead to an fatal MSR access on machines without apic. The solutions are simple: 1) Correct the sanity check for apic availability 2) Clear x2apic_mode _before_ calling register_lapic_address() Fixes: 659006bf3ae3 'x86/x2apic: Split enable and setup function' Reported-and-tested-by: Javier Monteagudo <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://bugzilla.redhat.com/show_bug.cgi?id=1224764 Cc: [email protected] # 4.0+ Cc: Laura Abbott <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Cc: Borislav Petkov <[email protected]>
2015-07-30x86/apic: Drop local_irq_save/restore in timer callbacksThomas Gleixner1-15/+3
These callbacks are called with interrupts disabled from the core code. Fixup the local caller to disable interrupts. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Viresh Kumar <[email protected]>
2015-07-30x86/apic: Migrate apic timer to new set_state interfaceViresh Kumar1-35/+51
Migrate apic driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. We weren't doing anything while switching to resume mode and so that callback isn't implemented. Signed-off-by: Viresh Kumar <[email protected]> Cc: [email protected] Cc: Jiang Liu <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Rientjes <[email protected]> Cc: Bandan Das <[email protected]> Link: http://lkml.kernel.org/r/1896ac5989d27f2ac37f4786af9bd537e1921b83.1437042675.git.viresh.kumar@linaro.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-07-06x86/asm/tsc: Rename native_read_tsc() to rdtsc()Andy Lutomirski1-4/+4
Now that there is no paravirt TSC, the "native" is inappropriate. The function does RDTSC, so give it the obvious name: rdtsc(). Suggested-by: Borislav Petkov <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang Rui <[email protected]> Cc: John Stultz <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: kvm ML <[email protected]> Link: http://lkml.kernel.org/r/fd43e16281991f096c1e4d21574d9e1402c62d39.1434501121.git.luto@kernel.org [ Ported it to v4.2-rc1. ] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-06x86/asm/tsc: Replace rdtscll() with native_read_tsc()Andy Lutomirski1-4/+4
Now that the ->read_tsc() paravirt hook is gone, rdtscll() is just a wrapper around native_read_tsc(). Unwrap it. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang Rui <[email protected]> Cc: John Stultz <[email protected]> Cc: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: kvm ML <[email protected]> Link: http://lkml.kernel.org/r/d2449ae62c1b1fb90195bcfb19ef4a35883a04dc.1434501121.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-04-01x86/apic: Remove verify_local_APIC()Bandan Das1-62/+0
__verify_local_APIC() is detritus from the early APIC days. Its return value isn't used anywhere and the information it prints when debug is enabled is already part of APIC initialization messages printed to syslog. Off with it! Signed-off-by: Bandan Das <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-02-13Revert "x86/apic: Only disable CPU x2apic mode when necessary"Linus Torvalds1-2/+1
This reverts commit 5fcee53ce705d49c766f8a302c7e93bdfc33c124. It causes the suspend to fail on at least the Chromebook Pixel, possibly other platforms too. Joerg Roedel points out that the logic should probably have been if (max_physical_apicid > 255 || !(IS_ENABLED(CONFIG_HYPERVISOR_GUEST) && hypervisor_x2apic_available())) { instead, but since the code is not in any fast-path, so we can just live without that optimization and just revert to the original code. Acked-by: Joerg Roedel <[email protected]> Acked-by: Jiang Liu <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-01-22x86: Consolidate boot cpu timer setupThomas Gleixner1-2/+2
Now that the APIC bringup is consolidated we can move the setup call for the percpu clock event device to apic_bsp_setup(). Signed-off-by: Thomas Gleixner <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Cc: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-01-22x86/apic: Reuse apic_bsp_setup() for UP APIC setupThomas Gleixner1-31/+22
Extend apic_bsp_setup() so the same code flow can be used for APIC_init_uniprocessor(). Folded Jiangs fix to provide proper ordering of the UP setup. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Cc: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-01-22x86/smpboot: Move apic init code to apic.cThomas Gleixner1-10/+41
We better provide proper functions which implement the required code flow in the apic code rather than letting the smpboot code open code it. That allows to make more functions static and confines the APIC functionality to apic.c where it belongs. Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-01-22init: Get rid of x86ismsThomas Gleixner1-0/+7
The UP local API support can be set up from an early initcall. No need for horrible hackery in the init code. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Cc: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-01-22x86/apic: Move apic_init_uniprocessor codeThomas Gleixner1-63/+62
Move the code to a different place so we can make other functions inline. Preparatory patch for further cleanups. No change. Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-01-22x86/apic: Sanitize ioapic handlingThomas Gleixner1-14/+3
We have proper stubs for the IOAPIC=n case and the setup/enable function have the required checks inside now. Remove the ifdeffery and the copy&pasted conditionals. Signed-off-by: Thomas Gleixner <[email protected]>C Acked-by: Borislav Petkov <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Tony Luck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>