aboutsummaryrefslogtreecommitdiff
path: root/arch/x86
AgeCommit message (Collapse)AuthorFilesLines
2013-10-17kvm: Add struct kvm arg to memslot APIsAneesh Kumar K.V1-2/+3
We will use that in the later patch to find the kvm ops handler Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-10-17x86 / msr: add 64bit _on_cpu access functionsJacob Pan2-0/+84
Having 64-bit MSR access methods on given CPU can avoid shifting and simplify MSR content manipulation. We already have other combinations of rdmsrl_xxx and wrmsrl_xxx but missing the _on_cpu version. Signed-off-by: Srinivas Pandruvada <[email protected]> Signed-off-by: Jacob Pan <[email protected]> Reviewed-by: H. Peter Anvin <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2013-10-16perf/x86: Optimize intel_pmu_pebs_fixup_ip()Peter Zijlstra1-14/+38
There's been reports of high NMI handler overhead, highlighted by such kernel messages: [ 3697.380195] perf samples too long (10009 > 10000), lowering kernel.perf_event_max_sample_rate to 13000 [ 3697.389509] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 9.331 msecs Don Zickus analyzed the source of the overhead and reported: > While there are a few places that are causing latencies, for now I focused on > the longest one first. It seems to be 'copy_user_from_nmi' > > intel_pmu_handle_irq -> > intel_pmu_drain_pebs_nhm -> > __intel_pmu_drain_pebs_nhm -> > __intel_pmu_pebs_event -> > intel_pmu_pebs_fixup_ip -> > copy_from_user_nmi > > In intel_pmu_pebs_fixup_ip(), if the while-loop goes over 50, the sum of > all the copy_from_user_nmi latencies seems to go over 1,000,000 cycles > (there are some cases where only 10 iterations are needed to go that high > too, but in generall over 50 or so). At this point copy_user_from_nmi > seems to account for over 90% of the nmi latency. The solution to that is to avoid having to call copy_from_user_nmi() for every instruction. Since we already limit the max basic block size, we can easily pre-allocate a piece of memory to copy the entire thing into in one go. Don reported this test result: > Your patch made a huge difference in improvement. The > copy_from_user_nmi() no longer hits the million of cycles. I still > have a batch of 100,000-300,000 cycles. My longest NMI paths used > to be dominated by copy_from_user_nmi, now it is not (I have to dig > up the new hot path). Reported-and-tested-by: Don Zickus <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-15Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-3/+14
Pull kvm fix from Gleb Natapov. * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Enable pvspinlock after jump_label_init() to avoid VM hang
2013-10-15Merge tag 'stable/for-linus-3.12-rc4-tag' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fixes from Stefano Stabellini: "A small fix for Xen on x86_32 and a build fix for xen-tpmfront on arm64" * tag 'stable/for-linus-3.12-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: Fix possible user space selector corruption tpm: xen-tpmfront: fix missing declaration of xen_domain
2013-10-15KVM: Enable pvspinlock after jump_label_init() to avoid VM hangRaghavendra K T1-3/+14
We use jump label to enable pv-spinlock. With the changes in (442e0973e927 Merge branch 'x86/jumplabel'), the jump label behaviour has changed that would result in eventual hang of the VM since we would end up in a situation where slow path locks would halt the vcpus but we will not be able to wakeup the vcpu by lock releaser using unlock kick. Similar problem in Xen and more detailed description is available in a945928ea270 (xen: Do not enable spinlocks before jump_label_init() has executed) This patch splits kvm_spinlock_init to separate jump label changes with pvops patching and also make jump label enabling after jump_label_init(). Signed-off-by: Raghavendra K T <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-10-15KVM: Drop FOLL_GET in GUP when doing async page faultchai wen1-2/+2
Page pinning is not mandatory in kvm async page fault processing since after async page fault event is delivered to a guest it accesses page once again and does its own GUP. Drop the FOLL_GET flag in GUP in async_pf code, and do some simplifying in check/clear processing. Suggested-by: Gleb Natapov <[email protected]> Signed-off-by: Gu zheng <[email protected]> Signed-off-by: chai wen <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-10-15x86: Update UV3 hub revision IDRuss Anderson1-1/+1
The UV3 hub revision ID is different than expected. The first revision was supposed to start at 1 but instead will start at 0. Signed-off-by: Russ Anderson <[email protected]> Cc: <[email protected]> # v3.9, v3.10, v3.11 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-15Merge tag 'v3.12-rc5' into perf/coreIngo Molnar8-25/+40
Merge Linux v3.12-rc5, to pick up the latest fixes. Signed-off-by: Ingo Molnar <[email protected]>
2013-10-14treewide: fix "distingush" typoMichael Opdenacker1-1/+1
Signed-off-by: Michael Opdenacker <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2013-10-14treewide: Fix common typo in "identify"Maxime Jayat2-2/+2
Correct common misspelling of "identify" as "indentify" throughout the kernel Signed-off-by: Maxime Jayat <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2013-10-14x86/microcode: Correct Kconfig dependenciesBorislav Petkov1-0/+1
I have a randconfig here which has enabled only CONFIG_MICROCODE=y CONFIG_MICROCODE_OLD_INTERFACE=y with both # CONFIG_MICROCODE_INTEL is not set # CONFIG_MICROCODE_AMD is not set off. Which makes building the microcode functionality a little pointless. Don't do that in such cases then. Signed-off-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-14KVM: Move gfn_to_index to x86 specific codeChristoffer Dall1-0/+7
The gfn_to_index function relies on huge page defines which either may not make sense on systems that don't support huge pages or are defined in an unconvenient way for other architectures. Since this is x86-specific, move the function to arch/x86/include/asm/kvm_host.h. Signed-off-by: Christoffer Dall <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-10-12Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds3-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull gcc "asm goto" miscompilation workaround from Ingo Molnar: "This is the fix for the GCC miscompilation discussed in the following lkml thread: [x86] BUG: unable to handle kernel paging request at 00740060 The bug in GCC has been fixed by Jakub and the fix will be part of the GCC 4.8.2 release expected to be released next week - so the quirk's version test checks for <= 4.8.1. The quirk is only added to compiler-gcc4.h and not to the higher level compiler.h because all asm goto uses are behind a feature check" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compiler/gcc4: Add quirk for 'asm goto' miscompilation bug
2013-10-12Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds2-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A build fix and a reboot quirk" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/reboot: Add reboot quirk for Dell Latitude E5410 x86, build, pci: Fix PCI_MSI build on !SMP
2013-10-11x86: Apply the asm_volatile_goto() compiler quirkIngo Molnar1-1/+1
Apply the asm_volatile_goto() compiler quirk to the new rmwcc.h file as well, introduced in: c2daa3bed53a sched, x86: Provide a per-cpu preempt_count implementation Reported-and-tested-by: Fengguang Wu <[email protected]> Reported-by: Oleg Nesterov <[email protected]> Reported-by: Peter Zijlstra <[email protected]> Suggested-by: Jakub Jelinek <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2013-10-11Merge branch 'core/urgent' into sched/coreIngo Molnar5-26/+21
Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix. Signed-off-by: Ingo Molnar <[email protected]>
2013-10-11compiler/gcc4: Add quirk for 'asm goto' miscompilation bugIngo Molnar3-6/+6
Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto' constructs, as outlined here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Implement a workaround suggested by Jakub Jelinek. Reported-and-tested-by: Fengguang Wu <[email protected]> Reported-by: Oleg Nesterov <[email protected]> Reported-by: Peter Zijlstra <[email protected]> Suggested-by: Jakub Jelinek <[email protected]> Reviewed-by: Richard Henderson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2013-10-10x86, hyperv: Correctly guard the local APIC calibration codeK. Y. Srinivasan1-0/+2
The code that gets the local APIC timer frequency from the hypervisor rather depends on there being a local APIC. Signed-off-by: K. Y. Srinivasan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-10-10x86, hyperv: Get the local APIC timer frequency from the hypervisorK. Y. Srinivasan2-0/+43
Hyper-V supports a mechanism for retrieving the local APIC frequency. Use this and bypass the calibration code in the kernel . This would allow us to boot the Linux kernel as a "modern VM" on Hyper-V where many of the legacy devices (such as PIT) are not emulated. I would like to thank Olaf Hering <[email protected]>, Jan Beulich <[email protected]> and H. Peter Anvin <[email protected]> for their help in this effort. In this version of the patch, I have addressed Jan's comments. Signed-off-by: K. Y. Srinivasan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Tested-by: Olaf Hering <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2013-10-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-12/+12
Pull kvm fixes from Paolo Bonzini: "Fixes for 3.12-rc5: two old PPC bugs and one new (3.12-rc2) x86 bug" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: ppc: booke: check range page invalidation progress on page setup KVM: PPC: Book3S HV: Fix typo in saving DSCR KVM: nVMX: fix shadow on EPT
2013-10-10KVM: nVMX: Fully support nested VMX preemption timerArthur Chunqi Li2-2/+41
This patch contains the following two changes: 1. Fix the bug in nested preemption timer support. If vmexit L2->L0 with some reasons not emulated by L1, preemption timer value should be save in such exits. 2. Add support of "Save VMX-preemption timer value" VM-Exit controls to nVMX. With this patch, nested VMX preemption timer features are fully supported. Signed-off-by: Arthur Chunqi Li <[email protected]> Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2013-10-10xen: Fix possible user space selector corruptionFrediano Ziglio1-0/+9
Due to the way kernel is initialized under Xen is possible that the ring1 selector used by the kernel for the boot cpu end up to be copied to userspace leading to segmentation fault in the userspace. Xen code in the kernel initialize no-boot cpus with correct selectors (ds and es set to __USER_DS) but the boot one keep the ring1 (passed by Xen). On task context switch (switch_to) we assume that ds, es and cs already point to __USER_DS and __KERNEL_CSso these selector are not changed. If processor is an Intel that support sysenter instruction sysenter/sysexit is used so ds and es are not restored switching back from kernel to userspace. In the case the selectors point to a ring1 instead of __USER_DS the userspace code will crash on first memory access attempt (to be precise Xen on the emulated iret used to do sysexit will detect and set ds and es to zero which lead to GPF anyway). Now if an userspace process call kernel using sysenter and get rescheduled (for me it happen on a specific init calling wait4) could happen that the ring1 selector is set to ds and es. This is quite hard to detect cause after a while these selectors are fixed (__USER_DS seems sticky). Bisecting the code commit 7076aada1040de4ed79a5977dbabdb5e5ea5e249 appears to be the first one that have this issue. Signed-off-by: Frediano Ziglio <[email protected]> Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Andrew Cooper <[email protected]>
2013-10-10swiotlb-xen: use xen_alloc/free_coherent_pagesStefano Stabellini1-2/+5
Use xen_alloc_coherent_pages and xen_free_coherent_pages to allocate or free coherent pages. We need to be careful handling the pointer returned by xen_alloc_coherent_pages, because on ARM the pointer is not equal to phys_to_virt(*dma_handle). In fact virt_to_phys only works for kernel direct mapped RAM memory. In ARM case the pointer could be an ioremap address, therefore passing it to virt_to_phys would give you another physical address that doesn't correspond to it. Make xen_create_contiguous_region take a phys_addr_t as start parameter to avoid the virt_to_phys calls which would be incorrect. Changes in v6: - remove extra spaces. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
2013-10-10x86/PCI: Coalesce multiple overlapping host bridge windowsAlexey Neyman1-4/+4
Previously we coalesced windows by expanding the first overlapping one and making the second invalid. But we never look at the expanded first window again, so we fail to notice other windows that overlap it. For example, we coalesced these: [io 0x0000-0x03af] // #0 [io 0x03e0-0x0cf7] // #1 [io 0x0000-0xdfff] // #2 into these, which still overlap: [io 0x0000-0xdfff] // #0 [io 0x03e0-0x0cf7] // #1 The fix is to expand the *second* overlapping resource and ignore the first, so we get this instead with no overlaps: [io 0x0000-0xdfff] // #2 [bhelgaas: changelog] Reference: https://bugzilla.kernel.org/show_bug.cgi?id=62511 Signed-off-by: Alexey Neyman <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2013-10-10KVM: nVMX: fix shadow on EPTGleb Natapov1-12/+12
72f857950f6f19 broke shadow on EPT. This patch reverts it and fixes PAE on nEPT (which reverted commit fixed) in other way. Shadow on EPT is now broken because while L1 builds shadow page table for L2 (which is PAE while L2 is in real mode) it never loads L2's GUEST_PDPTR[0-3]. They do not need to be loaded because without nested virtualization HW does this during guest entry if EPT is disabled, but in our case L0 emulates L2's vmentry while EPT is enables, so we cannot rely on vmcs12->guest_pdptr[0-3] to contain up-to-date values and need to re-read PDPTEs from L2 memory. This is what kvm_set_cr3() is doing, but by clearing cache bits during L2 vmentry we drop values that kvm_set_cr3() read from memory. So why the same code does not work for PAE on nEPT? kvm_set_cr3() reads pdptes into vcpu->arch.walk_mmu->pdptrs[]. walk_mmu points to vcpu->arch.nested_mmu while nested guest is running, but ept_load_pdptrs() uses vcpu->arch.mmu which contain incorrect values. Fix that by using walk_mmu in ept_(load|save)_pdptrs. Signed-off-by: Gleb Natapov <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Tested-by: Paolo Bonzini <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2013-10-09of: remove HAVE_ARCH_DEVTREE_FIXUPSRob Herring1-2/+0
HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc, but it is only used for /proc/device-teee and sparc does not enable /proc/device-tree. So this option is redundant. Remove the option and always enable it. This has the side effect of fixing /proc/device-tree on arches such as arm64 which failed to define this option. Signed-off-by: Rob Herring <[email protected]> Acked-by: Vineet Gupta <[email protected]> Acked-by: Grant Likely <[email protected]> Cc: Russell King <[email protected]> Cc: James Hogan <[email protected]> Cc: Michal Simek <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: Chris Zankel <[email protected]> Cc: Max Filippov <[email protected]>
2013-10-09of: implement pci_address_to_pio as weak functionRob Herring2-13/+0
Implement pci_address_to_pio as weak function to remove the dependency on asm/prom.h. This is in preparation to make prom.h optional. Signed-off-by: Rob Herring <[email protected]> Acked-by: Grant Likely <[email protected]> Cc: Michal Simek <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: Grant Likely <[email protected]>
2013-10-09x86: add necessary includes for prom.hRob Herring1-0/+1
Once prom.h is no longer implicitly included, we need to include setup.h to get COMMAND_LINE_SIZE. Signed-off-by: Rob Herring <[email protected]> Acked-by: Grant Likely <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected]
2013-10-09xen: introduce xen_alloc/free_coherent_pagesStefano Stabellini1-0/+24
xen_swiotlb_alloc_coherent needs to allocate a coherent buffer for cpu and devices. On native x86 is sufficient to call __get_free_pages in order to get a coherent buffer, while on ARM (and potentially ARM64) we need to call the native dma_ops->alloc implementation. Introduce xen_alloc_coherent_pages to abstract the arch specific buffer allocation. Similarly introduce xen_free_coherent_pages to free a coherent buffer: on x86 is simply a call to free_pages while on ARM and ARM64 is arm_dma_ops.free. Signed-off-by: Stefano Stabellini <[email protected]> Changes in v7: - rename __get_dma_ops to __generic_dma_ops; - call __generic_dma_ops(hwdev)->alloc/free on arm64 too. Changes in v6: - call __get_dma_ops to get the native dma_ops pointer on arm.
2013-10-09xen: make xen_create_contiguous_region return the dma addressStefano Stabellini1-1/+3
Modify xen_create_contiguous_region to return the dma address of the newly contiguous buffer. Signed-off-by: Stefano Stabellini <[email protected]> Acked-by: Konrad Rzeszutek Wilk <[email protected]> Reviewed-by: David Vrabel <[email protected]> Changes in v4: - use virt_to_machine instead of virt_to_bus.
2013-10-09xen/x86: allow __set_phys_to_machine for autotranslate guestsStefano Stabellini1-3/+3
Allow __set_phys_to_machine to be called for autotranslate guests. It can be used to keep track of phys_to_machine changes, however we don't do anything with the information at the moment. Signed-off-by: Stefano Stabellini <[email protected]>
2013-10-09of: remove early_init_dt_setup_initrd_archRob Herring1-9/+0
All arches do essentially the same thing now for early_init_dt_setup_initrd_arch, so it can now be removed. Signed-off-by: Rob Herring <[email protected]> Acked-by: Vineet Gupta <[email protected]> Cc: Russell King <[email protected]> Cc: Mark Salter <[email protected]> Cc: Aurelien Jacquiot <[email protected]> Cc: James Hogan <[email protected]> Cc: Michal Simek <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Cc: Chris Zankel <[email protected]> Cc: Max Filippov <[email protected]> Acked-by: Grant Likely <[email protected]>
2013-10-09x86: use unflatten_and_copy_device_treeRob Herring1-15/+8
Use the common unflatten_and_copy_device_tree to copy the built-in FDT out of init section. Signed-off-by: Rob Herring <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected]
2013-10-09Merge tag 'v3.12-rc4' into sched/coreIngo Molnar16-57/+102
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree before applying more scheduler patches. Conflicts: arch/avr32/include/asm/Kbuild Signed-off-by: Ingo Molnar <[email protected]>
2013-10-08x86, msr: Use file_inode(), not f_mapping->hostAndre Richter1-1/+1
As discussed in [1], exchange f_mapping->host with file_inode(). This is a bug, but happens to be non-manifest in this case. [1] http://lkml.kernel.org/r/[email protected] Signed-off-by: Andre Richter <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-10-08x86: mkpiggy.c: Explicitly close the output fileGeyslan G. Bem1-6/+10
Even though the resource is released when the application is closed or when returned from main function, modify the code to make it obvious, and to keep static analysis tools from complaining. Signed-off-by: Geyslan G. Bem <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: H. Peter Anvin <[email protected]>
2013-10-08Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-8/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixlets: On the kernel side: - fix a race - fix a bug in the handling of the perf ring-buffer data page On the tooling side: - fix the handling of certain corrupted perf.data files - fix a bug in 'perf probe' - fix a bug in 'perf record + perf sched' - fix a bug in 'make install' - fix a bug in libaudit feature-detection on certain distros" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf session: Fix infinite loop on invalid perf.data file perf tools: Fix installation of libexec components perf probe: Fix to find line information for probe list perf tools: Fix libaudit test perf stat: Set child_pid after perf_evlist__prepare_workload() perf tools: Add default handler for mmap2 events perf/x86: Clean up cap_user_time* setting perf: Fix perf_pmu_migrate_context
2013-10-07net: fix unsafe set_memory_rw from softirqAlexei Starovoitov1-5/+13
on x86 system with net.core.bpf_jit_enable = 1 sudo tcpdump -i eth1 'tcp port 22' causes the warning: [ 56.766097] Possible unsafe locking scenario: [ 56.766097] [ 56.780146] CPU0 [ 56.786807] ---- [ 56.793188] lock(&(&vb->lock)->rlock); [ 56.799593] <Interrupt> [ 56.805889] lock(&(&vb->lock)->rlock); [ 56.812266] [ 56.812266] *** DEADLOCK *** [ 56.812266] [ 56.830670] 1 lock held by ksoftirqd/1/13: [ 56.836838] #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380 [ 56.849757] [ 56.849757] stack backtrace: [ 56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45 [ 56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012 [ 56.882004] ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007 [ 56.895630] ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001 [ 56.909313] ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001 [ 56.923006] Call Trace: [ 56.929532] [<ffffffff8175a145>] dump_stack+0x55/0x76 [ 56.936067] [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208 [ 56.942445] [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50 [ 56.948932] [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150 [ 56.955470] [<ffffffff810ccb52>] mark_lock+0x282/0x2c0 [ 56.961945] [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50 [ 56.968474] [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50 [ 56.975140] [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90 [ 56.981942] [<ffffffff810cef72>] lock_acquire+0x92/0x1d0 [ 56.988745] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 56.995619] [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50 [ 57.002493] [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380 [ 57.009447] [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380 [ 57.016477] [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380 [ 57.023607] [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460 [ 57.030818] [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10 [ 57.037896] [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0 [ 57.044789] [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0 [ 57.051720] [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40 [ 57.058727] [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40 [ 57.065577] [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30 [ 57.072338] [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0 [ 57.078962] [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0 [ 57.085373] [<ffffffff81058245>] run_ksoftirqd+0x35/0x70 cannot reuse jited filter memory, since it's readonly, so use original bpf insns memory to hold work_struct defer kfree of sk_filter until jit completed freeing tested on x86_64 and i386 Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2013-10-07crypto: sha256_ssse3 - also test for BMI2Oliver Neukum1-1/+1
The AVX2 implementation also uses BMI2 instructions, but doesn't test for their availability. The assumption that AVX2 and BMI2 always go together is false. Some Haswells have AVX2 but not BMI2. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2013-10-06x86/iommu: Clean up the CONFIG_GART_IOMMU config option a bitIngo Molnar1-9/+15
Improve the explanation of this config option. Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-06x86/reboot: Add reboot quirk for Dell Latitude E5410Ville Syrjälä1-0/+8
Dell Latitude E5410 needs reboot=pci to actually reboot. Signed-off-by: Ville Syrjälä <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-06x86/iommu: Don't make AMD_GART depend on EXPERT and default yAndi Kleen1-5/+4
The AMD_GART driver was made EXPERT/EMBEDDED a long time ago to avoid unbootable 64bit systems with 32bit only devices. This was before swiotlb was there, which does the job of this fallback today. SWIOTLB is always on, so systems should always boot. The drawback is that every system has to compile that driver in (it cannot be a module). Also: - Newer AMD CPUs (the APUs) don't seem to have AMD_GART support at all anymore. - Newer AMD platforms have a much better real IOMMU - The AMD GART driver was never very good (lots of overhead, e.g. in flushing due to some workarounds) and it's doubtful it's really better than SWIOTLB. - On older K8 systems it didn't even work with all chipsets. - The 32bit device bounce buffer case should be rare/ non performance critical these days anyways. - On non AMD systems it is not needed at all. So drop the EXPERT dependency on AMD_GART and remove the default y. The driver can be still compiled in, just it's an explicit decision now, and people who don't want it can unselect it. I also clarified the description a bit. This allows to save ~8K text on most modern x86-64 systems. Signed-off-by: Andi Kleen <[email protected]> Acked-by: Borislav Petkov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-04Merge tag 'pci-v3.12-fixes-1' of ↵Linus Torvalds1-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "We merged what was intended to be an MMCONFIG cleanup, but in fact, for systems without _CBA (which is almost everything), it broke extended config space for domain 0 and it broke all config space for other domains. This reverts the change" * tag 'pci-v3.12-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"
2013-10-04Revert "x86/PCI: MMCONFIG: Check earlier for MMCONFIG region at address zero"Bjorn Helgaas1-1/+6
This reverts commit 07f9b61c3915e8eb156cb4461b3946736356ad02. 07f9b61c was intended to be a cleanup that didn't change anything, but in fact, for systems without _CBA (which is almost everything), it broke extended config space for domain 0 and all config space for other domains. Reference: http://lkml.kernel.org/r/[email protected] Reported-by: Hedi Berriche <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2013-10-04x86/efi: Fix config_table_type array terminationLeif Lindholm1-1/+1
Incorrect use of 0 in terminating entry of arch_tables[] causes the following sparse warning, arch/x86/platform/efi/efi.c:74:27: sparse: Using plain integer as NULL pointer Replace with NULL. Signed-off-by: Leif Lindholm <[email protected]> [ Included sparse warning in commit message. ] Signed-off-by: Matt Fleming <[email protected]>
2013-10-04x86, build, pci: Fix PCI_MSI build on !SMPThomas Petazzoni1-3/+3
Commit ebd97be635 ('PCI: remove ARCH_SUPPORTS_MSI kconfig option') removed the ARCH_SUPPORTS_MSI option which architectures could select to indicate that they support MSI. Now, all architectures are supposed to build fine when MSI support is enabled: instead of having the architecture tell *when* MSI support can be used, it's up to the architecture code to ensure that MSI support can be enabled. On x86, commit ebd97be635 removed the following line: select ARCH_SUPPORTS_MSI if (X86_LOCAL_APIC && X86_IO_APIC) Which meant that MSI support was only available when the local APIC and I/O APIC were enabled. While this is always true on SMP or x86-64, it is not necessarily the case on i386 !SMP. The below patch makes sure that the local APIC and I/O APIC support is always enabled when MSI support is enabled. To do so, it: * Ensures the X86_UP_APIC option is not visible when PCI_MSI is enabled. This is the option that allows, on UP machines, to enable or not the APIC support. It is already not visible on SMP systems, or x86-64 systems, for example. We're simply also making it invisible on i386 MSI systems. * Ensures that the X86_LOCAL_APIC and X86_IO_APIC options are 'y' when PCI_MSI is enabled. Notice that this change requires a change in drivers/iommu/Kconfig to avoid a recursive Kconfig dependencey. The AMD_IOMMU option selects PCI_MSI, but was depending on X86_IO_APIC. This dependency is no longer needed: as soon as PCI_MSI is selected, the presence of X86_IO_APIC is guaranteed. Moreover, the AMD_IOMMU already depended on X86_64, which already guaranteed that X86_IO_APIC was enabled, so this dependency was anyway redundant. Signed-off-by: Thomas Petazzoni <[email protected]> Link: http://lkml.kernel.org/r/1380794354-9079-1-git-send-email-thomas.petazzoni@free-electrons.com Reported-by: Konrad Rzeszutek Wilk <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]>
2013-10-04Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Two simplefb fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/simplefb: Mark framebuffer mem-resources as IORESOURCE_BUSY to avoid bootup warning x86/simplefb: Fix overflow causing bogus fall-back
2013-10-04perf/x86: Suppress duplicated abort LBR recordsAndi Kleen3-8/+23
Haswell always give an extra LBR record after every TSX abort. Suppress the extra record. This only works when the abort is visible in the LBR If the original abort has already left the 16 LBR entries the extra entry will will stay. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-10-04perf/x86: Add Haswell specific transaction flag reportingAndi Kleen1-4/+20
In the PEBS handler report the transaction flags using the new generic transaction flags facility. Most of them come from the "tsx_tuning" field in PEBSv2, but the abort code is derived from the RAX register reported in the PEBS record. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>