path: root/arch
Age  Commit message  Author  Files  Lines
2017-07-03x86: kvm: mmu: use ept a/d in vmcs02 iff used in vmcs12Peter Feiner2-21/+18
EPT A/D was enabled in the vmcs02 EPTP regardless of the vmcs12's EPTP value. The problem is that enabling A/D changes the behavior of L2's x86 page table walks as seen by L1. With A/D enabled, x86 page table walks are always treated as EPT writes. Commit ae1e2d1082ae ("kvm: nVMX: support EPT accessed/dirty bits", 2017-03-30) tried to work around this problem by clearing the write bit in the exit qualification for EPT violations triggered by page walks. However, that fixup introduced the opposite bug: page-table walks that actually set x86 A/D bits were *missing* the write bit in the exit qualification. This patch fixes the problem by disabling EPT A/D in the shadow MMU when EPT A/D is disabled in vmcs12's EPTP. Signed-off-by: Peter Feiner <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-03powerpc/vmlinux.lds: Align __init_begin to 16MBalbir Singh1-2/+8
For CONFIG_STRICT_KERNEL_RWX align __init_begin to 16M. We use 16M since it's the larger of 2M on radix and 16M on hash for our linear mapping. The plan is to have .text, .rodata and everything up to __init_begin marked as RX. Note we still have executable read-only data. We could further align rodata to another 16M boundary. I've kept text plus rodata as read-only-executable as a trade-off against making text read-only-executable and rodata read-only. We don't use multi PT_LOAD in PHDRS because we are not sure if all bootloaders support them. This patch keeps PHDRS in vmlinux.lds.S the same as they are, with just one PT_LOAD for all of the kernel marked as RWX (7). mpe: What this means is the added alignment bloats the resulting binary on disk; a powernv kernel goes from 17M to 22M. Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/lib/code-patching: Use alternate map for patch_instruction()Balbir Singh1-4/+167
This patch creates the window using text_poke_area, allocated via get_vm_area(). text_poke_area is per CPU to avoid locking. text_poke_area for each CPU is set up using late_initcall; prior to setup of these alternate mapping areas, we continue to use direct writes to modify kernel text. The ability to use alternate mappings to write to kernel text gives us the freedom to then turn text read-only and implement CONFIG_STRICT_KERNEL_RWX. This code is CPU hotplug aware to ensure that we have mappings for any new CPUs as they come online and tear down mappings for any CPUs that go offline. Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/xmon: Add patch_instruction() support for xmonBalbir Singh1-2/+5
Move from mwrite() to patch_instruction() for xmon for breakpoint addition and removal. Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/kprobes/optprobes: Use patch_instruction()Balbir Singh1-21/+32
So that we can implement STRICT_RWX, use patch_instruction() in optprobes. Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/kprobes: Move kprobes over to patch_instruction()Balbir Singh1-6/+2
arch_arm/disarm_probe() use direct assignment for copying instructions, replace them with patch_instruction(). We don't need to call flush_icache_range() because patch_instruction() does it for us. Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/mm/radix: Fix execute permissions for interrupt_vectorsBalbir Singh1-1/+2
Commit 9abcc981de97 ("powerpc/mm/radix: Only add X for pages overlapping kernel text") changed the linear mapping on Radix to only mark the kernel text executable. However if the kernel is run relocated, for example as a kdump kernel, then the exception vectors are split from the kernel text, ie. they remain at real address 0. We tend to get away with it, because the kernel itself will usually be below 1G, which means the 1G page at 0-1G is marked executable and everything works OK. However if the kernel is loaded above 1G, or the system has less than 1G in total (meaning we can't use a 1G page), then the exception vectors will not be marked executable and the kernel will fail to boot. Fix it by also checking if the address range overlaps the exception vectors when deciding if we should add PAGE_KERNEL_X. Fixes: 9abcc981de97 ("powerpc/mm/radix: Only add X for pages overlapping kernel text") Cc: [email protected] # v4.7+ Signed-off-by: Balbir Singh <[email protected]> [mpe: Combine with the existing check, rewrite change log] Signed-off-by: Michael Ellerman <[email protected]>
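The combined check described in this commit can be sketched as a plain range-overlap test. This is only an illustration, not the kernel's code: the helper names `ranges_overlap` and `mapping_needs_exec`, and the use of half-open `[start, end)` ranges, are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two half-open ranges [s1, e1) and [s2, e2) overlap when each one
 * starts before the other ends. */
static bool ranges_overlap(uint64_t s1, uint64_t e1, uint64_t s2, uint64_t e2)
{
    return s1 < e2 && s2 < e1;
}

/* Hypothetical helper: a linear-mapping chunk [start, end) needs execute
 * permission if it overlaps the kernel text OR the exception vectors,
 * which stay at real address 0 when the kernel runs relocated. */
static bool mapping_needs_exec(uint64_t start, uint64_t end,
                               uint64_t text_start, uint64_t text_end,
                               uint64_t vec_start, uint64_t vec_end)
{
    return ranges_overlap(start, end, text_start, text_end) ||
           ranges_overlap(start, end, vec_start, vec_end);
}
```

With a kernel loaded above 1G, a small mapping at address 0 overlaps only the vectors, yet still has to be executable; checking the text range alone would miss it.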
2017-07-03powerpc/pseries: Fix passing of pp0 in updatepp() and updateboltedpp()Balbir Singh1-1/+10
Once upon a time there were only two PP (page protection) bits. In ISA 2.03 an additional PP bit was added, but because of the layout of the HPTE it could not be made contiguous with the existing PP bits. The result is that we now have three PP bits, named pp0, pp1, pp2, where pp0 occupies bit 63 of dword 1 of the HPTE and pp1 and pp2 occupy bits 1 and 0 respectively. Until recently Linux hasn't used pp0, however with the addition of _PAGE_KERNEL_RO we started using it. The problem arises in the LPAR code, where we need to translate the PP bits into the argument for the H_PROTECT hypercall. Currently the code only passes bits 0-2 of newpp, which covers pp1, pp2 and N (no execute), meaning pp0 is not passed to the hypervisor at all. We can't simply pass it through in bit 63, as that would collide with a different field in the flags argument, as defined in PAPR. Instead we have to shift it down to bit 8 (IBM bit 55). Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO") Cc: [email protected] # v4.7+ Signed-off-by: Balbir Singh <[email protected]> [mpe: Simplify the test, rework change log] Signed-off-by: Michael Ellerman <[email protected]>
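The bit shuffling described above can be sketched in a few lines. The function name `hprotect_pp_flags` and the macro `HPTE_R_PP0_SKETCH` are made up for this example, and the real LPAR code also sets other flag bits that are left out here.

```c
#include <stdint.h>

/* pp0 lives in bit 63 of HPTE dword 1 (name is illustrative only). */
#define HPTE_R_PP0_SKETCH (UINT64_C(1) << 63)

/* Hypothetical helper: build the PP part of the H_PROTECT flags argument. */
static uint64_t hprotect_pp_flags(uint64_t newpp)
{
    /* Bits 0-2 carry pp1, pp2 and N (no execute) unchanged. */
    uint64_t flags = newpp & 7;

    /* pp0 cannot stay in bit 63, so move it down to bit 8 (IBM bit 55):
     * a right shift by 63 - 8 = 55. */
    flags |= (newpp & HPTE_R_PP0_SKETCH) >> 55;

    return flags;
}
```

For example, a `newpp` with pp0, pp1 and pp2 all set yields `0x103`: the low two bits pass through and pp0 lands in bit 8.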
2017-07-03powerpc/64s: Blacklist rtas entry/exit from kprobesNaveen N. Rao1-0/+4
We can't take traps with relocation off, so blacklist enter_rtas() and rtas_return_loc(). However, instead of blacklisting all of enter_rtas(), introduce a new symbol __enter_rtas from where on we can't take a trap and blacklist that. Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/64s: Blacklist functions invoked on a trapNaveen N. Rao3-13/+27
Blacklist all functions involved while handling a trap. We: - convert some of the symbols into private symbols, and - blacklist most functions involved while handling a trap. Reviewed-by: Masami Hiramatsu <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/64s: Un-blacklist system_call() from kprobesNaveen N. Rao1-1/+13
It is actually safe to probe system_call() in entry_64.S, but only till we unset MSR_RI. To allow this, add a new symbol system_call_exit() after the mtmsrd and blacklist that. Suggested-by: Michael Ellerman <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/64s: Move system_call() symbol to just after setting MSR_EENaveen N. Rao1-3/+4
It is common to get a PMU interrupt right after the mtmsr instruction that enables interrupts. Due to this, the stack trace profile gets needlessly split across system_call_common() and system_call(). Previously, system_call() symbol was at the current place to hide a few earlier symbols which have since been made private or removed entirely. So, let's move system_call() slightly higher up, right after the mtmsr instruction that enables interrupts. Convert existing references to system_call to a local syscall symbol. Suggested-by: Nicholas Piggin <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/64s: Blacklist system_call() and system_call_common() from kprobesNaveen N. Rao1-12/+14
Convert some of the symbols into private symbols and blacklist system_call_common() and system_call() from kprobes. We can't take a trap at parts of these functions as either MSR_RI is unset or the kernel stack pointer is not yet setup. Reviewed-by: Masami Hiramatsu <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> [mpe: Don't convert system_call_common to _GLOBAL()] Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc/64s: Convert .L__replay_interrupt_return to a local labelNaveen N. Rao1-2/+2
Commit b48bbb82e2b835 ("powerpc/64s: Don't unbalance the return branch predictor in __replay_interrupt()") introduced __replay_interrupt_return symbol with '.L' prefix in hopes of keeping it private. However, due to the use of LOAD_REG_ADDR(), the assembler kept this symbol visible. Fix the same by instead using the local label '1'. Fixes: Commit b48bbb82e2b835 ("powerpc/64s: Don't unbalance the return branch predictor in __replay_interrupt()") Suggested-by: Nicholas Piggin <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03powerpc64/elfv1: Only dereference function descriptor for non-text symbolsNaveen N. Rao1-1/+9
Currently, we assume that the function pointer we receive in ppc_function_entry() points to a function descriptor. However, this is not always the case. In particular, assembly symbols without the right annotation do not have an associated function descriptor. Some of these symbols are added to the kprobe blacklist using _ASM_NOKPROBE_SYMBOL(). When such addresses are subsequently processed through arch_deref_entry_point() in populate_kprobe_blacklist(), we see the below errors during bootup: [ 0.663963] Failed to find blacklist at 7d9b02a648029b6c [ 0.663970] Failed to find blacklist at a14d03d0394a0001 [ 0.663972] Failed to find blacklist at 7d5302a6f94d0388 [ 0.663973] Failed to find blacklist at 48027d11e8610178 [ 0.663974] Failed to find blacklist at f8010070f8410080 [ 0.663976] Failed to find blacklist at 386100704801f89d [ 0.663977] Failed to find blacklist at 7d5302a6f94d00b0 Fix this by checking if the function pointer we receive in ppc_function_entry() already points to kernel text. If so, we just return it as is. If not, we assume that this is a function descriptor and proceed to dereference it. Suggested-by: Nicholas Piggin <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
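The fix described above amounts to a range check before the dereference. The sketch below is a userspace illustration under stated assumptions: `func_desc_sketch` is a simplified ELFv1-style descriptor, and `function_entry_sketch` takes the text bounds as parameters instead of consulting the kernel's own text-address test.

```c
#include <stdint.h>

/* Simplified ELFv1-style function descriptor: the first doubleword holds
 * the entry address. Field names are illustrative. */
struct func_desc_sketch {
    uintptr_t entry;
    uintptr_t toc;
    uintptr_t env;
};

/* Hypothetical stand-in for the logic described for ppc_function_entry():
 * return func unchanged if it already points into [text_start, text_end),
 * otherwise treat it as a descriptor and return its entry address. */
static uintptr_t function_entry_sketch(uintptr_t func,
                                       uintptr_t text_start,
                                       uintptr_t text_end)
{
    if (func >= text_start && func < text_end)
        return func; /* already a text address, nothing to dereference */

    return ((const struct func_desc_sketch *)func)->entry;
}
```

Without the range check, a text address would be read as if it were a descriptor, and the first instruction words would be returned as an "address". That would explain the instruction-like values in the "Failed to find blacklist" messages above.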
2017-07-03cxl: Export library to support IBM XSLChristophe Lombard1-0/+1
This patch exports an in-kernel 'library' API which can be called by other drivers to help them interact with an IBM XSL on a POWER9 system. The XSL (Translation Service Layer) is a stripped down version of the PSL (Power Service Layer) used in some cards such as the Mellanox CX5. Like the PSL, it implements the CAIA architecture, but has a number of differences, mostly in its implementation dependent registers. The XSL also uses a special DMA cxl mode, which uses a slightly different init sequence for the CAPP and PHB. Signed-off-by: Andrew Donnellan <[email protected]> Signed-off-by: Christophe Lombard <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03Merge branch 'fixes' into nextMichael Ellerman27-117/+289
Merge our fixes branch, a few of them are tripping people up while working on top of next, and we also have a dependency between the CXL fixes and new CXL code we want to merge into next.
2017-07-03powerpc/dts: Use #include "..." to include local DTMasahiro Yamada3-3/+3
Most DT files in PowerPC use #include "..." to make the preprocessor include DT files from the same directory, but 3 exceptional files use #include <...> for that. Fix them so that the -I$(srctree)/arch/$(SRCARCH)/boot/dts path can be removed from dtc_cpp_flags. Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-03Merge branch 'pci/host-hv' into nextBjorn Helgaas1-0/+6
* pci/host-hv: PCI: hv: Use vPCI protocol version 1.2 PCI: hv: Add vPCI version protocol negotiation PCI: hv: Temporary own CPU-number-to-vCPU-number infra PCI: hv: Use page allocation for hbus structure PCI: hv: Fix comment formatting and use proper integer fields
2017-07-03Merge branch 'pci/irq-fixups' into nextBjorn Helgaas9-54/+98
* pci/irq-fixups: arm64: PCI: Drop DT IRQ allocation from pcibios_alloc_irq() PCI: xilinx-nwl: Move to struct pci_host_bridge IRQ mapping functions PCI: rockchip: Move to struct pci_host_bridge IRQ mapping functions PCI: xgene: Move to struct pci_host_bridge IRQ mapping functions PCI: altera: Drop pci_fixup_irqs() PCI: versatile: Drop pci_fixup_irqs() PCI: generic: Drop pci_fixup_irqs() PCI: faraday: Drop pci_fixup_irqs() PCI: designware: Drop pci_fixup_irqs() PCI: iproc: Drop pci_fixup_irqs() PCI: rcar: Drop pci_fixup_irqs() PCI: xilinx: Drop pci_fixup_irqs() PCI: tegra: Drop pci_fixup_irqs() ARM/PCI: Remove pci_fixup_irqs() call for bios32 host controllers PCI: Add a call to pci_assign_irq() in pci_device_probe() OF/PCI: Update of_irq_parse_and_map_pci() comment PCI: Add pci_assign_irq() function and have pci_fixup_irqs() use it PCI: Add IRQ mapping function pointers to pci_host_bridge struct PCI: Build setup-irq.o on all arches PCI: Remove pci_scan_root_bus_msi() PCI: xilinx-nwl: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: rockchip: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: generic: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: xgene: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: xilinx: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: altera: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: versatile: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: iproc: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: rcar: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: aardvark: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: designware: Convert PCI scan API to pci_scan_root_bus_bridge() ARM/PCI: Convert PCI scan API to pci_scan_root_bus_bridge() PCI: Make pci_register_host_bridge() PCI core internal PCI: Add pci_scan_root_bus_bridge() interface PCI: tegra: Fix host bridge memory leakage PCI: faraday: Fix host bridge memory leakage PCI: Add devm_pci_alloc_host_bridge() interface PCI: 
Add pci_free_host_bridge() interface PCI: Initialize bridge release function at bridge allocation PCI: faraday: Convert IRQ masking to raw PCI config accessors PCI: iproc: Convert link check to raw PCI config accessors PCI: xilinx-nwl: Remove nwl_pcie_enable_msi() unused bus parameter
2017-07-03ARM: owl: smp: Drop bogus holding penAndreas Färber2-46/+3
The S500 SoC can start secondary CPUs without busy-looping for pen_release, so simplify the SMP code compared to the LeMaker kernel tree. Fixes: 172067e0bc87 ("ARM: owl: Implement CPU enable-method for S500") Suggested-by: Arnd Bergmann <[email protected]> Cc: David Liu <[email protected]> Signed-off-by: Andreas Färber <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2017-07-03ARM: owl: Drop custom machineAndreas Färber2-29/+0
Rely on the fallback to "Generic DT based system". This change is visible in /proc/cpuinfo. Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andreas Färber <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2017-07-03Merge branch 'pm-sleep'Rafael J. Wysocki2-6/+5
* pm-sleep: PM: hibernate: constify attribute_group structures. PM / hibernate: Drop redundant parameter of swsusp_alloc() PM / hibernate: Use CONFIG_HAVE_SET_MEMORY for include condition x86/power/64: Use char arrays for asm function names
2017-07-03Merge branches 'pm-cpufreq', 'intel_pstate' and 'pm-cpuidle'Rafael J. Wysocki4-9/+84
* pm-cpufreq: cpufreq / CPPC: Initialize policy->min to lowest nonlinear performance cpufreq: sfi: make freq_table static cpufreq: exynos5440: Fix inconsistent indenting cpufreq: imx6q: imx6ull should use the same flow as imx6ul cpufreq: dt: Add support for hi3660 * intel_pstate: cpufreq: Update scaling_cur_freq documentation cpufreq: intel_pstate: Clean up after performance governor changes intel_pstate: skip scheduler hook when in "performance" mode intel_pstate: delete scheduler hook in HWP mode x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF cpufreq: intel_pstate: Remove max/min fractions to limit performance x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz" * pm-cpuidle: cpuidle: menu: allow state 0 to be disabled intel_idle: Use more common logging style x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems ARM: cpuidle: Support asymmetric idle definition
2017-07-03Merge branch 'pm-tools'Rafael J. Wysocki1-6/+12
* pm-tools: cpupower: Add support for new AMD family 0x17 cpupower: Fix bug where return value was not used tools/power turbostat: update version number tools/power turbostat: decode MSR_IA32_MISC_ENABLE only on Intel tools/power turbostat: stop migrating, unless '-m' tools/power turbostat: if --debug, print sampling overhead tools/power turbostat: hide SKL counters, when not requested intel_pstate: use updated msr-index.h HWP.EPP values tools/power x86_energy_perf_policy: support HWP.EPP x86: msr-index.h: fix shifts to ULL results in HWP macros. x86: msr-index.h: define HWP.EPP values x86: msr-index.h: define EPB mid-points
2017-07-03Merge branch 'uuid-types'Rafael J. Wysocki2-3/+3
Merge 'uuid-types' from git://git.infradead.org/users/hch/uuid.git
2017-07-03Merge tag 'mvebu-fixes-4.12-2' of git://git.infradead.org/linux-mvebu into next/fixes-non-criticalArnd Bergmann3-5/+3
mvebu fixes for 4.12 (part 2) Fix Openblock A6 (kirkwood base board) nand partition overlap * tag 'mvebu-fixes-4.12-2' of git://git.infradead.org/linux-mvebu: ARM: dts: kirkwood: Fix Openblock A6 nand partition overlap arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
2017-07-03x86/xen: allow userspace access during hypercallsMarek Marczykowski-Górecki1-1/+8
A userspace application can do a hypercall through /dev/xen/privcmd, and for some hypercalls the argument is a pointer to a user-provided structure. When SMAP is supported and enabled, the hypervisor can't access it. So, let's allow it. The same applies to HYPERVISOR_dm_op, where additionally the privcmd driver carefully verifies buffer addresses. Cc: [email protected] Signed-off-by: Marek Marczykowski-Górecki <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2017-07-03x86: xen: remove unnecessary variable in xen_foreach_remap_area()Gustavo A. R. Silva1-5/+2
Remove unnecessary variable mfn in function xen_foreach_remap_area() and refactor the code. Variable mfn at line 518: mfn = xen_remap_buf.mfns[i]; is only being used to store a value to be passed as an argument to the xen_update_mem_tables() function. This value can be passed directly, which makes variable mfn unnecessary. Also, the value assigned to variable mfn at line 534: mfn = xen_remap_mfn; is never used. Addresses-Coverity-ID: 1260110 Signed-off-by: Gustavo A. R. Silva <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2017-07-03sparc: kernel: pmc: make of_device_ids const.Arvind Yadav1-1/+1
of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-03kvm: x86: mmu: allow A/D bits to be disabled in an mmuPeter Feiner3-33/+91
Adds the plumbing to disable A/D bits in the MMU based on a new role bit, ad_disabled. When A/D is disabled, the MMU operates as though A/D aren't available (i.e., using access tracking faults instead). To avoid SP -> kvm_mmu_page.role.ad_disabled lookups all over the place, A/D disablement is now stored in the SPTE. This state is stored in the SPTE by tweaking the use of SPTE_SPECIAL_MASK for access tracking. Rather than just setting SPTE_SPECIAL_MASK when an access-tracking SPTE is non-present, we now always set SPTE_SPECIAL_MASK for access-tracking SPTEs. Signed-off-by: Peter Feiner <[email protected]> [Use role.ad_disabled even for direct (non-shadow) EPT page tables. Add documentation and a few MMU_WARN_ONs. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-03x86: kvm: mmu: make spte mmio mask more explicitPeter Feiner4-6/+10
Specify both a mask (i.e., bits to consider) and a value (i.e., pattern of bits that indicates a special PTE) for mmio SPTEs. On Intel, this lets us pack even more information into the (SPTE_SPECIAL_MASK | EPT_VMX_RWX_MASK) mask we use for access tracking liberating all (SPTE_SPECIAL_MASK | (non-misconfigured-RWX)) values. Signed-off-by: Peter Feiner <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-03x86: kvm: mmu: dead code thanks to access trackingPeter Feiner1-21/+9
The MMU always has hardware A bits or access tracking support, thus it's unnecessary to handle the scenario where we have neither. Signed-off-by: Peter Feiner <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-03Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEADPaolo Bonzini15-211/+697
- Better machine check handling for HV KVM - Ability to support guests with threads=2, 4 or 8 on POWER9 - Fix for a race that could cause delayed recognition of signals - Fix for a bug where POWER9 guests could sleep with interrupts pending.
2017-07-03KVM: PPC: Book3S: Fix typo in XICS-on-XIVE state saving codePaul Mackerras1-2/+2
This fixes a typo where the wrong loop index was used to index the kvmppc_xive_vcpu.queues[] array in xive_pre_save_scan(). The variable i contains the vcpu number; we need to index queues[] using j, which iterates from 0 to KVMPPC_XIVE_Q_COUNT-1. The effect of this bug is that things that save the interrupt controller state, such as "virsh dump", on a VM with more than 8 vCPUs, result in xive_pre_save_queue() getting called on a bogus queue structure, usually resulting in a crash like this: [ 501.821107] Unable to handle kernel paging request for data at address 0x00000084 [ 501.821212] Faulting instruction address: 0xc008000004c7c6f8 [ 501.821234] Oops: Kernel access of bad area, sig: 11 [#1] [ 501.821305] SMP NR_CPUS=1024 [ 501.821307] NUMA [ 501.821376] PowerNV [ 501.821470] Modules linked in: vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel kvm_hv nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm tg3 ptp pps_core [ 501.822477] CPU: 3 PID: 3934 Comm: live_migration Not tainted 4.11.0-4.git8caa70f.el7.centos.ppc64le #1 [ 501.822633] task: c0000003f9e3ae80 task.stack: c0000003f9ed4000 [ 501.822745] NIP: c008000004c7c6f8 LR: c008000004c7c628 CTR: 0000000030058018 [ 501.822877] REGS: c0000003f9ed7980 TRAP: 0300 Not tainted (4.11.0-4.git8caa70f.el7.centos.ppc64le) [ 501.823030] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> [ 501.823047] CR: 28022244 XER: 00000000 [ 501.823203] CFAR: c008000004c7c77c DAR: 0000000000000084 DSISR: 40000000 SOFTE: 1 [ 501.823203] GPR00: 
c008000004c7c628 c0000003f9ed7c00 c008000004c91450 00000000000000ff [ 501.823203] GPR04: c0000003f5580000 c0000003f559bf98 9000000000009033 0000000000000000 [ 501.823203] GPR08: 0000000000000084 0000000000000000 00000000000001e0 9000000000001003 [ 501.823203] GPR12: c00000000008a7d0 c00000000fdc1b00 000000000a9a0000 0000000000000000 [ 501.823203] GPR16: 00000000402954e8 000000000a9a0000 0000000000000004 0000000000000000 [ 501.823203] GPR20: 0000000000000008 c000000002e8f180 c000000002e8f1e0 0000000000000001 [ 501.823203] GPR24: 0000000000000008 c0000003f5580008 c0000003f4564018 c000000002e8f1e8 [ 501.823203] GPR28: 00003ff6e58bdc28 c0000003f4564000 0000000000000000 0000000000000000 [ 501.825441] NIP [c008000004c7c6f8] xive_get_attr+0x3b8/0x5b0 [kvm] [ 501.825671] LR [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm] [ 501.825887] Call Trace: [ 501.825991] [c0000003f9ed7c00] [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm] (unreliable) [ 501.826312] [c0000003f9ed7cd0] [c008000004c62ec4] kvm_device_ioctl_attr+0x64/0xa0 [kvm] [ 501.826581] [c0000003f9ed7d20] [c008000004c62fcc] kvm_device_ioctl+0xcc/0xf0 [kvm] [ 501.826843] [c0000003f9ed7d40] [c000000000350c70] do_vfs_ioctl+0xd0/0x8c0 [ 501.827060] [c0000003f9ed7de0] [c000000000351534] SyS_ioctl+0xd4/0xf0 [ 501.827282] [c0000003f9ed7e30] [c00000000000b8e0] system_call+0x38/0xfc [ 501.827496] Instruction dump: [ 501.827632] 419e0078 3b760008 e9160008 83fb000c 83db0010 80fb0008 2f280000 60000000 [ 501.827901] 60000000 60420000 419a0050 7be91764 <7d284c2c> 552a0ffe 7f8af040 419e003c [ 501.828176] ---[ end trace 2d0529a5bbbbafed ]--- Cc: [email protected] Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller") Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
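The class of bug fixed here, indexing an inner array with the outer loop variable, is easy to show in isolation. The types and the function `count_pending` below are invented for the example. The queue count of 8 is an assumption, inferred from the "more than 8 vCPUs" remark in the commit message rather than taken from the kernel headers.

```c
#include <stddef.h>

/* Stand-in for KVMPPC_XIVE_Q_COUNT; the value 8 is assumed. */
#define Q_COUNT_SKETCH 8

struct queue_sketch { int pending; };
struct vcpu_sketch { struct queue_sketch queues[Q_COUNT_SKETCH]; };

/* Hypothetical scan over every queue of every vCPU. */
static int count_pending(const struct vcpu_sketch *vcpus, size_t nr_vcpus)
{
    int total = 0;

    for (size_t i = 0; i < nr_vcpus; i++) {
        for (size_t j = 0; j < Q_COUNT_SKETCH; j++) {
            /* Correct: queues[] is indexed by j. Writing queues[i] here
             * would read past the array as soon as i >= Q_COUNT_SKETCH,
             * i.e. on the ninth vCPU and beyond. */
            total += vcpus[i].queues[j].pending;
        }
    }
    return total;
}
```

This matches the symptom in the commit message: the wrong index stays inside the array for the first vCPUs, so the bug only shows once the vCPU number exceeds the queue count.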
2017-07-02Merge branch 'pci/resource' into nextBjorn Helgaas1-0/+32
* pci/resource: PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11 PCI: Do not disregard parent resources starting at 0x0 Conflicts: arch/x86/pci/fixup.c
2017-07-02Merge branch 'pci/pm' into nextBjorn Helgaas1-0/+15
* pci/pm: PCI/PM: Avoid using device_may_wakeup() for runtime PM x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect PCI/PM: Restore the status of PCI devices across hibernation drm/radeon: make MacBook Pro d3_delay quirk more generic drm/amdgpu: remove unnecessary save/restore of pdev->d3_delay PCI/PM: Add needs_resume flag to avoid suspend complete optimization PCI: imx6: Fix config read timeout handling switchtec: Fix minor bug with partition ID register switchtec: Use new cdev_device_add() helper function PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
2017-07-02PCI: hv: Use vPCI protocol version 1.2Jork Loeser1-0/+6
Update the Hyper-V vPCI driver to use the Server-2016 version of the vPCI protocol, fixing MSI creation and retargeting issues. Signed-off-by: Jork Loeser <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: K. Y. Srinivasan <[email protected]> Acked-by: K. Y. Srinivasan <[email protected]>
2017-07-02arm64: PCI: Drop DT IRQ allocation from pcibios_alloc_irq()Lorenzo Pieralisi1-6/+4
With the introduction of struct pci_host_bridge.map_irq pointer it is possible to assign IRQs for all devices originating from a PCI host bridge at probe time; this is implemented through pci_assign_irq() that relies on the struct pci_host_bridge.map_irq pointer to map IRQ for a given device. The benefits this brings are twofold: - the IRQ for a device is assigned once at probe time - the IRQ assignment works also for hotplugged devices With all DT based PCI host bridges converted to the struct pci_host_bridge.{map/swizzle}_irq hooks mechanism the DT IRQ allocation in ARM64 pcibios_alloc_irq() is now redundant and can be removed. Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Will Deacon <[email protected]>
2017-07-02ARM/PCI: Remove pci_fixup_irqs() call for bios32 host controllersLorenzo Pieralisi1-2/+3
Legacy PCI host controllers (ie host controllers that set-up the PCI bus through the ARM pci_common_init() API) are currently relying on pci_fixup_irqs() to assign legacy PCI irqs to devices. This is not ideal in that pci_fixup_irqs() assigns IRQs for all PCI devices present in a given system some of which may well be enabled by the time pci_fixup_irqs() is called (ie a system with multiple host controllers). With the introduction of struct pci_host_bridge.(*map_irq) pointer it is possible to assign IRQs for all devices originating from a PCI host bridge at probe time; this is implemented through pci_assign_irq() that relies on the struct pci_host_bridge.map_irq pointer to map IRQ for a given device. The benefits this brings are twofold: - the IRQ for a device is assigned once at probe time - the IRQ assignment works also for hotplugged devices Remove pci_fixup_irqs() call from bios32 code and rely on pci_assign_irq() to carry out the IRQ mapping at device probe time. The map_irq() and swizzle_irq() struct pci_host_bridge callbacks are set-up in the struct pci_host_bridge created in the bios32 pcibios_init_hw() function and mach-* code paths (for PCI mach implementations that require a specific struct hw_pci.(*scan) function callback). Signed-off-by: Lorenzo Pieralisi <[email protected]> [bhelgaas: folded in fixes from Lorenzo: http://lkml.kernel.org/r/20170701140629.GC8977@red-moon] Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Jason Cooper <[email protected]> Cc: Russell King <[email protected]> Cc: Andrew Lunn <[email protected]>
2017-07-02parisc: Report SIGSEGV instead of SIGBUS when running out of stackHelge Deller1-1/+1
When a process runs out of stack the parisc kernel wrongly faults with SIGBUS instead of the expected SIGSEGV signal. This example shows how the kernel faults: do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000] trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000 The vma->vm_end value is the first address which does not belong to the vma, so adjust the check to include vma->vm_end in the range for which to send the SIGSEGV signal. This patch unbreaks building the Debian libsigsegv package. Cc: [email protected] Signed-off-by: Helge Deller <[email protected]>
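Because vm_end is exclusive, the boundary test has to treat an address equal to vm_end as outside the mapping. The helper `addr_outside_vma` below is a hypothetical illustration of that comparison, not the parisc fault handler itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* A vma covers [vm_start, vm_end): vm_end is the first address that does
 * NOT belong to it, so the upper comparison must be >=, not >. */
static bool addr_outside_vma(uint64_t addr, uint64_t vm_start, uint64_t vm_end)
{
    return addr < vm_start || addr >= vm_end;
}
```

With the values from the fault report above, address 0xfaac2000 equals vm_end and is therefore outside the vma; a check written with `>` would have classified it as inside.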
2017-07-02parisc: use compat_sys_keyctl()Eric Biggers1-1/+1
Architectures with a compat syscall table must put compat_sys_keyctl() in it, not sys_keyctl(). The parisc architecture was not doing this; fix it. Cc: [email protected] Signed-off-by: Eric Biggers <[email protected]> Acked-by: Helge Deller <[email protected]> Signed-off-by: Helge Deller <[email protected]>
2017-07-02Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds7-16/+33
Pull MIPS fixes from Ralf Baechle: "Here's a final round of fixes for 4.12: - Fix misordered instructions in assembly code making kernel startup via UHB unreliable. - Fix special case of MADDF and MSUBF emulation. - Fix alignment issue in address calculation in pm-cps on 64 bit. - Fix IRQ tracing & lockdep when rescheduling - Systems with MAARs require post-DMA cache flushes. The reordering fix and the MADDF/MSUBF fix have sat in linux-next for a number of days. The others haven't propagated from my pull tree to linux-next yet but all have survived manual testing and Imagination's automated test system and there are no pending bug reports" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: Avoid accidental raw backtrace MIPS: Perform post-DMA cache flushes on systems with MAARs MIPS: Fix IRQ tracing & lockdep when rescheduling MIPS: pm-cps: Drop manual cache-line alignment of ready_count MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately MIPS: head: Reorder instructions missing a delay slot
2017-07-02Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds1-4/+4
Pull ARM fix from Russell King: "One final fix for 4.12 - Doug found a boot failure case triggered by requesting a non-even MB vmalloc size" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8685/1: ensure memblock-limit is pmd-aligned
2017-07-02powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8Thiago Jung Bauermann1-11/+42
On POWER9 SMT8 the 24x7 API returns two result elements for physical core and virtual CPU events and we need to add their counts to get the final result. Reviewed-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
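The aggregation step can be sketched as a plain sum over the element counts. This is a simplified illustration, not the driver code: demo_sum_element_counts is a hypothetical helper, and it assumes the counts have already been extracted from the hypervisor's result buffer into host-order 64-bit integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Add up the counts of all result elements belonging to one result.
 * Assumes the caller has already converted them to host byte order. */
static uint64_t demo_sum_element_counts(const uint64_t *counts,
                                        size_t num_elements)
{
    uint64_t total = 0;

    for (size_t i = 0; i < num_elements; i++)
        total += counts[i];
    return total;
}
```

For the two-element case mentioned in the message, the final event value would simply be the sum of both counts.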
2017-07-02powerpc/perf/hv-24x7: Support v2 of the hypervisor APIThiago Jung Bauermann3-35/+160
POWER9 introduces a new version of the hypervisor API to access the 24x7 perf counters. The new version changed some of the structures used for requests and results. Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-02powerpc/perf/hv-24x7: Minor improvementsThiago Jung Bauermann2-6/+14
There's an H24x7_DATA_BUFFER_SIZE constant, so use it in init_24x7_request. There's also an HV_PERF_DOMAIN_MAX constant, so use it in h_24x7_event_init. This makes the comment above the check redundant, so remove it. In add_event_to_24x7_request, a statement is terminated with a comma instead of a semicolon. Fix it. In hv-24x7.h, improve comments in struct hv_24x7_result. Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
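As a side note on why a comma-terminated statement compiles at all: in C the comma is an operator, so the statement silently absorbs the expression that follows it. The function below is a hypothetical illustration and is not taken from add_event_to_24x7_request.

```c
/* Hypothetical example: the first assignment ends with a comma, so the
 * compiler parses both assignments as one comma expression. The result
 * is the same here, which is why the slip is easy to miss. */
static int demo_comma_terminated(void)
{
    int a = 0, b = 0;

    a = 1,  /* comma instead of semicolon */
    b = 2;
    return a + b;
}
```

The behaviour is unchanged in this simple case, but the construct becomes a trap as soon as the following line is a control statement or gets moved.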
2017-07-02powerpc/perf/hv-24x7: Fix return value of hcallsThiago Jung Bauermann1-15/+13
The H_GET_24X7_CATALOG_PAGE and H_GET_24X7_DATA hcalls can both return a signed error code, so treat their return values as signed in the code. Also, don't truncate the H_GET_24X7_DATA return value to 32 bits for use as the return value of make_24x7_request. In case of error, h_24x7_event_commit_txn passes that return value to generic code, so it should be a proper errno. The other caller of make_24x7_request is single_24x7_request, whose callers don't actually care which error code is returned, so they are not affected by this change. Finally, h_24x7_get_value doesn't use the error code from single_24x7_request, so there's no need to store it. Reviewed-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
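The pattern described above, keeping the hcall status in a signed variable of full width and handing a proper errno to the caller, might look roughly like this. All names and status values below are hypothetical stand-ins, and returning -EIO is an assumption made for the sketch rather than something the message states.

```c
#include <errno.h>

/* Hypothetical status codes for the sketch, not the hypervisor's own. */
#define DEMO_H_SUCCESS   0L
#define DEMO_H_PARAMETER (-4L)

/* Stand-in for an hcall that reports failure as a negative long. */
static long demo_hcall(int fail)
{
    return fail ? DEMO_H_PARAMETER : DEMO_H_SUCCESS;
}

/* Keep the status in a signed long (no truncation to 32 bits) and
 * translate any failure into an errno for generic callers.
 * Assumption: -EIO is used as the generic error in this sketch. */
static int demo_make_request(int fail)
{
    long ret = demo_hcall(fail);

    if (ret != DEMO_H_SUCCESS)
        return -EIO;
    return 0;
}
```

Callers that only test for nonzero keep working, while callers that forward the value to generic code receive a conventional negative errno.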
2017-07-02powerpc-perf/hx-24x7: Don't log failed hcall twiceThiago Jung Bauermann1-23/+12
make_24x7_request already calls log_24x7_hcall if it fails, so callers don't have to do it again. In fact, since the latter is now only called from the former, there's no need for a separate log_24x7_hcall anymore so remove it. Reviewed-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-02powerpc/perf/hv-24x7: Properly iterate through resultsThiago Jung Bauermann1-9/+23
hv-24x7.h has a comment mentioning that result_buffer->results can't be indexed as a normal array because it may contain results of variable sizes, so fix the loop in h_24x7_event_commit_txn to take the variation into account when iterating through results. Another problem in that loop is that it sets h24x7hw->events[i] to NULL. This assumes that only the i'th result maps to the i'th request, but that is not guaranteed to be true. We need to leave the event in the array so that we don't dereference a NULL pointer in case more than one result maps to one request. We still assume that each result has only one result element, so warn if that assumption is violated. Reviewed-by: Sukadev Bhattiprolu <[email protected]> Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
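Walking records of variable size, as opposed to indexing them like an array, can be sketched as follows. The layout here is a made-up simplification: struct demo_result_hdr, its fields, and both helpers are hypothetical and do not reproduce the structures in hv-24x7.h, and all fields are assumed to be in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header; the elements follow it directly. */
struct demo_result_hdr {
    uint8_t  result_ix;     /* which request this result answers */
    uint8_t  num_elements;  /* elements stored after the header */
    uint16_t element_size;  /* bytes per element */
};

/* Total bytes occupied by one record: header plus its elements. */
static size_t demo_result_size(const struct demo_result_hdr *hdr)
{
    return sizeof(*hdr) + (size_t)hdr->num_elements * hdr->element_size;
}

/* Visit up to num_results packed records in buf, storing each record's
 * result_ix in ix_out. Advances by each record's own size and stops
 * early if a record would run past buf_len. Returns records visited. */
static size_t demo_walk_results(const uint8_t *buf, size_t buf_len,
                                size_t num_results, uint8_t *ix_out)
{
    size_t off = 0, visited = 0;

    while (visited < num_results &&
           off + sizeof(struct demo_result_hdr) <= buf_len) {
        struct demo_result_hdr hdr;
        size_t sz;

        memcpy(&hdr, buf + off, sizeof(hdr)); /* avoid unaligned reads */
        sz = demo_result_size(&hdr);
        if (sz > buf_len - off)
            break;
        ix_out[visited++] = hdr.result_ix;
        off += sz;
    }
    return visited;
}
```

The loop maps each record back to its request through result_ix instead of assuming that the i'th record belongs to the i'th request, which is the assumption the message warns against.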