Age | Commit message (Collapse) | Author | Files | Lines |
|
EPT A/D was enabled in the vmcs02 EPTP regardless of the vmcs12's EPTP
value. The problem is that enabling A/D changes the behavior of L2's
x86 page table walks as seen by L1. With A/D enabled, x86 page table
walks are always treated as EPT writes.
Commit ae1e2d1082ae ("kvm: nVMX: support EPT accessed/dirty bits",
2017-03-30) tried to work around this problem by clearing the write
bit in the exit qualification for EPT violations triggered by page
walks. However, that fixup introduced the opposite bug: page-table walks
that actually set x86 A/D bits were *missing* the write bit in the exit
qualification.
This patch fixes the problem by disabling EPT A/D in the shadow MMU
when EPT A/D is disabled in vmcs12's EPTP.
Signed-off-by: Peter Feiner <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
For CONFIG_STRICT_KERNEL_RWX align __init_begin to 16M. We use 16M
since its the larger of 2M on radix and 16M on hash for our linear
mapping. The plan is to have .text, .rodata and everything upto
__init_begin marked as RX. Note we still have executable read only
data. We could further align rodata to another 16M boundary. I've used
keeping text plus rodata as read-only-executable as a trade-off to
doing read-only-executable for text and read-only for rodata.
We don't use multi PT_LOAD in PHDRS because we are not sure if all
bootloaders support them. This patch keeps PHDRS in vmlinux.lds.S as
the same they are with just one PT_LOAD for all of the kernel marked
as RWX (7).
mpe: What this means is the added alignment bloats the resulting
binary on disk, a powernv kernel goes from 17M to 22M.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This patch creates the window using text_poke_area, allocated via
get_vm_area(). text_poke_area is per CPU to avoid locking.
text_poke_area for each cpu is setup using late_initcall, prior to
setup of these alternate mapping areas, we continue to use direct
write to change/modify kernel text. With the ability to use alternate
mappings to write to kernel text, it provides us the freedom to then
turn text read-only and implement CONFIG_STRICT_KERNEL_RWX.
This code is CPU hotplug aware to ensure that the we have mappings for
any new cpus as they come online and tear down mappings for any CPUs
that go offline.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Move from mwrite() to patch_instruction() for xmon for
breakpoint addition and removal.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
So that we can implement STRICT_RWX, use patch_instruction() in
optprobes.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
arch_arm/disarm_probe() use direct assignment for copying
instructions, replace them with patch_instruction(). We don't need to
call flush_icache_range() because patch_instruction() does it for us.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Commit 9abcc981de97 ("powerpc/mm/radix: Only add X for pages
overlapping kernel text") changed the linear mapping on Radix to only
mark the kernel text executable.
However if the kernel is run relocated, for example as a kdump kernel,
then the exception vectors are split from the kernel text, ie. they
remain at real address 0.
We tend to get away with it, because the kernel itself will usually be
below 1G, which means the 1G page at 0-1G is marked executable and
everything works OK. However if the kernel is loaded above 1G, or the
system has less than 1G in total (meaning we can't use a 1G page),
then the exception vectors will not be marked executable and the
kernel will fail to boot.
Fix it by also checking if the address range overlaps the exception
vectors when deciding if we should add PAGE_KERNEL_X.
Fixes: 9abcc981de97 ("powerpc/mm/radix: Only add X for pages overlapping kernel text")
Cc: [email protected] # v4.7+
Signed-off-by: Balbir Singh <[email protected]>
[mpe: Combine with the existing check, rewrite change log]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Once upon a time there were only two PP (page protection) bits. In ISA
2.03 an additional PP bit was added, but because of the layout of the
HPTE it could not be made contiguous with the existing PP bits.
The result is that we now have three PP bits, named pp0, pp1, pp2,
where pp0 occupies bit 63 of dword 1 of the HPTE and pp1 and pp2
occupy bits 1 and 0 respectively. Until recently Linux hasn't used
pp0, however with the addition of _PAGE_KERNEL_RO we started using it.
The problem arises in the LPAR code, where we need to translate the PP
bits into the argument for the H_PROTECT hypercall. Currently the code
only passes bits 0-2 of newpp, which covers pp1, pp2 and N (no
execute), meaning pp0 is not passed to the hypervisor at all.
We can't simply pass it through in bit 63, as that would collide with a
different field in the flags argument, as defined in PAPR. Instead we
have to shift it down to bit 8 (IBM bit 55).
Fixes: e58e87adc8bf ("powerpc/mm: Update _PAGE_KERNEL_RO")
Cc: [email protected] # v4.7+
Signed-off-by: Balbir Singh <[email protected]>
[mpe: Simplify the test, rework change log]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
We can't take traps with relocation off, so blacklist enter_rtas() and
rtas_return_loc(). However, instead of blacklisting all of enter_rtas(),
introduce a new symbol __enter_rtas from where on we can't take a trap
and blacklist that.
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Blacklist all functions involved while handling a trap. We:
- convert some of the symbols into private symbols, and
- blacklist most functions involved while handling a trap.
Reviewed-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
It is actually safe to probe system_call() in entry_64.S, but only till
we unset MSR_RI. To allow this, add a new symbol system_call_exit()
after the mtmsrd and blacklist that.
Suggested-by: Michael Ellerman <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
It is common to get a PMU interrupt right after the mtmsr instruction that
enables interrupts. Due to this, the stack trace profile gets needlessly split
across system_call_common() and system_call().
Previously, system_call() symbol was at the current place to hide a few
earlier symbols which have since been made private or removed entirely.
So, let's move system_call() slightly higher up, right after the mtmsr
instruction that enables interrupts. Convert existing references to
system_call to a local syscall symbol.
Suggested-by: Nicholas Piggin <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Convert some of the symbols into private symbols and blacklist
system_call_common() and system_call() from kprobes. We can't take a
trap at parts of these functions as either MSR_RI is unset or the
kernel stack pointer is not yet setup.
Reviewed-by: Masami Hiramatsu <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
[mpe: Don't convert system_call_common to _GLOBAL()]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Commit b48bbb82e2b835 ("powerpc/64s: Don't unbalance the return branch
predictor in __replay_interrupt()") introduced __replay_interrupt_return
symbol with '.L' prefix in hopes of keeping it private. However, due to
the use of LOAD_REG_ADDR(), the assembler kept this symbol visible. Fix
the same by instead using the local label '1'.
Fixes: Commit b48bbb82e2b835 ("powerpc/64s: Don't unbalance the return branch
predictor in __replay_interrupt()")
Suggested-by: Nicholas Piggin <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Currently, we assume that the function pointer we receive in
ppc_function_entry() points to a function descriptor. However, this is
not always the case. In particular, assembly symbols without the right
annotation do not have an associated function descriptor. Some of these
symbols are added to the kprobe blacklist using _ASM_NOKPROBE_SYMBOL().
When such addresses are subsequently processed through
arch_deref_entry_point() in populate_kprobe_blacklist(), we see the
below errors during bootup:
[ 0.663963] Failed to find blacklist at 7d9b02a648029b6c
[ 0.663970] Failed to find blacklist at a14d03d0394a0001
[ 0.663972] Failed to find blacklist at 7d5302a6f94d0388
[ 0.663973] Failed to find blacklist at 48027d11e8610178
[ 0.663974] Failed to find blacklist at f8010070f8410080
[ 0.663976] Failed to find blacklist at 386100704801f89d
[ 0.663977] Failed to find blacklist at 7d5302a6f94d00b0
Fix this by checking if the function pointer we receive in
ppc_function_entry() already points to kernel text. If so, we just
return it as is. If not, we assume that this is a function descriptor
and proceed to dereference it.
Suggested-by: Nicholas Piggin <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This patch exports a in-kernel 'library' API which can be called by
other drivers to help interacting with an IBM XSL on a POWER9 system.
The XSL (Translation Service Layer) is a stripped down version of the
PSL (Power Service Layer) used in some cards such as the Mellanox CX5.
Like the PSL, it implements the CAIA architecture, but has a number
of differences, mostly in it's implementation dependent registers.
The XSL also uses a special DMA cxl mode, which uses a slightly
different init sequence for the CAPP and PHB.
Signed-off-by: Andrew Donnellan <[email protected]>
Signed-off-by: Christophe Lombard <[email protected]>
Acked-by: Frederic Barrat <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Merge our fixes branch, a few of them are tripping people up while
working on top of next, and we also have a dependency between the CXL
fixes and new CXL code we want to merge into next.
|
|
Most of DT files in PowerPC use #include "..." to make pre-processor
include DT in the same directory, but we have 3 exceptional files
that use #include <...> for that.
Fix them to remove -I$(srctree)/arch/$(SRCARCH)/boot/dts path from
dtc_cpp_flags.
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
* pci/host-hv:
PCI: hv: Use vPCI protocol version 1.2
PCI: hv: Add vPCI version protocol negotiation
PCI: hv: Temporary own CPU-number-to-vCPU-number infra
PCI: hv: Use page allocation for hbus structure
PCI: hv: Fix comment formatting and use proper integer fields
|
|
* pci/irq-fixups:
arm64: PCI: Drop DT IRQ allocation from pcibios_alloc_irq()
PCI: xilinx-nwl: Move to struct pci_host_bridge IRQ mapping functions
PCI: rockchip: Move to struct pci_host_bridge IRQ mapping functions
PCI: xgene: Move to struct pci_host_bridge IRQ mapping functions
PCI: altera: Drop pci_fixup_irqs()
PCI: versatile: Drop pci_fixup_irqs()
PCI: generic: Drop pci_fixup_irqs()
PCI: faraday: Drop pci_fixup_irqs()
PCI: designware: Drop pci_fixup_irqs()
PCI: iproc: Drop pci_fixup_irqs()
PCI: rcar: Drop pci_fixup_irqs()
PCI: xilinx: Drop pci_fixup_irqs()
PCI: tegra: Drop pci_fixup_irqs()
ARM/PCI: Remove pci_fixup_irqs() call for bios32 host controllers
PCI: Add a call to pci_assign_irq() in pci_device_probe()
OF/PCI: Update of_irq_parse_and_map_pci() comment
PCI: Add pci_assign_irq() function and have pci_fixup_irqs() use it
PCI: Add IRQ mapping function pointers to pci_host_bridge struct
PCI: Build setup-irq.o on all arches
PCI: Remove pci_scan_root_bus_msi()
PCI: xilinx-nwl: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: rockchip: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: generic: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: xgene: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: xilinx: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: altera: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: versatile: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: iproc: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: rcar: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: aardvark: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: designware: Convert PCI scan API to pci_scan_root_bus_bridge()
ARM/PCI: Convert PCI scan API to pci_scan_root_bus_bridge()
PCI: Make pci_register_host_bridge() PCI core internal
PCI: Add pci_scan_root_bus_bridge() interface
PCI: tegra: Fix host bridge memory leakage
PCI: faraday: Fix host bridge memory leakage
PCI: Add devm_pci_alloc_host_bridge() interface
PCI: Add pci_free_host_bridge() interface
PCI: Initialize bridge release function at bridge allocation
PCI: faraday: Convert IRQ masking to raw PCI config accessors
PCI: iproc: Convert link check to raw PCI config accessors
PCI: xilinx-nwl: Remove nwl_pcie_enable_msi() unused bus parameter
|
|
The S500 SoC can start secondary CPUs without busy-looping for pen_release,
so simplify the SMP code compared to the LeMaker kernel tree.
Fixes: 172067e0bc87 ("ARM: owl: Implement CPU enable-method for S500")
Suggested-by: Arnd Bergmann <[email protected]>
Cc: David Liu <[email protected]>
Signed-off-by: Andreas Färber <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
Rely on the fallback to "Generic DT based system".
This change is visible in /proc/cpuinfo.
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Andreas Färber <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
* pm-sleep:
PM: hibernate: constify attribute_group structures.
PM / hibernate: Drop redundant parameter of swsusp_alloc()
PM / hibernate: Use CONFIG_HAVE_SET_MEMORY for include condition
x86/power/64: Use char arrays for asm function names
|
|
* pm-cpufreq:
cpufreq / CPPC: Initialize policy->min to lowest nonlinear performance
cpufreq: sfi: make freq_table static
cpufreq: exynos5440: Fix inconsistent indenting
cpufreq: imx6q: imx6ull should use the same flow as imx6ul
cpufreq: dt: Add support for hi3660
* intel_pstate:
cpufreq: Update scaling_cur_freq documentation
cpufreq: intel_pstate: Clean up after performance governor changes
intel_pstate: skip scheduler hook when in "performance" mode
intel_pstate: delete scheduler hook in HWP mode
x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF
cpufreq: intel_pstate: Remove max/min fractions to limit performance
x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"
* pm-cpuidle:
cpuidle: menu: allow state 0 to be disabled
intel_idle: Use more common logging style
x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems
ARM: cpuidle: Support asymmetric idle definition
|
|
* pm-tools:
cpupower: Add support for new AMD family 0x17
cpupower: Fix bug where return value was not used
tools/power turbostat: update version number
tools/power turbostat: decode MSR_IA32_MISC_ENABLE only on Intel
tools/power turbostat: stop migrating, unless '-m'
tools/power turbostat: if --debug, print sampling overhead
tools/power turbostat: hide SKL counters, when not requested
intel_pstate: use updated msr-index.h HWP.EPP values
tools/power x86_energy_perf_policy: support HWP.EPP
x86: msr-index.h: fix shifts to ULL results in HWP macros.
x86: msr-index.h: define HWP.EPP values
x86: msr-index.h: define EPB mid-points
|
|
Merge 'uuid-types' from git://git.infradead.org/users/hch/uuid.git
|
|
next/fixes-non-critical
mvebu fixes for 4.12 (part 2)
Fix Openblock A6 (kirkwood base board) nand partition overlap
* tag 'mvebu-fixes-4.12-2' of git://git.infradead.org/linux-mvebu:
ARM: dts: kirkwood: Fix Openblock A6 nand partition overlap
arm64: marvell: dts: fix interrupts in 7k/8k crypto nodes
|
|
Userspace application can do a hypercall through /dev/xen/privcmd, and
some for some hypercalls argument is a pointers to user-provided
structure. When SMAP is supported and enabled, hypervisor can't access.
So, lets allow it.
The same applies to HYPERVISOR_dm_op, where additionally privcmd driver
carefully verify buffer addresses.
Cc: [email protected]
Signed-off-by: Marek Marczykowski-Górecki <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
|
|
Remove unnecessary variable mfn in function xen_foreach_remap_area() and,
refactor the code.
Variable mfn at line 518:mfn = xen_remap_buf.mfns[i];
is only being used to store a value to be passed as
an argument to the xen_update_mem_tables() function.
This value can be passed directly, which makes variable
mfn unnecessary. Also, value assigned to variable mfn
at line 534:mfn = xen_remap_mfn; is never used.
Addresses-Coverity-ID: 1260110
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
|
|
of_device_ids are not supposed to change at runtime. All functions
working with of_device_ids provided by <linux/of.h> work with const
of_device_ids. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Adds the plumbing to disable A/D bits in the MMU based on a new role
bit, ad_disabled. When A/D is disabled, the MMU operates as though A/D
aren't available (i.e., using access tracking faults instead).
To avoid SP -> kvm_mmu_page.role.ad_disabled lookups all over the
place, A/D disablement is now stored in the SPTE. This state is stored
in the SPTE by tweaking the use of SPTE_SPECIAL_MASK for access
tracking. Rather than just setting SPTE_SPECIAL_MASK when an
access-tracking SPTE is non-present, we now always set
SPTE_SPECIAL_MASK for access-tracking SPTEs.
Signed-off-by: Peter Feiner <[email protected]>
[Use role.ad_disabled even for direct (non-shadow) EPT page tables. Add
documentation and a few MMU_WARN_ONs. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Specify both a mask (i.e., bits to consider) and a value (i.e.,
pattern of bits that indicates a special PTE) for mmio SPTEs. On
Intel, this lets us pack even more information into the
(SPTE_SPECIAL_MASK | EPT_VMX_RWX_MASK) mask we use for access
tracking liberating all (SPTE_SPECIAL_MASK | (non-misconfigured-RWX))
values.
Signed-off-by: Peter Feiner <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The MMU always has hardware A bits or access tracking support, thus
it's unnecessary to handle the scenario where we have neither.
Signed-off-by: Peter Feiner <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
- Better machine check handling for HV KVM
- Ability to support guests with threads=2, 4 or 8 on POWER9
- Fix for a race that could cause delayed recognition of signals
- Fix for a bug where POWER9 guests could sleep with interrupts
pending.
|
|
This fixes a typo where the wrong loop index was used to index
the kvmppc_xive_vcpu.queues[] array in xive_pre_save_scan().
The variable i contains the vcpu number; we need to index queues[]
using j, which iterates from 0 to KVMPPC_XIVE_Q_COUNT-1.
The effect of this bug is that things that save the interrupt
controller state, such as "virsh dump", on a VM with more than
8 vCPUs, result in xive_pre_save_queue() getting called on a
bogus queue structure, usually resulting in a crash like this:
[ 501.821107] Unable to handle kernel paging request for data at address 0x00000084
[ 501.821212] Faulting instruction address: 0xc008000004c7c6f8
[ 501.821234] Oops: Kernel access of bad area, sig: 11 [#1]
[ 501.821305] SMP NR_CPUS=1024
[ 501.821307] NUMA
[ 501.821376] PowerNV
[ 501.821470] Modules linked in: vhost_net vhost tap xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables ses enclosure scsi_transport_sas ipmi_powernv ipmi_devintf ipmi_msghandler powernv_op_panel kvm_hv nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc kvm tg3 ptp pps_core
[ 501.822477] CPU: 3 PID: 3934 Comm: live_migration Not tainted 4.11.0-4.git8caa70f.el7.centos.ppc64le #1
[ 501.822633] task: c0000003f9e3ae80 task.stack: c0000003f9ed4000
[ 501.822745] NIP: c008000004c7c6f8 LR: c008000004c7c628 CTR: 0000000030058018
[ 501.822877] REGS: c0000003f9ed7980 TRAP: 0300 Not tainted (4.11.0-4.git8caa70f.el7.centos.ppc64le)
[ 501.823030] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 501.823047] CR: 28022244 XER: 00000000
[ 501.823203] CFAR: c008000004c7c77c DAR: 0000000000000084 DSISR: 40000000 SOFTE: 1
[ 501.823203] GPR00: c008000004c7c628 c0000003f9ed7c00 c008000004c91450 00000000000000ff
[ 501.823203] GPR04: c0000003f5580000 c0000003f559bf98 9000000000009033 0000000000000000
[ 501.823203] GPR08: 0000000000000084 0000000000000000 00000000000001e0 9000000000001003
[ 501.823203] GPR12: c00000000008a7d0 c00000000fdc1b00 000000000a9a0000 0000000000000000
[ 501.823203] GPR16: 00000000402954e8 000000000a9a0000 0000000000000004 0000000000000000
[ 501.823203] GPR20: 0000000000000008 c000000002e8f180 c000000002e8f1e0 0000000000000001
[ 501.823203] GPR24: 0000000000000008 c0000003f5580008 c0000003f4564018 c000000002e8f1e8
[ 501.823203] GPR28: 00003ff6e58bdc28 c0000003f4564000 0000000000000000 0000000000000000
[ 501.825441] NIP [c008000004c7c6f8] xive_get_attr+0x3b8/0x5b0 [kvm]
[ 501.825671] LR [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm]
[ 501.825887] Call Trace:
[ 501.825991] [c0000003f9ed7c00] [c008000004c7c628] xive_get_attr+0x2e8/0x5b0 [kvm] (unreliable)
[ 501.826312] [c0000003f9ed7cd0] [c008000004c62ec4] kvm_device_ioctl_attr+0x64/0xa0 [kvm]
[ 501.826581] [c0000003f9ed7d20] [c008000004c62fcc] kvm_device_ioctl+0xcc/0xf0 [kvm]
[ 501.826843] [c0000003f9ed7d40] [c000000000350c70] do_vfs_ioctl+0xd0/0x8c0
[ 501.827060] [c0000003f9ed7de0] [c000000000351534] SyS_ioctl+0xd4/0xf0
[ 501.827282] [c0000003f9ed7e30] [c00000000000b8e0] system_call+0x38/0xfc
[ 501.827496] Instruction dump:
[ 501.827632] 419e0078 3b760008 e9160008 83fb000c 83db0010 80fb0008 2f280000 60000000
[ 501.827901] 60000000 60420000 419a0050 7be91764 <7d284c2c> 552a0ffe 7f8af040 419e003c
[ 501.828176] ---[ end trace 2d0529a5bbbbafed ]---
Cc: [email protected]
Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
* pci/resource:
PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11
PCI: Do not disregard parent resources starting at 0x0
Conflicts:
arch/x86/pci/fixup.c
|
|
* pci/pm:
PCI/PM: Avoid using device_may_wakeup() for runtime PM
x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect
PCI/PM: Restore the status of PCI devices across hibernation
drm/radeon: make MacBook Pro d3_delay quirk more generic
drm/amdgpu: remove unnecessary save/restore of pdev->d3_delay
PCI/PM: Add needs_resume flag to avoid suspend complete optimization
PCI: imx6: Fix config read timeout handling
switchtec: Fix minor bug with partition ID register
switchtec: Use new cdev_device_add() helper function
PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA
|
|
Update the Hyper-V vPCI driver to use the Server-2016 version of the vPCI
protocol, fixing MSI creation and retargeting issues.
Signed-off-by: Jork Loeser <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: K. Y. Srinivasan <[email protected]>
Acked-by: K. Y. Srinivasan <[email protected]>
|
|
With the introduction of struct pci_host_bridge.map_irq pointer it is
possible to assign IRQs for all devices originating from a PCI host bridge
at probe time; this is implemented through pci_assign_irq() that relies on
the struct pci_host_bridge.map_irq pointer to map IRQ for a given device.
The benefits this brings are twofold:
- the IRQ for a device is assigned once at probe time
- the IRQ assignment works also for hotplugged devices
With all DT based PCI host bridges converted to the struct
pci_host_bridge.{map/swizzle}_irq hooks mechanism the DT IRQ allocation in
ARM64 pcibios_alloc_irq() is now redundant and can be removed.
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Will Deacon <[email protected]>
|
|
Legacy PCI host controllers (ie host controllers that set-up the PCI bus
through the ARM pci_common_init() API) are currently relying on
pci_fixup_irqs() to assign legacy PCI irqs to devices. This is not ideal
in that pci_fixup_irqs() assigns IRQs for all PCI devices present in a given
system some of which may well be enabled by the time pci_fixup_irqs() is
called (ie a system with multiple host controllers). With the introduction
of struct pci_host_bridge.(*map_irq) pointer it is possible to assign IRQs
for all devices originating from a PCI host bridge at probe time; this is
implemented through pci_assign_irq() that relies on the struct
pci_host_bridge.map_irq pointer to map IRQ for a given device.
The benefits this brings are twofold:
- the IRQ for a device is assigned once at probe time
- the IRQ assignment works also for hotplugged devices
Remove pci_fixup_irqs() call from bios32 code and rely on pci_assign_irq()
to carry out the IRQ mapping at device probe time.
The map_irq() and swizzle_irq() struct pci_host_bridge callbacks are set-up
in the struct pci_host_bridge created in the bios32 pcibios_init_hw()
function and mach-* code paths (for PCI mach implementations that require a
specific struct hw_pci.(*scan) function callback).
Signed-off-by: Lorenzo Pieralisi <[email protected]>
[bhelgaas: folded in fixes from Lorenzo:
http://lkml.kernel.org/r/20170701140629.GC8977@red-moon]
Signed-off-by: Bjorn Helgaas <[email protected]>
Cc: Jason Cooper <[email protected]>
Cc: Russell King <[email protected]>
Cc: Andrew Lunn <[email protected]>
|
|
When a process runs out of stack the parisc kernel wrongly faults with SIGBUS
instead of the expected SIGSEGV signal.
This example shows how the kernel faults:
do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000]
trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000
The vma->vm_end value is the first address which does not belong to the vma, so
adjust the check to include vma->vm_end to the range for which to send the
SIGSEGV signal.
This patch unbreaks building the debian libsigsegv package.
Cc: [email protected]
Signed-off-by: Helge Deller <[email protected]>
|
|
Architectures with a compat syscall table must put compat_sys_keyctl()
in it, not sys_keyctl(). The parisc architecture was not doing this;
fix it.
Cc: [email protected]
Signed-off-by: Eric Biggers <[email protected]>
Acked-by: Helge Deller <[email protected]>
Signed-off-by: Helge Deller <[email protected]>
|
|
Pull MIPS fixes from Ralf Baechle:
"Here's a final round of fixes for 4.12:
- Fix misordered instructions in assembly code making kenel startup
via UHB unreliable.
- Fix special case of MADDF and MADDF emulation.
- Fix alignment issue in address calculation in pm-cps on 64 bit.
- Fix IRQ tracing & lockdep when rescheduling
- Systems with MAARs require post-DMA cache flushes.
The reordering fix and the MADDF/MSUBF fix have sat in linux-next for
a number of days. The others haven't propagated from my pull tree to
linux-next yet but all have survived manual testing and Imagination's
automated test system and there are no pending bug reports"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: Avoid accidental raw backtrace
MIPS: Perform post-DMA cache flushes on systems with MAARs
MIPS: Fix IRQ tracing & lockdep when rescheduling
MIPS: pm-cps: Drop manual cache-line alignment of ready_count
MIPS: math-emu: Handle zero accumulator case in MADDF and MSUBF separately
MIPS: head: Reorder instructions missing a delay slot
|
|
Pull ARM fix from Russell King:
"One final fix for 4.12 - Doug found a boot failure case triggered by
requesting a non-even MB vmalloc size"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8685/1: ensure memblock-limit is pmd-aligned
|
|
On POWER9 SMT8 the 24x7 API returns two result elements for physical core
and virtual CPU events and we need to add their counts to get the final
result.
Reviewed-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
POWER9 introduces a new version of the hypervisor API to access the 24x7
perf counters. The new version changed some of the structures used for
requests and results.
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
There's an H24x7_DATA_BUFFER_SIZE constant, so use it in init_24x7_request.
There's also an HV_PERF_DOMAIN_MAX constant, so use it in
h_24x7_event_init. This makes the comment above the check redundant,
so remove it.
In add_event_to_24x7_request, a statement is terminated with a comma
instead of a semicolon. Fix it.
In hv-24x7.h, improve comments in struct hv_24x7_result.
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The H_GET_24X7_CATALOG_PAGE hcall can return a signed error code, so fix
this in the code.
The H_GET_24X7_DATA hcall can return a signed error code, so fix this in
the code. Also, don't truncate it to 32 bit to use as return value for
make_24x7_request. In case of error h_24x7_event_commit_txn passes that
return value to generic code, so it should be a proper errno. The other
caller of make_24x7_request is single_24x7_request, whose callers don't
actually care which error code is returned so they are not affected by this
change.
Finally, h_24x7_get_value doesn't use the error code from
single_24x7_request, so there's no need to store it.
Reviewed-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
make_24x7_request already calls log_24x7_hcall if it fails, so callers
don't have to do it again.
In fact, since the latter is now only called from the former, there's no
need for a separate log_24x7_hcall anymore so remove it.
Reviewed-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
hv-24x7.h has a comment mentioning that result_buffer->results can't be
indexed as a normal array because it may contain results of variable sizes,
so fix the loop in h_24x7_event_commit_txn to take the variation into
account when iterating through results.
Another problem in that loop is that it sets h24x7hw->events[i] to NULL.
This assumes that only the i'th result maps to the i'th request, but that
is not guaranteed to be true. We need to leave the event in the array so
that we don't dereference a NULL pointer in case more than one result maps
to one request.
We still assume that each result has only one result element, so warn if
that assumption is violated.
Reviewed-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Thiago Jung Bauermann <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|