Age | Commit message (Collapse) | Author | Files | Lines |
|
The instruction code for xxlor that commit 0016a4cf5582 ("powerpc:
Emulate most Book I instructions in emulate_step()", 2010-06-15)
added is actually the code for xxlnor. It is used in get_vsr()
and put_vsr() and the effect of the error is that if emulate_step
is used to emulate a VSX load or store from any register other
than vsr0, the bitwise complement of the correct value will be
loaded or stored. This corrects the error.
Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()")
Signed-off-by: Paul Mackerras <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Anton noticed that if we fault part way through emulating an unaligned
instruction, we don't update the DAR to reflect that.
The DAR value is eventually reported back to userspace as the address
in the SEGV signal, and if userspace is using that value to demand
fault then it can be confused by us not setting the value correctly.
This patch is ugly as hell, but is intended to be the minimal fix and
back ports easily.
Cc: [email protected]
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Paul Mackerras <[email protected]>
|
|
A 31-bit compat process can force a BUG_ON in crst_table_upgrade
with specific, invalid mmap calls, e.g.
mmap((void*) 0x7fff8000, 0x10000, 3, 32, -1, 0)
The arch_get_unmapped_area[_topdown] functions miss an if condition
in the decision to do a page table upgrade.
Fixes: 9b11c7912d00 ("s390/mm: simplify arch_get_unmapped_area[_topdown]")
Cc: <[email protected]> # v4.12+
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The mm->context.asce field of a new process is not set up correctly
in case of a fork with a 5 level page table.
Add the missing case to init_new_context().
Fixes: 1aea9b3f9210 ("s390/mm: implement 5 level pages tables")
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The machine check information is part of the vsie_page.
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
|
|
Move the real logic that always has to be executed out of the
WARN_ON_ONCE.
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
|
|
Looks like the "overflowing" range check is wrong.
|=======b-------a=======|
addr >= a || addr <= b
Signed-off-by: David Hildenbrand <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
|
|
assigned lmbs
Check if an LMB is assigned before attempting to call dlpar_acquire_drc
in order to avoid any unnecessary rtas calls. This substantially
reduces the running time of memory hot add on lpars with large amounts
of memory.
[mpe: We need to explicitly set rc to 0 in the success case, otherwise
the compiler might think we use rc without initialising it.]
Fixes: c21f515c7436 ("powerpc/pseries: Make the acquire/release of the drc for memory a seperate step")
Cc: [email protected] # v4.11+
Signed-off-by: John Allen <[email protected]>
Reviewed-by: Nathan Fontenot <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
and EFI_LOADER_* from KASLR's choice
There's a potential bug in how we select the KASLR kernel address n
the early boot code.
The KASLR boot code currently chooses the kernel image's physical memory
location from E820_TYPE_RAM regions by walking over all e820 entries.
E820_TYPE_RAM includes EFI_BOOT_SERVICES_CODE and EFI_BOOT_SERVICES_DATA
as well, so those regions can end up hosting the kernel image. According to
the UEFI spec, all memory regions marked as EfiBootServicesCode and
EfiBootServicesData are available as free memory after the first call
to ExitBootServices(). I.e. so such regions should be usable for the
kernel, per spec.
In real life however, we have workarounds for broken x86 firmware,
where we keep such regions reserved until SetVirtualAddressMap() is done.
See the following code in should_map_region():
static bool should_map_region(efi_memory_desc_t *md)
{
...
/*
* Map boot services regions as a workaround for buggy
* firmware that accesses them even when they shouldn't.
*
* See efi_{reserve,free}_boot_services().
*/
if (md->type =3D=3D EFI_BOOT_SERVICES_CODE ||
md->type =3D=3D EFI_BOOT_SERVICES_DATA)
return false;
This workaround suppressed a boot crash, but potential issues still
remain because no one prevents the regions from overlapping with kernel
image by KASLR.
So let's make sure that EFI_BOOT_SERVICES_{CODE|DATA} regions are never
chosen as kernel memory for the workaround to work fine.
Furthermore, EFI_LOADER_{CODE|DATA} regions are also excluded because
they can be used after ExitBootServices() as defined in EFI spec.
As a result, we choose kernel address only from EFI_CONVENTIONAL_MEMORY
which is the only memory type we know to be safely free.
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Junichi Nomura <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Garnier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Rewrote/fixed/clarified the changelog and the in code comments. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
struct platform_suspend_ops are not supposed to change at runtime.
Functions suspend_set_ops working with const platform_suspend_ops. So
mark the non-const structs as const.
Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
without the feature
When booting 4.13 on a VirtualBox VM on a Skylake host the following
error shows up in the logs:
[ 0.000000] [Firmware Bug]: TSC_DEADLINE disabled due to Errata;
please update microcode to version: 0xb2 (or later)
This is caused by apic_check_deadline_errata() only checking CPU model
and not the X86_FEATURE_TSC_DEADLINE_TIMER flag (which VirtualBox does
NOT export to the guest), combined with VirtualBox not exporting the
micro-code version to the guest.
This commit adds a check for X86_FEATURE_TSC_DEADLINE_TIMER to
apic_check_deadline_errata(), silencing this error on VirtualBox VMs.
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Frank Mehnert <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michael Thayer <[email protected]>
Cc: Michal Necasek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: bd9240a18e ("x86/apic: Add TSC_DEADLINE quirk due to errata")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
There's a subtle bug in how some of the paravirt guest code handles
page table freeing on x86:
On x86 software page table walkers depend on the fact that remote TLB flush
does an IPI: walk is performed lockless but with interrupts disabled and in
case the page table is freed the freeing CPU will get blocked as remote TLB
flush is required. On other architectures which don't require an IPI to do
remote TLB flush we have an RCU-based mechanism (see
include/asm-generic/tlb.h for more details).
In virtualized environments we may want to override the ->flush_tlb_others
callback in pv_mmu_ops and use a hypercall asking the hypervisor to do a
remote TLB flush for us. This breaks the assumption about IPIs. Xen PV has
been doing this for years and the upcoming remote TLB flush for Hyper-V will
do it too.
This is not safe, as software page table walkers may step on an already
freed page.
Fix the bug by enabling the RCU-based page table freeing mechanism,
CONFIG_HAVE_RCU_TABLE_FREE=y.
Testing with kernbench and mmap/munmap microbenchmarks, and neither showed
any noticeable performance impact.
Suggested-by: Peter Zijlstra <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Acked-by: Juergen Gross <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Jork Loeser <[email protected]>
Cc: KY Srinivasan <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Rewrote/fixed/clarified the changelog. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The lack of newlines in preceding format strings is a clear indication
that these were meant to be continuations of one another, and indeed
output ends up quite a bit more compact (and readable) that way.
Switch other plain printk()-s in the function instances to pr_info(),
as requested.
Signed-off-by: Jan Beulich <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Stephen reported a merge conflict with the XEN tree. That also shows that the
IDT cleanup forgot to remove the now unused trace_{trap} defines.
Remove them.
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Juergen Gross <[email protected]>
|
|
In previous generations of Power processors each core had a private L2
cache. The Power 9 processor has a slightly different design where the
L2 cache is shared among pairs of cores rather than being completely
private.
Making the scheduler aware of this cache sharing allows the scheduler to
make better migration decisions. For example, if two CPU heavy tasks
share a core then one task can be migrated to the paired core to improve
throughput. Under the existing three level topology the task could be
migrated to any core on the same chip, while with the new topology it
would be preferentially migrated to the paired core so it remains
cache-hot.
Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
We want to add an extra level to the CPU scheduler topology to account
for cores which share a cache. To do this we need to build a cpumask
for each CPU that indicates which CPUs share this cache to use as an
input to the scheduler.
Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The CPU scheduler topology is constructed from a number of per-cpu
cpumasks which describe which sets of logical CPUs are related in some
fashion. Current code that handles constructing these masks when CPUs
are hot(un)plugged can be simplified a bit by exploiting the fact that
the scheduler requires higher levels of the toplogy (e.g package level
groupings) to be supersets of the lower levels (e.g. threas in a core).
This patch reworks the cpumask construction to be simpler and easier to
extend with extra topology levels.
Signed-off-by: Oliver O'Halloran <[email protected]>
[mpe: Fix CONFIG_HOTPLUG_CPU=n build]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
When building the CPU scheduler topology the kernel uses the ibm,chipid
property from the devicetree to group logical CPUs. Currently the DT
search for this property is open-coded in smp.c and this functionality
is a duplication of what's in cpu_to_chip_id() already. This patch
removes the existing search in favor of that.
It's worth mentioning that the semantics of the search are different
in cpu_to_chip_id(). When there is no ibm,chipid in the CPUs node it
will also search /cpus and / for the property, but this should not
effect the output topology.
Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
mpsc.c and mpc52xx-psc.c are platform-specific serial drivers, and
should be compiled for the respective platforms only.
Signed-off-by: Hannes Reinecke <[email protected]>
Reviewed-by: Torsten Duwe <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
.llong is an undocumented PPC specific directive. The generic
equivalent is .quad, but even better (because it's self describing) is
.8byte.
Convert all .llong directives to .8byte.
Signed-off-by: Tobin C. Harding <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Enable 64K page size and THP. I use ppc64le_defconfig when I need
a single config across guest and host, but having 4K page size
as default is not what I expect. I could move these over to
server.config and merge if ppc64_defconfig is meant for systems
that use 4k pages by default.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Most (all?) distros turn these on, so it makes sense to enable them
for testing coverage, and they're also useful for developers.
Signed-off-by: Balbir Singh <[email protected]>
Acked-by: Naveen N. Rao <[email protected]>
[mpe: Reword change log]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Add support for printing the PIDR/TIDR for ISA 300 and PSSCR and PTCR
in ISA 3.0 hypervisor mode.
SPRN_PSSCR_PR is the privileged mode access and is used when we are
not in hypervisor mode.
Signed-off-by: Balbir Singh <[email protected]>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This patch adds support to xmon for dumping the AMR, UAMOR, AMOR and
IAMR SPRs based on their supported ISA revisions.
Signed-off-by: Balbir Singh <[email protected]>
[mpe: Split out of larger patch]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
ISA 3.0 defines hypervisor decrementer to be 64 bits in length.
This patch extends the print format for to be 64 bits.
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Remove unneeded variables and assignments.
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
When we map memory at boot we print out the ranges of real addresses
that we mapped and the page size that was used.
Currently it's a bit ugly:
Mapped range 0x0 - 0x2000000000 with 0x40000000
Mapped range 0x200000000000 - 0x202000000000 with 0x40000000
Pad the addresses so they line up, and print the page size using
actual units, eg:
Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages
Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages
Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Make the printks look a bit nicer by adding a prefix.
Signed-off-by: Michael Ellerman <[email protected]>
|
|
For a PCI device it's pci_dn can be retrieved from
pdev->dev.archdata.firmware_data, PCI_DN(devnode), or parent's list.
Thus, we should just use the existing function pci_get_pdn_by_devfn
to get the pci_dn.
Signed-off-by: Bryant G. Ly <[email protected]>
Reviewed-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define interfaces (wrappers) to the 'copy' and 'paste'
instructions (which are new in PowerISA 3.0). These are intended to be
used to by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to
the NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define an interface to open a VAS send window. This interface is
intended to be used the Nest Accelerator (NX) driver(s) to open
a send window and use it to submit compression/encryption requests
to a VAS receive window.
The receive window, identified by the [vasid, cop] parameters, must
already be open in VAS (i.e connected to an NX engine).
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define the vas_win_close() interface which should be used to close a
send or receive windows.
While the hardware configurations required to open send and receive
windows differ, the configuration to close a window is the same for
both. So we use a single interface to close the window.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define the vas_rx_win_open() interface. This interface is intended to
be used by the Nest Accelerator (NX) driver(s) to setup receive
windows for one or more NX engines (which implement compression &
encryption algorithms in the hardware).
Follow-on patches will provide an interface to close the window and to
open a send window that kernel subsystems can use to access the NX
engines.
The interface to open a receive window is expected to be invoked for
each instance of VAS in the system.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define helpers to allocate/free VAS window objects. These will be used
in follow-on patches when opening/closing windows.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define helpers to initialize window context registers of the VAS
hardware. These will be used in follow-on patches when opening/closing
VAS windows.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define some helper functions to access the MMIO regions. We use these
in follow-on patches to read/write VAS hardware registers. They are
also used to later issue 'paste' instructions to submit requests to
the NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.
VAS is currently only supported with 64K page size.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other
users of VAS, including NX-842 can use those macros.
There is a lot of related code between the VAS/NX kernel drivers
and skiboot. For consistency, switch the order of parameters in
SET_FIELD to match the order in skiboot.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Reviewed-by: Dan Streetman <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Define macros for the VAS hardware registers and bit-fields as well
as couple of data structures needed by the VAS driver.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
[mpe: Fixup include guard to use _ASM_POWERPC_VAS_H]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Convert 0.16x to 0.16lx. Otherwise we lose the top 8 nibbles and
effectively print only the last 32 bits.
Fixes: 1846193b178d ("powerpc/xmon: Dump ISA 2.06 SPRs")
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The check_req() helper uses pci_get_pdn() to get an OF node pointer.
pci_get_pdn() returns a pci_dn pointer which either:
1) from the OF node returned by pci_device_to_OF_node();
2) from the parent child_list where entries don't have OF node pointers.
Since check_req() does not care about 2), it can call
pci_device_to_OF_node() directly, hence the change.
The find_pe_dn() helper uses embedded pci_dn to get an OF node which is
also stored in edev->pdev so let's take a shortcut and call
pci_device_to_OF_node() directly.
With these 2 changes, we can finally get rid of the OF node back pointer.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The pci_dn struct caches a OF device node pointer in order to access
the "ibm,loc-code" property when EEH is recovering.
However, when this happens in eeh_dev_check_failure(), we also have
a pci_dev pointer which should have a valid pointer to the device node
when pci_dn has one (both pointers are not NULL for physical functions
and are NULL for virtual functions).
This changes pci_remove_device_node_info() to look for a parent of
the node being removed, just like pci_add_device_node_info() does when it
references the parent node.
This is the first step to get rid of pci_dn::node.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The eeh_dev struct hold a config space address of an associated node
and the very same address is also stored in the pci_dn struct which
is always present during the eeh_dev lifetime.
This uses bus:devfn directly from pci_dn instead of cached and packed
config_addr.
Since config_addr is made from device's bus:dev.fn, there is no point
in keeping it in the debugfs either so remove that too.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The eeh_dev struct already holds a pointer to pci_dn which it does not
exist without and pci_dn itself holds the very same pointer so just
use it.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
arch/powerpc/kernel/eeh_dev.c:57 is the only legit place where edev
is allocated; other 2 places allocate it on stack and in the heap for
a very short period of time to use eeh_pe_get() as takes edev.
This changes eeh_pe_get() to receive required parameters explicitly.
This removes unnecessary temporary allocation of edev.
This uses the "pe_no" name instead of the "pe_config_addr" name as
it actually is a PE number and not a config space address as it seemed.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Andrew Donnellan <[email protected]>
Acked-by: Russell Currey <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
pdev is always NULL, remove it.
To make checkpatch.pl happy, this also removes the "out of memory"
message.
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Reviewed-by: Andrew Donnellan <[email protected]>
Acked-by: Russell Currey <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
clk_div_tables are not supposed to change at runtime.
mpc512x_clk_divtable function working with const clk_div_table. So
mark the non-const structs as const.
Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
My static checker complains that 0x00001800 >> 13 is zero. Looking at
the context, it seems like a copy and paste bug from the line below
and probably 0x3 << 13 or 0x00006000 was intended.
Fixes: 2af59f7d5c3e ("[POWERPC] 4xx: Add 405GPr and 405EP support in boot wrapper")
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
There is a cut and paste error here so we use sizeof(struct mpc83xx_pmc)
to remap the memory for "clock_regs". That sizeof() is 20 bytes and we
only need to remap 12 bytes. It presumably doesn't affect run time too
much...
I changed them to both use sizeof(*variable_name) because that's the
preferred kernel style these days.
Fixes: d49747bdfb2d ("powerpc/mpc83xx: Power Management support")
Signed-off-by: Dan Carpenter <[email protected]>
[mpe: It will map at least one page anyway, but still a good cleanup]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Use nmi_enter similarly to system reset interrupts. This uses NMI
printk NMI buffers and turns off various debugging facilities that
helps avoid tripping on ourselves or other CPUs.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|