|
Raghu noticed an issue with excessive memory allocation on power with a
simple cgroup test, specifically, in mem_cgroup_css_alloc ->
for_each_node -> alloc_mem_cgroup_per_zone_info(), which ends up blowing
up the kmalloc-2048 slab (on the order of 200MB for 400 cgroup
directories).
The underlying issue is that NODES_SHIFT on power is 8 (256 NUMA nodes
possible), which defines node_possible_map, which in turn defines the
value of nr_node_ids in setup_nr_node_ids and the iteration of
for_each_node.
In practice, we never see a system with 256 NUMA nodes, and in fact, we
do not support node hotplug on power in the first place, so the nodes
that are online when we come up are the nodes that will be present for
the lifetime of this kernel. So let's, at least, drop the NUMA possible
map down to the online map at runtime. This is similar to what x86 does
in its initialization routines.
mem_cgroup_css_alloc should also be fixed to only iterate over
memory-populated nodes and handle hotplug, but that is a separate
change.
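The figures above can be checked with a little arithmetic. This is only a sketch: it assumes each visited node costs one kmalloc-2048 object per cgroup (inferred from the slab name), and the helper names are made up for illustration.

```c
#include <stddef.h>

/* NODES_SHIFT is 8 on power, so 1 << 8 = 256 nodes are "possible". */
static size_t possible_nodes(unsigned int nodes_shift)
{
	return (size_t)1 << nodes_shift;
}

/*
 * Hypothetical helper: bytes consumed when one slab object is allocated
 * for every node the loop visits, once per cgroup directory.
 */
static size_t per_node_alloc_bytes(size_t nodes_visited, size_t object_size,
				   size_t cgroups)
{
	return nodes_visited * object_size * cgroups;
}
```

With 256 possible nodes, 2048-byte objects and 400 cgroup directories this gives 209715200 bytes, i.e. 200MB. Iterating over, say, 4 online nodes instead (a made-up count) would cut that to about 3MB.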
Signed-off-by: Nishanth Aravamudan <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Anton Blanchard <[email protected]>
Cc: Raghavendra K T <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The 'arg' argument to copy_thread() is only ever used when forking a new
kernel thread. Hence, rename it to 'kthread_arg' for clarity.
Signed-off-by: Alex Dowad <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This patch changes the name of the make variable TARGETS, to prevent it
from colliding with a value set by the user on the command line (as they
are recommended to do by tools/testing/selftests/README.txt).
Without this patch, "make -C tools/testing/selftests TARGETS=powerpc"
will fail.
Signed-off-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The goal is to verify vphn_unpack_associativity() parses VPHN numbers
correctly. We feed it with a variety of input values and compare with
expected results.
PAPR+ does not say much about VPHN parsing: I came up with a list of
tests that check many simple cases and some corner ones. I wouldn't
dare to say the list is exhaustive though.
Signed-off-by: Greg Kurz <[email protected]>
[mpe: Rework harness logic, rename to test-vphn, add -m64]
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
|
|
The current VPHN parsing logic has some flaws that this patch aims to fix:
1) when the value 0xffff is read, the value 0xffffffff gets added to the
output list and its element count isn't incremented. This is wrong.
According to PAPR+ the domain identifiers are packed into a sequence
terminated by the "reserved value of all ones". This means that 0xffff
is a stream terminator.
2) the combination of byteswaps and casts makes the code hard to read.
Let's parse the stream one 16-bit field at a time instead.
3) it is assumed that the hypercall returns 12 32-bit values packed into
6 64-bit registers. According to PAPR+, the domain identifiers may be
streamed as 16-bit values. Let's increase the number of expected numbers
to 24.
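A user-space sketch of parsing the stream one 16-bit field at a time might look like the following. It is not the patch itself: the names are hypothetical, the fields are assumed to be already in host byte order, and the rule that a set top bit marks a standalone 15-bit value (otherwise two fields form one 32-bit value) is an assumption about the PAPR+ encoding that this message does not spell out.

```c
#include <stddef.h>
#include <stdint.h>

#define FIELD_COUNT  24       /* 6 64-bit registers x 4 16-bit fields */
#define FIELD_UNUSED 0xffffu  /* "reserved value of all ones": terminator */
#define FIELD_MSB    0x8000u  /* assumed: marks a standalone 15-bit value */
#define FIELD_MASK   0x7fffu

/*
 * Unpack domain identifiers from 16-bit fields into out[], which must
 * have room for FIELD_COUNT entries. Returns how many were written.
 */
static size_t unpack_domains(const uint16_t fields[FIELD_COUNT], uint32_t *out)
{
	size_t count = 0;
	size_t i;
	int have_high = 0;	/* set while waiting for the low half */
	uint16_t high = 0;

	for (i = 0; i < FIELD_COUNT; i++) {
		uint16_t f = fields[i];

		if (have_high) {
			/* second half of a 32-bit value */
			out[count++] = ((uint32_t)high << 16) | f;
			have_high = 0;
		} else if (f == FIELD_UNUSED) {
			/* terminator: stop without emitting anything */
			break;
		} else if (f & FIELD_MSB) {
			/* standalone value, top bit stripped */
			out[count++] = f & FIELD_MASK;
		} else {
			/* first half of a 32-bit value */
			high = f;
			have_high = 1;
		}
	}
	return count;
}
```

The point of the sketch is that 0xffff ends the stream and never shows up in the output, and that the count is derived from what was actually emitted.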
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The goal behind this patch is to be able to write userland tests for the
VPHN parsing code.
Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The first argument to vphn_unpack_associativity() is a const long *, but the
parsing code actually expects __be64 values. Let's move the endian fixing
down for consistency.
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The number of values returned by the H_HOME_NODE_ASSOCIATIVITY h_call deserves
to be explicitly defined, for a better understanding of the code.
Signed-off-by: Greg Kurz <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
We set CONFIG_PPC_OF to always 'y' in commit 0a498d96a332
("powerpc: set CONFIG_PPC_OF=y always for ARCH=powerpc") nine years
ago, and arch/ppc has been gone for many years. The OF functionality
has also moved to a common place and is used by many archs, so it
makes no sense to keep such an option in the current kernel. Just
kill it.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
PPC_OF is a ppc-specific option which is used to mean that the
firmware device tree access functions are available. Since all the
ppc platforms have a device tree, it is always set to 'y' for ppc.
So it makes no sense to keep such an option in the current kernel.
Replace it with PPC.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
In the current kernel, we don't need to include these arch specific
header files for ppc.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The OF functionality has moved to a common place and is used by many
archs. So we don't need to include the ppc arch-specific header files
or depend on the PPC_OF option any more. This is preparation for
killing PPC_OF.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The OF functionality has moved to a common place and is used by many
archs. So we don't need to include the ppc arch-specific header files
or depend on the PPC_OF option any more. This is preparation for
killing PPC_OF.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The OF functionality has moved to a common place and is used by many
archs. So we don't need to depend on the PPC_OF option any more. This
is preparation for killing PPC_OF.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
PPC_OF is a ppc-specific option which is used to mean that the
firmware device tree access functions are available. Since all the
ppc platforms have a device tree, it is always set to 'y' for ppc.
So it makes no sense to keep such an option in the current kernel.
Replace it with PPC.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
PPC_OF is a ppc-specific option which is used to mean that the
firmware device tree access functions are available. Since all the
ppc platforms have a device tree, it is always set to 'y' for ppc.
So it makes no sense to keep such an option in the current kernel.
Replace it with PPC.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Acked-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The OF functionality has moved to a common place and is used by many
archs. So we don't need to include the ppc arch-specific header files
or depend on the PPC_OF option any more. This is preparation for
killing PPC_OF.
Signed-off-by: Kevin Hao <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Replace strncpy with strlcpy to avoid strings that lack a null terminator.
Also remove unnecessary magic numbers.
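For illustration, the behaviour being relied on can be sketched in user space. copy_terminated() below is a hypothetical stand-in with strlcpy-like semantics, not the kernel's implementation: it always terminates the destination (when size is non-zero) and returns the length of the source so truncation can be detected.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strlcpy-like helper: copy at most size - 1 bytes and
 * always write a terminating '\0'. Returns strlen(src), so a return
 * value >= size means the copy was truncated.
 */
static size_t copy_terminated(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size != 0) {
		size_t n = (len < size - 1) ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Passing sizeof(buf) as the size, rather than a literal, is the kind of change that gets rid of a magic number. By contrast, strncpy(buf, src, sizeof(buf)) leaves buf unterminated whenever src is at least sizeof(buf) bytes long.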
Signed-off-by: Rickard Strandqvist <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The recent change to remove the vrX defines exposed the fact that we are
building the copyloops tests without altivec enabled. It depends on the
toolchain as to whether altivec is on by default or not, so it only
breaks on some toolchains. But we should always enable it.
Fixes: c2ce6f9f3dc0 ("powerpc: Change vrX register defines to vX to match gcc and glibc")
Signed-off-by: Michael Ellerman <[email protected]>
|
|
If we of_find_node_by_name() then we must of_node_put() too.
Signed-off-by: Phil Carmody <[email protected]>
Signed-off-by: Aaro Koskinen <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
Cleanup was not in the reverse order from the set-up, so not all
the gotos made sense, and also it was being avoided completely upon
failure of init_pmu().
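The usual shape of such error handling, sketched with made-up steps A, B and C rather than the driver's real ones, is a ladder of labels that undoes the set-up in reverse order, where each failure jumps to the label that undoes only what has already succeeded.

```c
#include <stddef.h>

/* Records which hypothetical undo steps ran, in order, e.g. "ba". */
static char undo_log[4];
static size_t undo_len;

static void undo(char step)
{
	if (undo_len < sizeof(undo_log) - 1) {
		undo_log[undo_len++] = step;
		undo_log[undo_len] = '\0';
	}
}

/* fail_at selects which set-up step (1..3) fails; 0 means none does. */
static int init_all(int fail_at)
{
	int err = 0;

	undo_len = 0;
	undo_log[0] = '\0';

	if (fail_at == 1) {	/* step A failed: nothing to undo yet */
		err = -1;
		goto out;
	}
	if (fail_at == 2) {	/* step B failed: undo A */
		err = -1;
		goto undo_a;
	}
	if (fail_at == 3) {	/* step C failed: undo B, then A */
		err = -1;
		goto undo_b;
	}
	return 0;

undo_b:
	undo('b');
undo_a:
	undo('a');
out:
	return err;
}
```

A failure in the last step unwinds B and then A, and no failure path skips the cleanup of steps that did succeed.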
Signed-off-by: Phil Carmody <[email protected]>
Signed-off-by: Aaro Koskinen <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
These functions are only used from one place each. If the cacheable_*
versions really are more efficient, then those changes should be
migrated into the common code instead.
NOTE: The old routines are just flat buggy on kernels that support
hardware with different cacheline sizes.
Signed-off-by: Kyle Moffett <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch removes the unused file eeh-ioda.c and updates the makefile
accordingly. In addition, the definition of "struct pnv_eeh_ops" and
its instances are all removed. With this, the chip layer of the EEH
implementation for the PowerNV platform is removed completely.
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation reset() and merges its logic to
eeh_ops::reset().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation next_error() and merges its
logic to eeh_ops::next_error().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation get_state() and merges its logic
to eeh_ops::get_state().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation set_option() and merges its
logic to eeh_ops::set_option().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation configure_bridge() and merges
its logic to eeh_ops::configure_bridge().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB operation get_log() and merges its logic to
eeh_ops::get_log().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation post_init() and merges its logic
to eeh_ops::post_init().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch drops PHB EEH operation err_inject() and merges its logic
to eeh_ops::err_inject().
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch shortens the names of EEH functions in powernv-eeh.c. No
logic change is introduced by this patch.
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The patch fixes the comments about ppc_md.pcibios_fixup(), which
should be called after allocating resources.
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
Function pcibios_set_pcie_reset_state() may be called by
pci_reset_function(), on which the VFIO infrastructure depends to
issue a reset. pcibios_set_pcie_reset_state() issues the reset
on the parent PE of the indicated PCI device. The reset causes
loss of state on all PCI devices except the one passed as the
argument to pcibios_set_pcie_reset_state(). Also, sideband
MMIO access from the guest while the reset is being issued would
cause an unexpected EEH error.
To address these two issues, the patch applies the following
enhancements to pcibios_set_pcie_reset_state():
* For all PCI devices except the indicated one, save their
state prior to reset and restore state after that.
* Explicitly freeze PE prior to reset and unfreeze it after
that, in order to avoid unexpected EEH error.
Tested-by: Priya M. A <[email protected]>
Signed-off-by: Gavin Shan <[email protected]>
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
|
|
The flush_tlb hook in cpu_spec was introduced as a generic function hook
to invalidate TLBs. But the current implementation of flush_tlb hook
takes IS (invalidation selector) as an argument which is architecture
dependent. Hence, it is not right to have a generic routine where the caller
has to pass a non-generic argument.
This patch fixes this and makes the flush_tlb hook a high-level API.
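As a rough sketch of the idea (all names and values below are hypothetical, not the kernel's actual definitions), a generic caller passes an architecture-neutral scope and the CPU-specific hook picks the invalidation selector internally:

```c
/* Hypothetical, architecture-neutral request passed by generic callers. */
enum tlb_flush_scope {
	TLB_FLUSH_GLOBAL,
	TLB_FLUSH_LPID,
};

/* Placeholder selector encodings; real IS values are CPU specific. */
#define DEMO_IS_GLOBAL 1u
#define DEMO_IS_LPID   2u

/* Remembers which selector the backend chose, for demonstration only. */
static unsigned int demo_last_is;

/* CPU-specific backend: the only place that knows about IS values. */
static void demo_flush_tlb(enum tlb_flush_scope scope)
{
	switch (scope) {
	case TLB_FLUSH_GLOBAL:
		demo_last_is = DEMO_IS_GLOBAL;
		break;
	case TLB_FLUSH_LPID:
		demo_last_is = DEMO_IS_LPID;
		break;
	}
}

/* Hypothetical per-CPU descriptor carrying the hook. */
struct demo_cpu_spec {
	void (*flush_tlb)(enum tlb_flush_scope scope);
};

static const struct demo_cpu_spec demo_cpu = {
	.flush_tlb = demo_flush_tlb,
};
```

A caller would then write demo_cpu.flush_tlb(TLB_FLUSH_GLOBAL) without knowing how the selector is encoded on a given CPU.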
Reported-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
We use r6 and r7 for epapr boot, but the current pre-C init will clobber
both of these.
This change does a simple replacement, of r6 -> r12 and r7 -> r13, so
that we hit platform init with these registers intact.
Signed-off-by: Jeremy Kerr <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Currently, a 64-bit little-endian zImage.epapr won't boot in epapr mode,
as we never return from platform_init.
Before entering C, we initialise our stack by setting r1 16 bytes below
the end of the _bss_stack:
stwu r0,-16(r1) /* establish a stack frame */
However, the called function will save the caller's lr in the caller's
frame's lr save area, at -16(r1) to -32(r1).
This means that writes to the fdt variable will corrupt the saved link
register:
0000000020c06018 l O .bss 0000000000001000 _bss_stack
0000000020c07018 l O .bss 0000000000000008 fdt
We'll need at least 32 bytes in the initial stack frame, to handle the
LR save area. We bump this to 112 bytes, as that'll be the max required
by ABIv1.
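The overlap can be checked with the addresses from the symbol table above. This sketch assumes the LR save doubleword sits 16 bytes above the stack pointer in the caller's frame, as in the 64-bit PowerPC ELF ABIs; the helper names are made up.

```c
#include <stdint.h>

/* Assumed offset of the LR save doubleword within the caller's frame. */
#define LR_SAVE_OFFSET 16

/* First address past the end of the stack area. */
static uint64_t stack_top(uint64_t base, uint64_t size)
{
	return base + size;
}

/*
 * Address the callee writes the saved LR to, when the initial frame is
 * carved out frame_size bytes below the top of the stack.
 */
static uint64_t lr_save_slot(uint64_t top, uint64_t frame_size)
{
	return top - frame_size + LR_SAVE_OFFSET;
}
```

With _bss_stack at 0x20c06018 and a size of 0x1000, the top of the stack is 0x20c07018, which is exactly the address of fdt. A 16-byte initial frame therefore puts the LR save slot on top of fdt, while a 112-byte frame puts it at 0x20c06fb8, safely inside the stack.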
Thanks to Alistair Popple for debugging help.
Signed-off-by: Jeremy Kerr <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
We'll likely be entering the zImage.epapr as BE, so include the pseries
implementation of _zimage_start, which adds the endian fixup magic.
Although the endian fixup won't work on Book III-E machines starting LE,
the current entry point doesn't support LE anyway, so we shouldn't be
breaking anything.
Signed-off-by: Jeremy Kerr <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
For epapr-style boot, we may be little-endian. This change implements
the proper conversion for fdt*_to_cpu and cpu_to_fdt*. We also need the
full cpu_to_* and *_to_cpu macros for this.
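For illustration, a conversion that behaves the same on big- and little-endian hosts can be written byte by byte. This is a sketch with hypothetical helper names, not the macros the patch adds; the device tree stores its values big-endian.

```c
#include <stdint.h>

/* Read a big-endian 32-bit value from memory, whatever the host order. */
static uint32_t be32_load(const void *p)
{
	const unsigned char *b = p;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Store a host-order value to memory in big-endian byte order. */
static void be32_store(void *p, uint32_t v)
{
	unsigned char *b = p;

	b[0] = (unsigned char)(v >> 24);
	b[1] = (unsigned char)(v >> 16);
	b[2] = (unsigned char)(v >> 8);
	b[3] = (unsigned char)v;
}
```

For example, the four bytes d0 0d fe ed at the start of a flattened device tree load as the FDT magic 0xd00dfeed on either kind of host.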
Signed-off-by: Jeremy Kerr <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Now that the wrapper supports 64-bit builds, we see warnings when
attempting to cast pointers to int. Use unsigned long instead.
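A minimal sketch of the pattern, using a hypothetical align_up() helper: on an LP64 target such as 64-bit Linux, int is 32 bits wide while pointers and unsigned long are 64 bits wide, so address arithmetic has to go through unsigned long to avoid truncation.

```c
/*
 * Hypothetical helper: round a pointer up to the next multiple of
 * align, which must be a power of two. The pointer is converted to
 * unsigned long (pointer-sized on LP64), never to int.
 */
static void *align_up(void *p, unsigned long align)
{
	unsigned long addr = (unsigned long)p;

	return (void *)((addr + align - 1) & ~(align - 1));
}
```

The same cast written as (int)p would drop the upper 32 bits of the address on a 64-bit build, which is what the compiler warning points at.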
Signed-off-by: Jeremy Kerr <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Drop unused fsl_mpic_primary_get_version(), mpic_set_clk_ratio(),
mpic_set_serial_int().
+ fsl_mpic_primary_get_version() is just a safe wrapper around
fsl_mpic_get_version() for SMP configurations. While the latter is
called explicitly for handling PIC initialization and setting up error
interrupt vector depending on PIC hardware version, the former isn't
used for anything.
+ As for mpic_set_clk_ratio() and mpic_set_serial_int(), they both are
almost nine years old[1] but still have no chance to be called even from
out-of-tree modules because they both are __init and of course aren't
exported.
[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2006-June/023867.html
Signed-off-by: Arseny Solokha <[email protected]>
Cc: [email protected]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Drop ucc_slow_poll_transmitter_now() which has had no users since its
inception in 2007 in commit 986585385131 ("[POWERPC] Add QUICC
Engine (QE) infrastructure").
Signed-off-by: Arseny Solokha <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Drop planetcore_set_serial_speed() which has had no users since its
inception in commit fec6047047fd ("[POWERPC] bootwrapper: Add PlanetCore
firmware support") in 2007.
Signed-off-by: Arseny Solokha <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This removes definitions in opal-api.h that are completely unused in
Linux.
For each of these I see three possibilities, 1) we *should* be using
them in Linux and patches will arrive to do that, 2) they are not used
but should stay in the header to document the API for some important
reason, 3) they are not used and needn't be part of the API.
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Stewart Smith <[email protected]>
|
|
This commit gets opal-api.h to mostly match the version in Skiboot as of
commit ea7d806ab0ba.
The exceptions are things which are not (currently) used in Linux.
Most of this is just whitespace and a few things moving around. I think
the diff is readable.
Also OpalMessageType became opal_msg_type, requiring a change in the
Linux code.
Finally Skiboot and Linux disagree on CAPI vs CXL, because CAPI means
something else in Linux. To handle that we just point the Linux wrapper,
which is named "cxl" to the OPAL token OPAL_PCI_SET_PHB_CAPI_MODE.
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Stewart Smith <[email protected]>
|
|
We'd like to get to the stage where the OPAL API is defined in a header
that is identical between Linux and Skiboot.
As step one, split the bits that actually define the API into
opal-api.h. The Linux specific parts stay in opal.h.
Signed-off-by: Michael Ellerman <[email protected]>
Acked-by: Stewart Smith <[email protected]>
|
|
The $(image-n) variable will never exist, because unset Kconfig options
are '' and not 'n'.
Signed-off-by: Michal Marek <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The kfree() function tests whether its argument is NULL and then returns
immediately. Thus the test around the call is not needed.
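The same simplification can be shown in user space, where free() is likewise guaranteed by the C standard to do nothing when passed NULL. The struct and function names here are made up for illustration.

```c
#include <stdlib.h>

/* Hypothetical object owning one optional heap allocation. */
struct widget {
	char *name;
};

static void widget_release(struct widget *w)
{
	/* No "if (w->name)" guard needed: free(NULL) is a no-op. */
	free(w->name);
	w->name = NULL;
}
```

Calling widget_release() on a widget whose name was never allocated is therefore safe, just as an unguarded kfree() is in the kernel.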
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Not all OPAL platforms support resending system dumps, so check
that current firmware supports it first. Otherwise we get firmware
complaining:
"OPAL: Called with bad token 91 !"
Signed-off-by: Stewart Smith <[email protected]>
Acked-by: Vasant Hegde <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Otherwise firmware complains: "OPAL: Called with bad token 74 !"
as not all OPAL systems have the ability to resend error logs.
Signed-off-by: Stewart Smith <[email protected]>
Acked-by: Vasant Hegde <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|