MSIs should be fully managed by the PCI and IRQ subsystems now.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The PowerNV and pSeries platforms now have support for both the XICS
and XIVE IRQ domains.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
PHB3s need an extra OPAL call to EOI the interrupt. The call takes an
OPAL HW IRQ number but it is translated into a vector number in OPAL.
Here, we directly use the vector number of the in-the-middle "PNV-MSI"
domain instead of grabbing the OPAL HW IRQ number in the XICS parent
domain.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Passthrough PCI MSI interrupts are detected in KVM with a check on a
specific EOI handler (P8) or on XIVE (P9). We can now check the
PCI-MSI IRQ chip which is cleaner.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This is very similar to the MSI domains of the pSeries platform. The
MSI allocator is directly handled under the Linux PHB in the
in-the-middle "PNV-MSI" domain.
Only the XIVE (P9/P10) parent domain is supported for now. Support for
XICS will come later.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
It will be used as a 'compose_msg' handler of the MSI domain introduced
later.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Simply allocate or release the MSI domains when a PHB is inserted in
or removed from the machine.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The MSI domain clears the IRQ with msi_domain_free(), which calls
irq_domain_free_irqs_top(), which clears the handler data. This is a
problem for the XIVE controller since we need to unmap MMIO pages and
free a specific XIVE structure.
The 'msi_free()' handler is called before irq_domain_free_irqs_top()
when the handler data is still available. Use that to clear the XIVE
controller data.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The RTAS firmware cannot disable one MSI at a time. It's all or
nothing. We need a custom free IRQ handler for that.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Two IRQ domains are added on top of the default machine IRQ domain.
First, the top level "pSeries-PCI-MSI" domain deals with the MSI
specificities. In this domain, the HW IRQ numbers are generated by the
PCI MSI layer; they compose a unique ID for an MSI source from the PCI
device identifier and the MSI vector number.
These numbers can be quite large on a pSeries machine running under
the IBM Hypervisor and /sys/kernel/irq/ and /proc/interrupts will
require small fixes to show them correctly.
The second domain is the in-the-middle "pSeries-MSI" domain, which acts as
a proxy between the PCI MSI subsystem and the machine IRQ subsystem.
Such a domain usually allocates the MSI vector numbers but, on pSeries
machines, this is done by the RTAS FW, and RTAS returns IRQ numbers in the
IRQ number space of the machine. This is why the in-the-middle "pSeries-MSI"
domain has the same HW IRQ numbers as its parent domain.
Only the XIVE (P9/P10) parent domain is supported for now. We still
need to add support for IRQ domain hierarchy under XICS.
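To illustrate how a HW IRQ number in the top level domain can combine a device identifier with a vector number, here is a small userspace sketch. The bit layout (11 bits for the vector index, 16 bits for the bus/devfn identifier, the PCI domain number above that) and the helper name `msi_hwirq_sketch()` are assumptions made for this example; they are not taken from this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, for illustration only:
 *   bits  0..10  MSI vector index within the device
 *   bits 11..26  PCI device identifier (bus << 8 | devfn)
 *   bits 27..    PCI domain number
 */
static uint64_t msi_hwirq_sketch(uint32_t domain, uint8_t bus, uint8_t devfn,
                                 uint16_t vector)
{
    assert(vector < (1u << 11));    /* the vector must fit in 11 bits */

    uint64_t dev_id = ((uint64_t)bus << 8) | devfn;

    return (uint64_t)vector | (dev_id << 11) | ((uint64_t)domain << 27);
}
```

With a large PCI domain number the result no longer fits in 32 bits, which is consistent with the remark above about large numbers needing small fixes in /sys/kernel/irq/ and /proc/interrupts.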
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This splits the routine setting the MSIs in two parts: allocation of
MSIs for the PCI device at the FW level (RTAS) and the actual mapping
and activation of the IRQs.
rtas_prepare_msi_irqs() will serve as a handler for the PCI MSI domain.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
It will help to size the PCI MSI domain.
Signed-off-by: Cédric Le Goater <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().
Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
When a CPU is hot added, its CPU ids are taken from the lowest possible
set in the available mask. If that set of values was previously used
for a CPU attached to a different node, it appears to an application as
if these CPUs have migrated from one node to another node, which is not
expected.
To prevent this, the CPU ids used for each node need to be recorded and
must not be reused on another node. However, to prevent CPU hot plug
from failing when a node is starved of CPU ids, the capability to
reuse other nodes’ free CPU ids is kept. A warning is displayed in such
a case to warn the user.
A new CPU bit mask (node_recorded_ids_map) is introduced for each
possible node. It is populated with the CPUs onlined at boot time, and
then when a CPU is hot plugged to a node. The bits in that mask remain
set when the CPU is hot unplugged, to record that these CPU ids have been
used for this node.
If no id set was found, a retry is made without removing the ids used on
the other nodes to try reusing them. This is the way ids have been
allocated prior to this patch.
The effect of this patch can be seen by removing and adding CPUs using
the Qemu monitor. In the following case, the first CPU of node 2
is removed, then the first one of node 1 is removed too. Later,
the first CPU of node 2 is added back. Without this patch, the
kernel numbers these CPUs using the first CPU ids available, which
are the ones freed when removing the first CPU of node 1. This
moves the CPU ids 16-23 from node 1 to node 2. With
the patch applied, the CPU ids 32-39 are used since they are the lowest
free ones which have not been used on another node.
At boot time:
[root@vm40 ~]# numactl -H | grep cpus
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 1 cpus: 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
Vanilla kernel, after the CPU hot unplug/plug operations:
[root@vm40 ~]# numactl -H | grep cpus
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 1 cpus: 24 25 26 27 28 29 30 31
node 2 cpus: 16 17 18 19 20 21 22 23 40 41 42 43 44 45 46 47
Patched kernel, after the CPU hot unplug/plug operations:
[root@vm40 ~]# numactl -H | grep cpus
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 1 cpus: 24 25 26 27 28 29 30 31
node 2 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
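The allocation policy described above can be sketched in userspace C. This is only a model under stated assumptions: CPU ids are limited to 64 so that a `uint64_t` can stand in for a cpumask, and `MAX_NODES`, `node_recorded_ids`, `range_mask()`, `find_id_range()` and `alloc_cpu_ids()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 4     /* hypothetical, kept small for the example */

/* One bit per CPU id ever used on each node (at most 64 ids here). */
static uint64_t node_recorded_ids[MAX_NODES];

/* Mask of 'count' contiguous bits starting at bit 'start'. */
static uint64_t range_mask(int start, int count)
{
    uint64_t m = (count >= 64) ? ~0ull : ((1ull << count) - 1);

    return m << start;
}

/* Lowest start of 'count' contiguous set bits in 'candidates', or -1. */
static int find_id_range(uint64_t candidates, int count)
{
    for (int start = 0; start + count <= 64; start++) {
        uint64_t r = range_mask(start, count);

        if ((candidates & r) == r)
            return start;
    }
    return -1;
}

/*
 * Pick 'count' contiguous ids for 'node' out of 'free_ids'.
 * First pass: skip ids recorded for any other node.
 * Second pass (fallback): allow reusing them; *reused is set to 1,
 * which is where the patch described above emits its warning.
 */
static int alloc_cpu_ids(int node, uint64_t free_ids, int count, int *reused)
{
    assert(node >= 0 && node < MAX_NODES && count > 0 && count <= 64);

    uint64_t others = 0;

    for (int n = 0; n < MAX_NODES; n++)
        if (n != node)
            others |= node_recorded_ids[n];

    *reused = 0;
    int start = find_id_range(free_ids & ~others, count);

    if (start < 0) {
        *reused = 1;
        start = find_id_range(free_ids, count);
    }
    if (start >= 0)
        node_recorded_ids[node] |= range_mask(start, count);
    return start;
}
```

Seeding the recorded masks with the boot-time layout shown above (ids 0-15, 16-31 and 32-47 for nodes 0, 1 and 2) and freeing ids 16-23 and 32-39, a request for 8 ids on node 2 returns 32 in this model, whereas a plain lowest-free search returns 16.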
Signed-off-by: Laurent Dufour <[email protected]>
Reviewed-by: Nathan Lynch <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
After an LPM, the device tree node ibm,dynamic-reconfiguration-memory may be
updated by the hypervisor if the NUMA topology of the LPAR's memory has
changed.
This is handled by the kernel, but the memory's node is not updated because,
from the Linux kernel's point of view, there is no way to move a memory block
between nodes.
If a memory block is later added or removed, drmem_update_dt() is called
and overwrites the DT node ibm,dynamic-reconfiguration-memory to
match the added or removed LMB. But the LMB's associativity node has not
been updated after the DT node update, and thus the node is overwritten with
Linux's topology instead of the hypervisor's.
Introduce a hook, called when the ibm,dynamic-reconfiguration-memory node is
updated, to force an update of the LMB's associativity. However, ignore the
call to that hook when the update has been triggered by drmem_update_dt(),
because in that case the LMB tree has been used to set the DT property
and thus it doesn't need to be updated back. Since drmem_update_dt() is
called under the protection of the device_hotplug_lock and the hook is
called in the same context, use a simple boolean variable to detect that
call.
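The boolean guard described above can be modelled with a short userspace sketch. All names here (`in_drmem_update`, `lmb_refresh_count`, `drmem_node_updated_hook()`, `drmem_update_dt_sketch()`) are hypothetical stand-ins, and the lock is only implied: the plain boolean is sufficient solely because the writer and the hook run in the same locked context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; names are illustrative. */
static bool in_drmem_update;    /* true while we rewrite the DT node ourselves */
static int lmb_refresh_count;   /* how many times LMB associativity was refreshed */

/* Hook invoked whenever the DT node is updated. */
static void drmem_node_updated_hook(void)
{
    if (in_drmem_update)
        return;                 /* our own write: the LMB tree was the source */
    lmb_refresh_count++;        /* external update: refresh LMB associativity */
}

/* Kernel-side rewrite of the DT node after an LMB add/remove. */
static void drmem_update_dt_sketch(void)
{
    assert(!in_drmem_update);   /* no nesting: callers hold a single lock */
    in_drmem_update = true;
    drmem_node_updated_hook();  /* writing the property triggers the hook */
    in_drmem_update = false;
}
```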
Signed-off-by: Laurent Dufour <[email protected]>
Reviewed-by: Nathan Lynch <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The .map_sg() op now expects an error code instead of zero on failure.
Propagate the error up if vio_dma_iommu_map_sg() fails.
ppc_iommu_map_sg() may fail either because of iommu_range_alloc() or
because of tbl->it_ops->set(). The former only supports returning an
error with DMA_MAPPING_ERROR and an examination of the latter indicates
that it may return arch-specific errors (for example,
tce_buildmulti_pSeriesLP()). Hence, coalesce all of those errors into
-EIO, per the documentation on dma_map_sgtable().
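A minimal sketch of the error coalescing described above, in userspace C. The helper `map_sg_result_sketch()` and its convention (a positive count on success, zero or any negative arch-specific code on failure) are assumptions for illustration, not the kernel's actual function.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical wrapper: 'backend_ret' is what a lower-level mapping
 * routine returned, i.e. the number of mapped entries (> 0) on success,
 * or 0 / a negative arch-specific code on failure.
 */
static int map_sg_result_sketch(int backend_ret, int nents)
{
    assert(nents > 0 && backend_ret <= nents);

    if (backend_ret <= 0)
        return -EIO;        /* every failure cause collapses to -EIO */
    return backend_ret;     /* success: number of mapped entries */
}
```

The caller then only has to distinguish a negative error code from a positive entry count, instead of treating zero as the failure marker.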
Signed-off-by: Martin Oliveira <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Geoff Levand <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
We need the driver core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
After LPM, when migrating from a system with security mitigation enabled
to a system with mitigation disabled, the security flavor exposed in
/proc is not correctly set back to 0.
Do not assume the value of the security flavor is set to 0 when entering
init_cpu_char_feature_flags(), so that, when called after an LPM, the value
is set correctly even if the mitigations are not turned off.
Fixes: 6ce56e1ac380 ("powerpc/pseries: export LPAR security flavor in lparcfg")
Cc: [email protected] # v5.13+
Signed-off-by: Laurent Dufour <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Don't use r30 in VDSO code, to avoid breaking existing Go lang
programs.
- Change an export symbol to allow non-GPL modules to use spinlocks
again.
Thanks to Paul Menzel, and Srikar Dronamraju.
* tag 'powerpc-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/vdso: Don't use r30 to avoid breaking Go lang
powerpc/pseries: Fix regression while building external modules
|
|
Commit ad6c00283163 ("swiotlb: Free tbl memory in swiotlb_exit()")
introduced a set_memory_encrypted() call to swiotlb_exit() so that the
buffer pages are returned to an encrypted state prior to being freed.
Sachin reports that this leads to the following crash on a Power server:
[ 0.010799] software IO TLB: tearing down default memory pool
[ 0.010805] ------------[ cut here ]------------
[ 0.010808] kernel BUG at arch/powerpc/kernel/interrupt.c:98!
Nick spotted that this is because set_memory_encrypted() is issuing an
ultracall which doesn't exist for the processor, and should therefore
be gated by mem_encrypt_active() to mirror the x86 implementation.
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Claire Chang <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Robin Murphy <[email protected]>
Fixes: ad6c00283163 ("swiotlb: Free tbl memory in swiotlb_exit()")
Suggested-by: Nicholas Piggin <[email protected]>
Reported-by: Sachin Sant <[email protected]>
Tested-by: Sachin Sant <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
|
|
Since commit c9f3401313a5 ("powerpc: Always enable queued spinlocks for
64s, disable for others"), CONFIG_PPC_QUEUED_SPINLOCKS is always
enabled on ppc64le, and external modules that use spinlock APIs fail
to build:
ERROR: modpost: GPL-incompatible module XXX.ko uses GPL-only symbol 'shared_processor'
Before the above commit, modules were able to build without any
issues. This problem is also not seen on other architectures. It
can be worked around by enabling CONFIG_UNINLINE_SPIN_UNLOCK in
the config. However, CONFIG_UNINLINE_SPIN_UNLOCK is not enabled by
default and is only enabled under certain conditions, such as when
CONFIG_DEBUG_SPINLOCKS is set in the kernel config.
    #include <linux/module.h>

    spinlock_t spLock;

    static int __init spinlock_test_init(void)
    {
            spin_lock_init(&spLock);
            spin_lock(&spLock);
            spin_unlock(&spLock);
            return 0;
    }

    static void __exit spinlock_test_exit(void)
    {
            printk("spinlock_test unloaded\n");
    }

    module_init(spinlock_test_init);
    module_exit(spinlock_test_exit);

    MODULE_DESCRIPTION ("spinlock_test");
    MODULE_LICENSE ("non-GPL");
    MODULE_AUTHOR ("Srikar Dronamraju");
Given that spin locks are one of the basic facilities for module code,
this effectively makes it impossible to build/load almost any non-GPL
modules on ppc64le.
This was first reported at https://github.com/openzfs/zfs/issues/11172
Currently shared_processor is exported as a GPL-only symbol.
Fix this for parity with other architectures by exposing
shared_processor to non-GPL modules too.
Fixes: 14c73bd344da ("powerpc/vcpu: Assume dedicated processors as non-preempt")
Cc: [email protected] # v5.5+
Reported-by: [email protected]
Signed-off-by: Srikar Dronamraju <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
We need the driver-core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
As kprobe does not handle events happening in real mode, blacklist the
functions that only get called in real mode or in the kexec sequence with
the MMU turned off.
Signed-off-by: Hari Bathini <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/162626687834.155313.4692863392927831843.stgit@hbathini-workstation.ibm.com
|
|
The driver core ignores the return value of this callback because there
is little it can do when a device disappears.
This is the final bit of a long-lasting cleanup quest where several
buses were converted to also return void from their remove callback.
Additionally, some resource leaks were fixed that were caused by drivers
returning an error code in the expectation that the driver won't go
away.
With struct bus_type::remove returning void, newly implemented buses can
no longer return an error code that would be ignored, and so no longer
raise wrong expectations for driver authors.
Reviewed-by: Tom Rix <[email protected]> (For fpga)
Reviewed-by: Mathieu Poirier <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]> (For drivers/s390 and drivers/vfio)
Acked-by: Russell King (Oracle) <[email protected]> (For ARM, Amba and related parts)
Acked-by: Mark Brown <[email protected]>
Acked-by: Chen-Yu Tsai <[email protected]> (for sunxi-rsb)
Acked-by: Pali Rohár <[email protected]>
Acked-by: Mauro Carvalho Chehab <[email protected]> (for media)
Acked-by: Hans de Goede <[email protected]> (For drivers/platform)
Acked-by: Alexandre Belloni <[email protected]>
Acked-By: Vinod Koul <[email protected]>
Acked-by: Juergen Gross <[email protected]> (For xen)
Acked-by: Lee Jones <[email protected]> (For mfd)
Acked-by: Johannes Thumshirn <[email protected]> (For mcb)
Acked-by: Johan Hovold <[email protected]>
Acked-by: Srinivas Kandagatla <[email protected]> (For slimbus)
Acked-by: Kirti Wankhede <[email protected]> (For vfio)
Acked-by: Maximilian Luz <[email protected]>
Acked-by: Heikki Krogerus <[email protected]> (For ulpi and typec)
Acked-by: Samuel Iglesias Gonsálvez <[email protected]> (For ipack)
Acked-by: Geoff Levand <[email protected]> (For ps3)
Acked-by: Yehezkel Bernat <[email protected]> (For thunderbolt)
Acked-by: Alexander Shishkin <[email protected]> (For intel_th)
Acked-by: Dominik Brodowski <[email protected]> (For pcmcia)
Acked-by: Rafael J. Wysocki <[email protected]> (For ACPI)
Acked-by: Bjorn Andersson <[email protected]> (rpmsg and apr)
Acked-by: Srinivas Pandruvada <[email protected]> (For intel-ish-hid)
Acked-by: Dan Williams <[email protected]> (For CXL, DAX, and NVDIMM)
Acked-by: William Breathitt Gray <[email protected]> (For isa)
Acked-by: Stefan Richter <[email protected]> (For firewire)
Acked-by: Benjamin Tissoires <[email protected]> (For hid)
Acked-by: Thorsten Scherer <[email protected]> (For siox)
Acked-by: Sven Van Asbroeck <[email protected]> (For anybuss)
Acked-by: Ulf Hansson <[email protected]> (For MMC)
Acked-by: Wolfram Sang <[email protected]> # for I2C
Acked-by: Sudeep Holla <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Acked-by: Dmitry Torokhov <[email protected]>
Acked-by: Finn Thain <[email protected]>
Signed-off-by: Uwe Kleine-König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Fix the following fallthrough warning:
arch/powerpc/platforms/pasemi/idle.c:45:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60efbf18.d9n6eXv275OJcc7T%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning:
arch/powerpc/platforms/powermac/smp.c:149:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60ef0750.I8J+C6KAtb0xVOAa%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix crashes on 64-bit Book3E due to use of Book3S only mtmsrd
instruction.
Fix "scheduling while atomic" warnings at boot due to preempt count
underflow.
Two commits fixing our handling of BPF atomic instructions.
Fix error handling in xive when allocating an IPI.
Fix lockup on kernel exec fault on 603.
Thanks to Bharata B Rao, Cédric Le Goater, Christian Zigotzky,
Christophe Leroy, Guenter Roeck, Jiri Olsa, Naveen N. Rao, Nicholas
Piggin, and Valentin Schneider"
* tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
powerpc/64e: Fix system call illegal mtmsrd instruction
powerpc/xive: Fix error handling when allocating an IPI
powerpc/bpf: Reject atomic ops in ppc32 JIT
powerpc/bpf: Fix detecting BPF atomic instructions
powerpc/mm: Fix lockup on kernel exec fault
|
|
mremap HAVE_MOVE_PMD/PUD optimization time comparison for 1GB region:
1GB mremap - Source PTE-aligned, Destination PTE-aligned
mremap time: 2292772ns
1GB mremap - Source PMD-aligned, Destination PMD-aligned
mremap time: 1158928ns
1GB mremap - Source PUD-aligned, Destination PUD-aligned
mremap time: 63886ns
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Powerpc currently resets a CPU's idle task preempt_count to 0 before
said task starts executing the secondary startup routine (and becomes an
idle task proper).
This conflicts with commit f1a0a376ca0c ("sched/core: Initialize the
idle task with preemption disabled"), which initializes all of the idle
tasks' preempt_count to PREEMPT_DISABLED during smp_init(). Note that
the powerpc reset was already superfluous before said commit, as back
then the hotplug machinery would invoke init_idle() via
idle_thread_get(), which would have already reset the CPU's idle task's
preempt_count to PREEMPT_ENABLED.
Get rid of this preempt_count write.
Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled")
Reported-by: Bharata B Rao <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Tested-by: Bharata B Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- A big series refactoring parts of our KVM code, and converting some
to C.
- Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on
some CPUs.
- Support for the Microwatt soft-core.
- Optimisations to our interrupt return path on 64-bit.
- Support for userspace access to the NX GZIP accelerator on PowerVM on
Power10.
- Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
- Other smaller features, fixes & cleanups.
Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe
Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand,
Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe,
Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika
Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar
Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei.
* tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits)
powerpc: Only build restart_table.c for 64s
powerpc/64s: move ret_from_fork etc above __end_soft_masked
powerpc/64s/interrupt: clean up interrupt return labels
powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols
powerpc/64: enable MSR[EE] in irq replay pt_regs
powerpc/64s/interrupt: preserve regs->softe for NMI interrupts
powerpc/64s: add a table of implicit soft-masked addresses
powerpc/64e: remove implicit soft-masking and interrupt exit restart logic
powerpc/64e: fix CONFIG_RELOCATABLE build warnings
powerpc/64s: fix hash page fault interrupt handler
powerpc/4xx: Fix setup_kuep() on SMP
powerpc/32s: Fix setup_{kuap/kuep}() on SMP
powerpc/interrupt: Use names in check_return_regs_valid()
powerpc/interrupt: Also use exit_must_hard_disable() on PPC32
powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE
powerpc/ptrace: Refactor regs_set_return_{msr/ip}
powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}
powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
powerpc/pseries/vas: Include irqdomain.h
powerpc: mark local variables around longjmp as volatile
...
|
|
Merge more updates from Andrew Morton:
"190 patches.
Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
signals, exec, kcov, selftests, compress/decompress, and ipc"
* emailed patches from Andrew Morton <[email protected]>: (190 commits)
ipc/util.c: use binary search for max_idx
ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
ipc: use kmalloc for msg_queue and shmid_kernel
ipc sem: use kvmalloc for sem_undo allocation
lib/decompressors: remove set but not used variabled 'level'
selftests/vm/pkeys: exercise x86 XSAVE init state
selftests/vm/pkeys: refill shadow register after implicit kernel write
selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
kcov: add __no_sanitize_coverage to fix noinstr for all architectures
exec: remove checks in __register_bimfmt()
x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
hfsplus: report create_date to kstat.btime
hfsplus: remove unnecessary oom message
nilfs2: remove redundant continue statement in a while-loop
kprobes: remove duplicated strong free_insn_page in x86 and s390
init: print out unknown kernel parameters
checkpatch: do not complain about positive return values starting with EPOLL
checkpatch: improve the indented label test
checkpatch: scripts/spdxcheck.py now requires python3
...
|
|
ZONE_[DMA|DMA32] configs have duplicate definitions on platforms that
subscribe to them. Instead, just make them generic options which can be
selected on applicable platforms.
Also, only the x86 and arm64 architectures can enable both ZONE_DMA and
ZONE_DMA32 if EXPERT is set, so add ARCH_HAS_ZONE_DMA_SET to make the DMA
zones configurable and visible on those two architectures.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kefeng Wang <[email protected]>
Acked-by: Catalin Marinas <[email protected]> [arm64]
Acked-by: Geert Uytterhoeven <[email protected]> [m68k]
Acked-by: Mike Rapoport <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]> [RISC-V]
Acked-by: Michal Simek <[email protected]> [microblaze]
Acked-by: Michael Ellerman <[email protected]> [powerpc]
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Russell King <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt subsystem:
Core changes:
- Cleanup and simplification of common code to invoke the low level
interrupt flow handlers when this invocation requires irqdomain
resolution. Add the necessary core infrastructure.
- Provide a proper interface for modular PMU drivers to set the
interrupt affinity.
- Add a request flag which allows excluding interrupts from spurious
interrupt detection. Useful especially for IPI handlers which
always return IRQ_HANDLED, which turns the spurious interrupt
detection into a pointless waste of CPU cycles.
Driver changes:
- Bulk convert interrupt chip drivers to the new irqdomain low level
flow handler invocation mechanism.
- Add device tree bindings for the Renesas R-Car M3-W+ SoC
- Enable modular build of the Qualcomm PDC driver
- The usual small fixes and improvements"
* tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
dt-bindings: interrupt-controller: arm,gic-v3: Describe GICv3 optional properties
irqchip: gic-pm: Remove redundant error log of clock bulk
irqchip/sun4i: Remove unnecessary oom message
irqchip/irq-imx-gpcv2: Remove unnecessary oom message
irqchip/imgpdc: Remove unnecessary oom message
irqchip/gic-v3-its: Remove unnecessary oom message
irqchip/gic-v2m: Remove unnecessary oom message
irqchip/exynos-combiner: Remove unnecessary oom message
irqchip: Bulk conversion to generic_handle_domain_irq()
genirq: Move non-irqdomain handle_domain_irq() handling into ARM's handle_IRQ()
genirq: Add generic_handle_domain_irq() helper
irqchip/nvic: Convert from handle_IRQ() to handle_domain_irq()
irqdesc: Fix __handle_domain_irq() comment
genirq: Use irq_resolve_mapping() to implement __handle_domain_irq() and co
irqdomain: Introduce irq_resolve_mapping()
irqdomain: Protect the linear revmap with RCU
irqdomain: Cache irq_data instead of a virq number in the revmap
irqdomain: Use struct_size() helper when allocating irqdomain
irqdomain: Make normal and nomap irqdomains exclusive
powerpc: Move the use of irq_domain_add_nomap() behind a config option
...
|
|
There are patches in flight to break the dependency between asm/irq.h
and linux/irqdomain.h, which would break compilation of vas.c because it
needs the declaration of irq_create_mapping() etc.
So add an explicit include of irqdomain.h to avoid that becoming a
problem in future.
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Instead of making bare calls to get-sensor-state, use
rtas_get_sensor(), which correctly handles busy and extended delay
statuses.
Fixes: ab519a011caa ("powerpc/pseries: Kernel DLPAR Infrastructure")
Signed-off-by: Nathan Lynch <[email protected]>
Reviewed-by: Laurent Dufour <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
There is a spelling mistake "byes" -> "bytes" in a comment of
function drc_pmem_query_stats(). Fix that typo.
Signed-off-by: Kajol Jain <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Commit a21d1becaa3f ("powerpc: Reintroduce is_kvm_guest() as a fast-path
check") added is_kvm_guest() and changed kvm_para_available() to use it.
is_kvm_guest() checks a static key, kvm_guest, and that static key is
set in check_kvm_guest().
The problem is check_kvm_guest() is only called on pseries, and even
then only in some configurations. That means is_kvm_guest() always
returns false on all non-pseries and some pseries depending on
configuration. That's a bug.
For PR KVM guests this is noticeable because they no longer do live
patching of themselves, which can be detected by the omission of a
message in dmesg such as:
KVM: Live patching for a fast VM worked
To fix it, make check_kvm_guest() an initcall, to ensure it's always
called at boot. It needs to be a core initcall so that it runs before
kvm_guest_init(), which is postcore. To be an initcall it needs to return
int, where 0 means success, so update that.
We still call it manually in pSeries_smp_probe(), because that runs
before init calls are run.
Fixes: a21d1becaa3f ("powerpc: Reintroduce is_kvm_guest() as a fast-path check")
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
When we boot from open firmware (OF) using PPC_OF_BOOT_TRAMPOLINE, aka.
prom_init, we run parts of the kernel at an address other than the link
address. That happens because OF loads the kernel above zero (OF is at
zero) and we run prom_init before copying the kernel down to zero.
Currently that works even for non-relocatable kernels, because we do
various fixups to the prom_init code to make it run where it's loaded.
However those fixups are not sufficient if the kernel becomes large
enough. In that case prom_init()'s final call to __start() can end up
generating a plt branch:
bl c000000002000018 <00000078.plt_branch.__start>
That results in the kernel jumping to the linked address of __start,
0xc000000000000000, when really it needs to jump to
0xc000000000000000 + the runtime address, because the kernel is still
running at the load address.
We could do further shenanigans to handle that, see Jordan's patch for
example:
https://lore.kernel.org/linuxppc-dev/[email protected]
However it is much simpler to just require a kernel with prom_init() to
be built relocatable. The result works in all configurations without
further work, and requires less code.
This should have no effect on most people, as our defconfigs and
essentially all distro configs already have RELOCATABLE enabled.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Persistent memory devices like NVDIMMs can lose cached writes in case
something prevents flush on power-fail. Such situations are termed
dirty shutdown and are exposed to applications as a
last-shutdown-state (LSS) flag and a dirty-shutdown-counter (DSC) as
described at [1]. The latter being useful in conditions where multiple
applications want to detect a dirty shutdown event without racing with
one another.
PAPR-NVDIMMs have so far only exposed LSS style flags to indicate a
dirty-shutdown-state. This patch further adds support for DSC via the
"ibm,persistence-failed-count" device tree property of an NVDIMM. This
property is a monotonically increasing 64-bit counter that indicates
the number of times an NVDIMM has encountered a dirty-shutdown event
causing persistence loss.
Since this value is not expected to change after system boot,
papr_scm reads and caches its value during NVDIMM probe and exposes it
as a PAPR sysfs attribute named 'dirty_shutdown' to match the name of
the similarly named NFIT sysfs attribute. This value is also available
to libnvdimm via the PAPR_PDSM_HEALTH payload. 'struct nd_papr_pdsm_health'
has been extended with a new member called 'dimm_dsc', the presence of
which is indicated by the newly introduced PDSM_DIMM_DSC_VALID flag.
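A consumer of the payload would be expected to test the validity flag before trusting 'dimm_dsc'. The sketch below illustrates that pattern with a trimmed, hypothetical stand-in for the payload; the struct layout, the 'extension_flags' field name and the flag's bit value are assumptions for illustration, not the uapi definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit value, for illustration only. */
#define MODEL_PDSM_DIMM_DSC_VALID 0x2u

/* Hypothetical, trimmed stand-in for 'struct nd_papr_pdsm_health'. */
struct health_model {
	uint32_t extension_flags;	/* which optional members are valid */
	uint64_t dimm_dsc;		/* dirty-shutdown counter */
};

/*
 * Copies the counter to *dsc and returns true only when the payload
 * says the member is present; otherwise *dsc is left untouched.
 */
bool read_dsc(const struct health_model *h, uint64_t *dsc)
{
	if (!(h->extension_flags & MODEL_PDSM_DIMM_DSC_VALID))
		return false;

	*dsc = h->dimm_dsc;
	return true;
}
```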
References:
[1] https://pmem.io/documents/Dirty_Shutdown_Handling-V1.0.pdf
Signed-off-by: Vaibhav Jain <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
In case performance stats for an nvdimm are not available, reading the
'perf_stats' sysfs file returns an -ENOENT error. A better approach is
to make the 'perf_stats' file entirely invisible to indicate that
performance stats for an nvdimm are unavailable.
So this patch updates 'papr_nd_attribute_group' to add an 'is_visible'
callback implemented as newly introduced 'papr_nd_attribute_visible()'
that returns an appropriate mode in case performance stats aren't
supported in a given nvdimm.
Also the initialization of 'papr_scm_priv.stat_buffer_len' is moved
from papr_scm_nvdimm_init() to papr_scm_probe() so that its value is
available when 'papr_nd_attribute_visible()' is called during nvdimm
initialization.
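The visibility decision can be modelled in userspace roughly as follows. The types here are hypothetical simplifications (the kernel callback receives a kobject and an attribute pointer rather than these structs); the sketch only shows the idea that a mode of 0 hides the file when no stats buffer length was discovered.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified stand-ins for the sysfs attribute and driver data. */
struct attr_model {
	const char *name;
	unsigned int mode;		/* file permission bits */
};

struct priv_model {
	size_t stat_buffer_len;		/* 0 when perf stats are unsupported */
};

/*
 * Returns the mode the attribute should be created with; 0 means the
 * attribute is not created at all.
 */
unsigned int attribute_visible(const struct priv_model *p,
			       const struct attr_model *attr)
{
	if (strcmp(attr->name, "perf_stats") == 0 && p->stat_buffer_len == 0)
		return 0;

	return attr->mode;
}
```

Because the callback consults stat_buffer_len, that field has to be set before the attribute group is registered, which is why its initialization moves earlier.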
Even though the 'perf_stats' attribute has been available since v5.9,
there are no known user-space tools/scripts that depend on the presence
of its sysfs file. Hence I don't expect any user-space breakage with
this patch.
Fixes: 2d02bf835e57 ("powerpc/papr_scm: Fetch nvdimm performance stats from PHYP")
Signed-off-by: Vaibhav Jain <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
klimit is a global variable initialised at build time with the
value of _end.
This variable is never modified, so _end symbol can be used directly.
Remove klimit.
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Kefeng Wang <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/9fa9ba6807c17f93f35a582c199c646c4a8bfd9c.1622800638.git.christophe.leroy@csgroup.eu
|
|
Parse to and export from the UUID's own type before dereferencing.
This also fixes a wrong comment (Little Endian UUID is something else)
and should eliminate the direct strict-type assignments.
Fixes: 43001c52b603 ("powerpc/papr_scm: Use ibm,unit-guid as the iset cookie")
Fixes: 259a948c4ba1 ("powerpc/pseries/scm: Use a specific endian format for storing uuid from the device tree")
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The validation done at the start of dlpar_memory_add_by_ic() is an
all-or-nothing scenario: if any LMB in the range is marked as RESERVED
we can fail right away.
We then can remove the 'lmbs_available' var and its check with
'lmbs_to_add' since the whole LMB range was already validated in the
previous step.
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
After a successful dlpar_add_lmb() call the LMB is marked as reserved.
Later on, depending on whether we added enough LMBs or not, we rely on
the marked LMBs to see which ones might need to be removed, and we
remove the reservation of all of them.
These are done in for_each_drmem_lmb() loops without any break
condition. This means that we're going to check all LMBs of the partition
even after going through all the reserved ones.
This patch adds break conditions in both loops to avoid this. The
'lmbs_added' variable was renamed to 'lmbs_reserved', and it's now
being decremented each time an LMB reservation is removed, indicating
if there are still marked LMBs to be processed.
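The effect of the break condition can be illustrated with a small userspace model. An array stands in for the for_each_drmem_lmb() iteration and the struct is hypothetical; the returned visit count exists only to make the early exit observable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an LMB; only the reservation mark is modelled. */
struct lmb_model {
	bool reserved;
};

/*
 * Clears the reservation mark on every reserved LMB, stopping as soon
 * as lmbs_reserved reaches zero. Returns how many LMBs were visited.
 */
size_t clear_reservations(struct lmb_model *lmbs, size_t n,
			  size_t lmbs_reserved)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		visited++;

		if (!lmbs[i].reserved)
			continue;

		lmbs[i].reserved = false;
		lmbs_reserved--;

		/* Nothing left to process: skip the rest of the partition. */
		if (lmbs_reserved == 0)
			break;
	}

	return visited;
}
```

With six LMBs of which the second and third are reserved, the loop stops after visiting three entries instead of all six.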
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The function is counting reserved LMBs as available to be added, but
they aren't. This will cause the function to miscalculate the available
LMBs and can trigger errors later on when executing dlpar_add_lmb().
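A minimal sketch of the corrected counting, assuming a flat array of per-LMB flag words. The flag names and bit values are placeholders chosen for this example, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder flag bits, for illustration only. */
#define MODEL_LMB_ASSIGNED 0x1u
#define MODEL_LMB_RESERVED 0x2u

/*
 * Counts LMBs that could be added, up to lmbs_to_add. Reserved LMBs
 * are skipped before the "not assigned" test, so they are never
 * counted as available.
 */
size_t count_available_lmbs(const uint32_t *flags, size_t n,
			    size_t lmbs_to_add)
{
	size_t lmbs_available = 0;

	for (size_t i = 0; i < n && lmbs_available < lmbs_to_add; i++) {
		if (flags[i] & MODEL_LMB_RESERVED)
			continue;

		if (!(flags[i] & MODEL_LMB_ASSIGNED))
			lmbs_available++;
	}

	return lmbs_available;
}
```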
Signed-off-by: Daniel Henrique Barboza <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
When an interrupt is taken, the SRR registers are set to return to where
it left off. Unless they are modified in the meantime, or the return
address or MSR are modified, there is no need to reload these registers
when returning from interrupt.
Introduce per-CPU flags that track the validity of SRR and HSRR
registers. These are cleared when returning from interrupt, when
using the registers for something else (e.g., OPAL calls), when
adjusting the return address or MSR of a context, and when context
switching (which changes the return address and MSR).
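The bookkeeping can be modelled in userspace as below. All names are hypothetical and the real implementation lives in the interrupt entry and exit paths; the sketch only shows when a reload is skipped and when it is forced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state: the SRR pair plus its validity flag. */
struct cpu_model {
	bool srr_valid;
	uint64_t srr0;		/* return address */
	uint64_t srr1;		/* return MSR */
	unsigned int reloads;	/* counts reloads, for illustration */
};

/* Hypothetical saved context of the interrupted code. */
struct regs_model {
	uint64_t nip;
	uint64_t msr;
};

/* On interrupt entry the hardware fills SRR0/1, so they match the context. */
void take_interrupt(struct cpu_model *cpu, const struct regs_model *regs)
{
	cpu->srr0 = regs->nip;
	cpu->srr1 = regs->msr;
	cpu->srr_valid = true;
}

/* Changing the return address makes the SRR contents stale. */
void set_return_address(struct cpu_model *cpu, struct regs_model *regs,
			uint64_t nip)
{
	regs->nip = nip;
	cpu->srr_valid = false;
}

/* Reload only when the registers no longer match the saved context. */
void interrupt_return(struct cpu_model *cpu, const struct regs_model *regs)
{
	if (!cpu->srr_valid) {
		cpu->srr0 = regs->nip;
		cpu->srr1 = regs->msr;
		cpu->reloads++;
	}
	cpu->srr_valid = false;	/* cleared when returning from interrupt */
}
```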
This improves the performance of interrupt returns.
Signed-off-by: Nicholas Piggin <[email protected]>
[mpe: Fold in fixup patch from Nick]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Microwatt's hardware RNG is accessed using the DARN instruction.
Signed-off-by: Paul Mackerras <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Reviewed-by: Segher Boessenkool <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/YMwXPHlV/[email protected]
|
|
This adds support to the Microwatt platform to use the standard
16550-style UART which is available in the standalone Microwatt FPGA.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Reviewed-by: Segher Boessenkool <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This is a simple native ICS backend that matches the layout of
the Microwatt implementation of ICS.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Reviewed-by: Segher Boessenkool <[email protected]>
[mpe: Add empty ics_native_init() to unbreak non-microwatt builds]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Just like any other embedded platform.
Add an empty soc node.
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Reviewed-by: Segher Boessenkool <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/YMwWx98+PMibZq/[email protected]
|