Age | Commit message (Collapse) | Author | Files | Lines |
|
No functionality change. Just code movement to ease code changes later
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
These function are not used in the code. Remove them.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
On POWER9 the Nest MMU may fail to invalidate some translations when
doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only
and not the page walk cache.
This works around it by forcing such invalidations to escalate to
RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in
use for the context.
Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts")
Cc: [email protected] # v4.15+
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Balbir Singh <[email protected]>
[balbirs: fixed spelling and coding style to quiesce checkpatch.pl]
Tested-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Currently, when using coprocessors (which use the Nest MMU), we
simply increment the active_cpu count to force all TLB invalidations
to be come broadcast.
Unfortunately, due to an errata in POWER9, we will need to know
more specifically that coprocessors are in use.
This maintains a separate copros counter in the MMU context for
that purpose.
NB. The commit mentioned in the fixes tag below is not at fault for
the bug we're fixing in this commit and the next, but this fix applies
on top the infrastructure it introduced.
Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts")
Cc: [email protected] # v4.15+
Signed-off-by: Benjamin Herrenschmidt <[email protected]>
Tested-by: Balbir Singh <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
irq_happened
force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.
Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.
This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.
Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: [email protected] # v4.8+
Reported-by: Carol L. Soto <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
'linux,stdout-path' has been deprecated for some time in favor of
'stdout-path'. Now dtc will warn on occurrences of 'linux,stdout-path'.
Search and replace all the of occurrences with 'stdout-path'.
Signed-off-by: Rob Herring <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: [email protected]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Signed-off-by: Nicholas Piggin <[email protected]>
[mpe: Add SPDX, and fixup formatting]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
It's slightly less error prone to use sizeof(*foo) rather than
specifying the type.
Signed-off-by: Markus Elfring <[email protected]>
[mpe: Consolidate into one patch, rewrite change log]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The flush_dcache_phys_range() function is no longer used in the
kernel. The last usage was removed in c40785ad305b ("powerpc/dart: Use
a cachable DART").
This patch removes the function and declaration.
Signed-off-by: Matt Brown <[email protected]>
[mpe: Munge change log, include commit that removed last user]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Previously the raid6 test Makefile did not build the POWER specific files
(altivec and vpermxor).
This patch fixes the bug, so that all appropriate files for powerpc are built.
This patch also fixes the missing and mismatched ifdef statements to allow the
altivec.uc file to be built correctly.
Signed-off-by: Matt Brown <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This patch uses the vpermxor instruction to optimise the raid6 Q
syndrome. This instruction was made available with POWER8, ISA version
2.07. It allows for both vperm and vxor instructions to be done in a
single instruction. This has been tested for correctness on a ppc64le
vm with a basic RAID6 setup containing 5 drives.
The performance benchmarks are from the raid6test in the
/lib/raid6/test directory. These results are from an IBM Firestone
machine with ppc64le architecture. The benchmark results show a 35%
speed increase over the best existing algorithm for powerpc (altivec).
The raid6test has also been run on a big-endian ppc64 vm to ensure it
also works for big-endian architectures.
Performance benchmarks:
raid6: altivecx4 gen() 18773 MB/s
raid6: altivecx8 gen() 19438 MB/s
raid6: vpermxor4 gen() 25112 MB/s
raid6: vpermxor8 gen() 26279 MB/s
Signed-off-by: Matt Brown <[email protected]>
Reviewed-by: Daniel Axtens <[email protected]>
[mpe: Add VPERMXOR macro so we can build with old binutils]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The proper compatible for rv3029 is microcrystal,rv3029.
Acked-by: Anatolij Gustschin <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The RTC core is always calling rtc_valid_tm after the read_time callback.
It is not necessary to call it just before returning from the callback.
Signed-off-by: Alexandre Belloni <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
When running virtualised the powerpc kernel is able to run the system
in "compat mode" - which means the kernel and hardware are pretending
to userspace that the CPU is an older version than it actually is.
AT_BASE_PLATFORM is an AUXV entry that we export to userspace for use
when we're running in that mode, which tells userspace the "platform"
string for the real CPU version, as opposed to the faked version.
Although we don't support compat mode when using DT CPU features, and
arguably don't need to set AT_BASE_PLATFORM, the existing cputable
based code always sets it even when we're running bare metal. That
means the lack of AT_BASE_PLATFORM is a user-visible artifact of the
fact that the kernel is using DT CPU features, which we don't want.
So set it in the DT CPU features code also.
This results in eg:
$ LD_SHOW_AUXV=1 /bin/true | grep "AT_.*PLATFORM"
AT_PLATFORM: power9
AT_BASE_PLATFORM:power9
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
|
|
Add a couple of trace points in the VAS driver
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
[mpe: Add SPDX tag to new header]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
When VAS is not configured, unregister the platform driver. Also simplify
cleanup by delaying vas debugfs init until we know VAS is configured.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
pnv_npu2_init_context wasn't checking the return code from
__mmu_notifier_register. If __mmu_notifier_register failed, the
npu_context was still assigned to the mm and the caller wasn't given any
indication that things went wrong. Later on pnv_npu2_destroy_context would
be called, which in turn called mmu_notifier_unregister and dropped
mm->mm_count without having incremented it in the first place. This led to
various forms of corruption like mm use-after-free and mm double-free.
__mmu_notifier_register can fail with EINTR if a signal is pending, so
this case can be frequent.
This patch calls opal_npu_destroy_context on the failure paths, and makes
sure not to assign mm->context.npu_context until past the failure points.
Signed-off-by: Mark Hairgrove <[email protected]>
Acked-By: Alistair Popple <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The PSL Timebase register is updated by the PSL to maintain the
timebase.
On P9, the Timebase value is only provided by the CAPP as received the
last time a timebase request was performed.
The timebase requests are initiated through the adapter configuration
or application registers.
The specific sysfs entry "/sys/class/cxl/cardxx/psl_timebase_synced"
is now dynamically updated according the content of the PSL Timebase
register.
Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
Signed-off-by: Christophe Lombard <[email protected]>
Reviewed-by: Vaibhav Jain <[email protected]>
Acked-by: Andrew Donnellan <[email protected]>
Acked-by: Frederic Barrat <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This is a tidy up which removes radix MMU calls into the slice
code.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The slice_mask cache was a basic conversion which copied the slice
mask into caller's structures, because that's how the original code
worked. In most cases the pointer can be used directly instead, saving
a copy and an on-stack structure.
On POWER8, this increases vfork+exec+exit performance by 0.3%
and reduces time to mmap+munmap a 64kB page by 2%.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This code is never compiled in, and it gets broken by the next
patch, so remove it.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
This converts the slice_mask bit operation helpers to be the usual
3-operand kind, which allows 2 inputs to set a different output
without an extra copy, which is used in the next patch.
Adds slice_copy_mask, which will be used in the next patch.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Rather than build slice masks from a range then use that to check for
fit in a candidate mask, implement slice_check_range_fits that checks
if a range fits in a mask directly.
This allows several structures to be removed from stacks, and also we
don't expect a huge range in a lot of these cases, so building and
comparing a full mask is going to be more expensive than testing just
one or two bits of the range.
On POWER8, this increases vfork+exec+exit performance by 0.3%
and reduces time to mmap+munmap a 64kB page by 5%.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Calculating the slice mask can become a signifcant overhead for
get_unmapped_area. This patch adds a struct slice_mask for
each page size in the mm_context, and keeps these in synch with
the slices psize arrays and slb_addr_limit.
On Book3S/64 this adds 288 bytes to the mm_context_t for the
slice mask caches.
On POWER8, this increases vfork+exec+exit performance by 9.9%
and reduces time to mmap+munmap a 64kB page by 28%.
Reduces time to mmap+munmap by about 10% on 8xx.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Pass around const pointers to struct slice_mask where possible, rather
than copies of slice_mask, to reduce stack and call overhead.
checkstack.pl gives, before:
0x00000d1c slice_get_unmapped_area [slice.o]: 592
0x00001864 is_hugepage_only_range [slice.o]: 448
0x00000754 slice_find_area_topdown [slice.o]: 400
0x00000484 slice_find_area_bottomup.isra.1 [slice.o]: 272
0x000017b4 slice_set_range_psize [slice.o]: 224
0x00000a4c slice_find_area [slice.o]: 128
0x00000160 slice_check_fit [slice.o]: 112
after:
0x00000ad0 slice_get_unmapped_area [slice.o]: 448
0x00001464 is_hugepage_only_range [slice.o]: 288
0x000006c0 slice_find_area [slice.o]: 144
0x0000016c slice_check_fit [slice.o]: 128
0x00000528 slice_find_area_bottomup.isra.2 [slice.o]: 128
0x000013e4 slice_set_range_psize [slice.o]: 128
This increases vfork+exec+exit performance by 1.5%.
Reduces time to mmap+munmap a 64kB page by 17%.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Make these loops look the same, and change their form so the
important part is not wrapped over so many lines.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The slice state of an mm gets zeroed then initialised upon exec.
This is the only caller of slice_set_user_psize now, so that can be
removed and instead implement a faster and simplified approach that
requires no locking or checking existing state.
This speeds up vfork+exec+exit performance on POWER8 by 3%.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Now that plpar_wrappers.h has an #ifdef PSERIES we can move the empty
version of plpar_set_ciabr() which xmon wants into there.
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Back in 2013 we added some hypercall wrappers which misspelled
"plpar" (P-series Logical PARtition) as "plapr".
Visually they're hard to distinguish and it almost doesn't matter, but
it is confusing when grepping to miss some calls because of the typo.
They've also started spreading, so before they take over let's fix
them all to be "plpar".
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Currently plpar_wrappers.h is not safe to include when
CONFIG_PPC_PSERIES=n, or at least it can be depending on other config
options and so on.
Fix that by wrapping the entire content in an ifdef.
Signed-off-by: Michael Ellerman <[email protected]>
|
|
smp_query_cpu_stopped() and related #defines are currently in
plpar_wrappers.h. The function actually does an RTAS call, not an
hcall, and basically has nothing to do with plpar_wrappers.h
Move it into pseries.h, where it can easily be used by the only two
callers in pseries/smp.c and pseries/hotplug-cpu.c.
Signed-off-by: Michael Ellerman <[email protected]>
|
|
early_init() and machine_init() have no prototype, add one in
asm-prototypes.h.
Fixes the following warnings (treated as error in W=1):
arch/powerpc/kernel/setup_32.c:68:30: error: no previous prototype for ‘early_init’
arch/powerpc/kernel/setup_32.c:99:21: error: no previous prototype for ‘machine_init’
Signed-off-by: Mathieu Malaterre <[email protected]>
[mpe: Move them to asm-prototypes.h, drop other functions]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
These functions can all be static, make it so.
Signed-off-by: Mathieu Malaterre <[email protected]>
[mpe: Combine a patch of Mathieu's with some other static conversions]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Rewrite function-like macro into regular static inline function to
avoid a warning during macro expansion.
Fix warning (treated as error in W=1):
./arch/powerpc/include/asm/uaccess.h:52:35: error: comparison of unsigned expression >= 0 is always true
(((size) == 0) || (((size) - 1) <= ((segment).seg - (addr)))))
^
Suggested-by: Segher Boessenkool <[email protected]>
Signed-off-by: Mathieu Malaterre <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Rewrite comparison since all values compared are of type `unsigned long`.
Instead of using unsigned properties and rewriting the original code as:
(originally suggested by Segher Boessenkool <[email protected]>)
#define pfn_valid(pfn) \
(((pfn) - ARCH_PFN_OFFSET) < (max_mapnr - ARCH_PFN_OFFSET))
Prefer a static inline function to make code as readable as possible.
Fix a warning (treated as error in W=1):
arch/powerpc/include/asm/page.h:129:32: error: comparison of unsigned expression >= 0 is always true [-Werror=type-limits]
#define pfn_valid(pfn) ((pfn) >= ARCH_PFN_OFFSET && (pfn) < max_mapnr)
^
Suggested-by: Christophe Leroy <[email protected]>
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
When neither CONFIG_ALTIVEC, nor CONFIG_VSX or CONFIG_PPC64 is
defined, the array feature_properties is defined as an empty array,
which in turn triggers the following warning (treated as error on
W=1):
arch/powerpc/kernel/prom.c: In function ‘check_cpu_feature_properties’:
arch/powerpc/kernel/prom.c:298:16: error: comparison of unsigned expression < 0 is always false
for (i = 0; i < ARRAY_SIZE(feature_properties); ++i, ++fp) {
^
Suggested-by: Michael Ellerman <[email protected]>
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Add missing prototypes for ppc_select() & ppc_fadvise64_64() to header
asm-prototypes.h. Fix the following warnings (treated as errors in W=1)
arch/powerpc/kernel/syscalls.c:87:1: error: no previous prototype for ‘ppc_select’
arch/powerpc/kernel/syscalls.c:119:6: error: no previous prototype for ‘ppc_fadvise64_64’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
arch_unregister_hw_breakpoint()
In commit 5aae8a537080 ("powerpc, hw_breakpoints: Implement
hw_breakpoints for 64-bit server processors") function
hw_breakpoint_handler() and arch_unregister_hw_breakpoint() were added
without function prototypes in hw_breakpoint.h header.
Fix the following warning(s) (treated as error in W=1):
arch/powerpc/kernel/hw_breakpoint.c:106:6: error: no previous prototype for ‘arch_unregister_hw_breakpoint’
arch/powerpc/kernel/hw_breakpoint.c:209:5: error: no previous prototype for ‘hw_breakpoint_handler’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Two functions did not have a prototype defined in signal.h header. Fix
the following two warnings (treated as errors in W=1):
arch/powerpc/kernel/signal_32.c:1135:6: error: no previous prototype for ‘sys_rt_sigreturn’
arch/powerpc/kernel/signal_32.c:1422:6: error: no previous prototype for ‘sys_sigreturn’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
In commit 81e7009ea46c ("powerpc: merge ppc signal.c and ppc64
signal32.c") the function sys_debug_setcontext was added without a
prototype.
Fix compilation warning (treated as error in W=1):
arch/powerpc/kernel/signal_32.c:1227:5: error: no previous prototype for ‘sys_debug_setcontext’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
A function init_IRQ() was added without a prototype declared in header
irq.h. Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/irq.c:662:13: error: no previous prototype for ‘init_IRQ’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
In commit 4f8b50bbbe63 ("irq_work, ppc: Fix up arch hooks") a new
function arch_irq_work_raise() was added without a prototype in header
irq_work.h.
Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/time.c:523:6: error: no previous prototype for ‘arch_irq_work_raise’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
In commit 55ccf3fe3f9a ("fork: move the real prepare_to_copy() users to
arch_dup_task_struct()") a new arch_dup_task_struct() was added
without a prototype declared in thread_info.h header. Fix the
following warning (treated as error in W=1):
arch/powerpc/kernel/process.c:1609:5: error: no previous prototype for ‘arch_dup_task_struct’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The function time_init did not have a prototype defined in the time.h
header. Fix the following warning (treated as error in W=1):
arch/powerpc/kernel/time.c:1068:13: error: no previous prototype for ‘time_init’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
In commit dabe859ec636 ("powerpc: Give hypervisor decrementer interrupts
their own handler") an empty body function was added, but no prototype
was declared. Fix warning (treated as error in W=1):
arch/powerpc/kernel/time.c:629:6: error: no previous prototype for ‘hdec_interrupt’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
In commit f0f558b131db ("powerpc/mm: Preserve CFAR value on SLB miss caused
by access to bogus address"), the function slb_miss_bad_addr() was added
without a prototype. This commit adds it.
Fix a warning (treated as error in W=1):
arch/powerpc/kernel/traps.c:1498:6: error: no previous prototype for ‘slb_miss_bad_addr’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
__giveup_fpu() is never called outside process.c, so it can be static.
That also means we don't need an empty definition in switch_to.h
Signed-off-by: Mathieu Malaterre <[email protected]>
[mpe: Also drop the empty version, rewrite change log]
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Change signature of two functions, adding static keyword to prevent the
following two warnings (treated as errors on W=1):
arch/powerpc/platforms/embedded6xx/flipper-pic.c:135:28: error: no previous prototype for ‘flipper_pic_init’
arch/powerpc/platforms/embedded6xx/usbgecko_udbg.c:172:6: error: no previous prototype for ‘ug_udbg_putc’
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Since the value of `tmp` is never intended to be read, declare both `tmp`
variables as unused. Fix warning (treated as error in W=1):
arch/powerpc/kernel/signal_32.c: In function ‘sys_swapcontext’:
arch/powerpc/kernel/signal_32.c:1048:16: error: variable ‘tmp’ set but not used
arch/powerpc/kernel/signal_32.c: In function ‘sys_debug_setcontext’:
arch/powerpc/kernel/signal_32.c:1234:16: error: variable ‘tmp’ set but not used
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
The inline keyword was not at the beginning of the function declaration.
Fix the following warning (treated as error in W=1):
arch/powerpc/lib/sstep.c:283:1: error: ‘inline’ is not at beginning of declaration
static int nokprobe_inline copy_mem_in(u8 *dest, unsigned long ea, int nb,
arch/powerpc/lib/sstep.c:388:1: error: ‘inline’ is not at beginning of declaration
static int nokprobe_inline copy_mem_out(u8 *dest, unsigned long ea, int nb,
Signed-off-by: Mathieu Malaterre <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|