Age | Commit message (Collapse) | Author | Files | Lines |
|
KVMPPC_NR_LPIDS no longer represents any size restriction on the
LPID space and can be removed. A CPU with more than 12 LPID bits
implemented will now be able to create more than 4095 guests.
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Rather than tie this to KVMPPC_NR_LPIDS which is becoming more dynamic,
fix it to 4096 (12-bits) explicitly for now.
kvmhv_get_nested() does not have to check against KVM_MAX_NESTED_GUESTS
because the L1 partition table registration hcall already did that, and
it checks against the partition table size.
This patch also puts all the partition table size calculations into the
same form, using 12 for the architected size field shift and 4 for the
shift corresponding to the partition table entry size.
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-of-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This removes the fixed sized kvm->arch.nested_guests array.
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This removes the fixed-size lpid_inuse array.
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The LPID allocator init is changed to:
- use mmu_lpid_bits rather than hard-coding;
- use KVM_MAX_NESTED_GUESTS for nested hypervisors;
- not reserve the top LPID on POWER9 and newer CPUs.
The reserved LPID is made a POWER7/8-specific detail.
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Removing kvmppc_claim_lpid makes the lpid allocator API a bit simpler to
change the underlying implementation in a future patch.
The host LPID is always 0, so that can be a detail of the allocator. If
the allocator range is restricted, that can reserve LPIDs at the top of
the range. This allows kvmppc_claim_lpid to be removed.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
It is better to get all loads for the register values in flight
before starting to switch LPID, PID, and LPCR because those
mtSPRs are expensive and serialising.
This also just tidies up the code for a potential future change
to the context switching sequence.
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This facility is controlled by FSCR only. Reserved bits should not be
set in the HFSCR register (although it's likely harmless as this
position would not be re-used, and the L0 is forgiving here too).
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Fabiano Rosas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit big
endian mode (MSR[SF,LE] unset).
The change in MSR is done in enter_rtas() in a relatively complex way,
since the MSR value could be hardcoded.
Furthermore, a panic has been reported when hitting the watchdog interrupt
while running in RTAS, this leads to the following stack trace:
watchdog: CPU 24 Hard LOCKUP
watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
...
Supported: No, Unreleased kernel
CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default)
MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020
CFAR: 000000000000011c IRQMASK: 1
GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
NIP [000000001fb41050] 0x1fb41050
LR [000000001fb4104c] 0x1fb4104c
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Oops: Unrecoverable System Reset, sig: 6 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
...
Supported: No, Unreleased kernel
CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G E X 5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
NIP: 000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
REGS: c00000000fc33d60 TRAP: 0100 Tainted: G E X (5.14.21-150400.71.1.bz196362_2-default)
MSR: 8000000002981000 <SF,VEC,VSX,ME> CR: 48800002 XER: 20040020
CFAR: 000000000000011c IRQMASK: 1
GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
NIP [000000001fb41050] 0x1fb41050
LR [000000001fb4104c] 0x1fb4104c
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 3ddec07f638c34a2 ]---
This happens because MSR[RI] is unset when entering RTAS but there is no
valid reason to not set it here.
RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
"7.2.1 Machine State":
R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect
its own critical regions from recursion by setting the MSR[RI] bit to
0 when in the critical regions.
Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
hardcoded value meaning real mode, 32 bits big endian mode and Recoverable
Interrupt is loaded. In the case MSR[S] is set, it will remain set while
entering RTAS as only urfid can unset it (thanks Fabiano).
In addition a check is added in do_enter_rtas() to detect calls made with
MSR[RI] unset, as we are forcing it on later.
This patch has been tested on the following machines:
Power KVM Guest
P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
PowerVM LPAR
P8 9119-MME (FW860.A1)
p9 9008-22L (FW950.00)
P10 9080-HEX (FW1010.00)
Suggested-by: Nicholas Piggin <[email protected]>
Signed-off-by: Laurent Dufour <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Use a kmalloced data structure to store interrupt controller internal
data instead of static global variables.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/c8f0866ee013113d5e28948943cf0586e49f5353.1649226186.git.christophe.leroy@csgroup.eu
|
|
mpc8xx_pics_init() is now only a trampoline to
mpc8xx_pic_init().
Remove mpc8xx_pics_init() and use mpc8xx_pic_init()
directly.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/9c55a698adb5ba3b7b77023170fcaf0acb5d2d81.1649226186.git.christophe.leroy@csgroup.eu
|
|
In the same logic as commit be7ecbd240b2 ("soc: fsl: qe: convert QE
interrupt controller to platform_device"), convert CPM1 interrupt
controller to platform_device.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/fb80d0b2077312079c49da0296e25591578771cd.1649226186.git.christophe.leroy@csgroup.eu
|
|
Add CPM error interrupt as a standalone platform driver,
to simplify the init of CPM interrupt handler.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/375a72df6e4a26c5959cc81a6c6d46152efa2306.1649226186.git.christophe.leroy@csgroup.eu
|
|
CPM interrupt controller is quite standalone. Move it into a
dedicated file. It will help for next step which will change
it to a platform driver.
This is pure code move, checkpatch report is ignored at this point,
except one parenthesis alignment which would remain at the end of
the series. All other points fly away with following patches.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/d3a7dc832d905bed14b35d83410cdb69a7ba20e8.1649226186.git.christophe.leroy@csgroup.eu
|
|
powerpc's asm/prom.h brings some headers that it doesn't
need itself.
In order to clean it up, first add missing headers in
users of asm/prom.h
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Frederic Barrat <[email protected]>
Acked-by: Andrew Donnellan <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a2bae89b280e7a7cb87889635d9911d6a245e780.1648833388.git.christophe.leroy@csgroup.eu
|
|
powerpc's asm/prom.h brings some headers that it doesn't
need itself.
In order to clean it up, first add missing headers in
users of asm/prom.h
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/04961364547fe4556e30cb302b0e20a939b83426.1648833027.git.christophe.leroy@csgroup.eu
|
|
It's only during early startup that poking_init() is not done yet,
for instance when calling ftrace_init().
Once poking_init() has been called there must be a poking area, no
need to check it everytime patch_instruction() is called.
ftrace activation time is reduced by 7% with the change on an 8xx.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/8d6088aca7b63247377b6d9e4897d08d935fbe93.1647962456.git.christophe.leroy@csgroup.eu
|
|
Once init is done, initmem is freed forever so no need to
test system_state at every call to patch_instruction().
Use jump_label.
This reduces by 2% the time needed to activate ftrace on an 8xx.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/0aee964721cab7316cffde21a2ca223cee14d373.1647962456.git.christophe.leroy@csgroup.eu
|
|
Commit 863771a28e27 ("powerpc/32s: Convert switch_mmu_context() to C")
moved the switch_mmu_context() to C. While in principle a good idea, it
meant that the function now uses the stack. The stack is not accessible
from real mode though.
So to keep calling the function, let's turn on MSR_DR while we call it.
That way, all pointer references to the stack are handled virtually.
In addition, make sure to save/restore r12 on the stack, as it may get
clobbered by the C function.
Fixes: 863771a28e27 ("powerpc/32s: Convert switch_mmu_context() to C")
Cc: [email protected] # v5.14+
Reported-by: Matt Evans <[email protected]>
Signed-off-by: Alexander Graf <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
CONFIG_MODULES
If CONFIG_MODULES is not set, there is no point in checking
whether text is in module area.
This reduced the time needed to activate/deactivate ftrace
by more than 10% on an 8xx.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/f3c701cce00a38620788c0fc43ff0b611a268c54.1647962456.git.christophe.leroy@csgroup.eu
|
|
Aligning address to page boundary allows flush_tlb_kernel_range()
to know it's a single page flush and use tlbie instead of tlbia.
On 603 we now have the following code in first leg of
change_page_attr():
2c: 55 29 00 3c rlwinm r9,r9,0,0,30
30: 91 23 00 00 stw r9,0(r3)
34: 7c 00 22 64 tlbie r4,r0
38: 7c 00 04 ac hwsync
3c: 38 60 00 00 li r3,0
40: 4e 80 00 20 blr
Before we had:
28: 55 29 00 3c rlwinm r9,r9,0,0,30
2c: 91 23 00 00 stw r9,0(r3)
30: 54 89 00 26 rlwinm r9,r4,0,0,19
34: 38 84 10 00 addi r4,r4,4096
38: 7c 89 20 50 subf r4,r9,r4
3c: 28 04 10 00 cmplwi r4,4096
40: 41 81 00 30 bgt 70 <change_page_attr+0x70>
44: 7c 00 4a 64 tlbie r9,r0
48: 7c 00 04 ac hwsync
4c: 38 60 00 00 li r3,0
50: 4e 80 00 20 blr
...
70: 94 21 ff f0 stwu r1,-16(r1)
74: 7c 08 02 a6 mflr r0
78: 90 01 00 14 stw r0,20(r1)
7c: 48 00 00 01 bl 7c <change_page_attr+0x7c>
7c: R_PPC_REL24 _tlbia
80: 80 01 00 14 lwz r0,20(r1)
84: 38 60 00 00 li r3,0
88: 7c 08 03 a6 mtlr r0
8c: 38 21 00 10 addi r1,r1,16
90: 4e 80 00 20 blr
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/6bb118fb2ee89fa3c1f9cf90ed19f88220002cb0.1647877467.git.christophe.leroy@csgroup.eu
|
|
In the same spirit as commit 63f501e07a85 ("powerpc/8xx: Simplify TLB
handling"), simplify flush_tlb_kernel_range() for 8xx.
8xx cannot be SMP, and has 'tlbie' and 'tlbia' instructions, so
an inline version of flush_tlb_kernel_range() for 8xx is worth it.
With this page, first leg of change_page_attr() is:
2c: 55 29 00 3c rlwinm r9,r9,0,0,30
30: 91 23 00 00 stw r9,0(r3)
34: 7c 00 22 64 tlbie r4,r0
38: 7c 00 04 ac hwsync
3c: 38 60 00 00 li r3,0
40: 4e 80 00 20 blr
Before the patch it was:
30: 55 29 00 3c rlwinm r9,r9,0,0,30
34: 91 2a 00 00 stw r9,0(r10)
38: 94 21 ff f0 stwu r1,-16(r1)
3c: 7c 08 02 a6 mflr r0
40: 38 83 10 00 addi r4,r3,4096
44: 90 01 00 14 stw r0,20(r1)
48: 48 00 00 01 bl 48 <change_page_attr+0x48>
48: R_PPC_REL24 flush_tlb_kernel_range
4c: 80 01 00 14 lwz r0,20(r1)
50: 38 60 00 00 li r3,0
54: 7c 08 03 a6 mtlr r0
58: 38 21 00 10 addi r1,r1,16
5c: 4e 80 00 20 blr
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/d2610043419ce3e0e53a85386baf2c3625af5cfb.1647877442.git.christophe.leroy@csgroup.eu
|
|
__do_irq() inconditionnaly calls ppc_md.get_irq()
That's definitely a hot path.
At the time being ppc_md.get_irq address is read every time
from ppc_md structure.
Replace that call by a static call, and initialise that
call after ppc_md.init_IRQ() has set ppc_md.get_irq.
Emit a warning and don't set the static call if ppc_md.init_IRQ()
is still NULL, that way the kernel won't blow up if for some
reason ppc_md.get_irq() doesn't get properly set.
With the patch:
00000000 <__SCT__ppc_get_irq>:
0: 48 00 00 20 b 20 <__static_call_return0> <== Replaced by 'b <ppc_md.get_irq>' at runtime
...
00000020 <__static_call_return0>:
20: 38 60 00 00 li r3,0
24: 4e 80 00 20 blr
...
00000058 <__do_irq>:
...
64: 48 00 00 01 bl 64 <__do_irq+0xc>
64: R_PPC_REL24 __SCT__ppc_get_irq
68: 2c 03 00 00 cmpwi r3,0
...
Before the patch:
00000038 <__do_irq>:
...
3c: 3d 20 00 00 lis r9,0
3e: R_PPC_ADDR16_HA ppc_md+0x1c
...
44: 81 29 00 00 lwz r9,0(r9)
46: R_PPC_ADDR16_LO ppc_md+0x1c
...
4c: 7d 29 03 a6 mtctr r9
50: 4e 80 04 21 bctrl
54: 2c 03 00 00 cmpwi r3,0
...
On PPC64 which doesn't implement static calls yet we get:
00000000000000d0 <__do_irq>:
...
dc: 00 00 22 3d addis r9,r2,0
dc: R_PPC64_TOC16_HA .data+0x8
...
e4: 00 00 89 e9 ld r12,0(r9)
e4: R_PPC64_TOC16_LO_DS .data+0x8
...
f0: a6 03 89 7d mtctr r12
f4: 18 00 41 f8 std r2,24(r1)
f8: 21 04 80 4e bctrl
fc: 18 00 41 e8 ld r2,24(r1)
...
So on PPC64 that's similar to what we get without static calls.
But at least until ppc_md.get_irq() is set the call is to
__static_call_return0.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/afb92085f930651d8b1063e4d4bf0396c80ebc7d.1647002274.git.christophe.leroy@csgroup.eu
|
|
rol32(x, 16) will do the rotate using rlwinm.
No need to open code using inline assembly.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/794337eff7bb803d2c4e67d9eee635390c4c48fe.1646812553.git.christophe.leroy@csgroup.eu
|
|
Don't inherit headers "by chances" from asm/prom.h, asm/mpc52xx.h,
asm/pci.h etc...
Include the needed headers, and remove asm/prom.h when it was
needed exclusively for pulling necessary headers.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/be8bdc934d152a7d8ee8d1a840d5596e2f7d85e0.1646767214.git.christophe.leroy@csgroup.eu
|
|
Several files include asm/prom.h for no reason.
Clean it up.
Signed-off-by: Christophe Leroy <[email protected]>
[mpe: Drop change to prom_parse.c as reported by [email protected]]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/7c9b8fda63dcf63e1b28f43e7ebdb95182cbc286.1646767214.git.christophe.leroy@csgroup.eu
|
|
With CONFIG_FORTIFY_SOURCE enabled, string functions will also perform
dynamic checks for string size which can panic the kernel, like incase
of overflow detection.
In papr_scm, papr_scm_pmu_check_events function uses stat->stat_id with
string operations, to populate the nvdimm_events_map array. Since
stat_id variable is not NULL terminated, the kernel panics with
CONFIG_FORTIFY_SOURCE enabled at boot time.
Below are the logs of kernel panic:
detected buffer overflow in __fortify_strlen
------------[ cut here ]------------
kernel BUG at lib/string_helpers.c:980!
Oops: Exception in kernel mode, sig: 5 [#1]
NIP [c00000000077dad0] fortify_panic+0x28/0x38
LR [c00000000077dacc] fortify_panic+0x24/0x38
Call Trace:
[c0000022d77836e0] [c00000000077dacc] fortify_panic+0x24/0x38 (unreliable)
[c00800000deb2660] papr_scm_pmu_check_events.constprop.0+0x118/0x220 [papr_scm]
[c00800000deb2cb0] papr_scm_probe+0x288/0x62c [papr_scm]
[c0000000009b46a8] platform_probe+0x98/0x150
Fix this issue by using kmemdup_nul() to copy the content of
stat->stat_id directly to the nvdimm_events_map array.
mpe: stat->stat_id comes from the hypervisor, not userspace, so there is
no security exposure.
Fixes: 4c08d4bbc089 ("powerpc/papr_scm: Add perf interface support")
Signed-off-by: Kajol Jain <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Don't rely on random inclusion of linux/of.h by users
of asm/drmem.h
Add a forward declaration of struct property and
struct device_node.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/5643ec410e51b749db0636471cb7979524f9ed0e.1646767214.git.christophe.leroy@csgroup.eu
|
|
is_secure_guest() uses mfmsr().
Don't rely on users to include asm/reg.h, include
it in asm/svm.h
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/482c82c8a29d5fb3ea279b34f107e0e775001344.1646767214.git.christophe.leroy@csgroup.eu
|
|
parport.h needs only of_irq.h, no need to go via asm/prom.h
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/ec796ee56cf61f16ba24e62a9d3525d11931538c.1646767214.git.christophe.leroy@csgroup.eu
|
|
Move pci_device_from_OF_node() in pci64.c because it needs definition
of struct device_node and is not worth inlining.
ppc32.c already has it in pci32.c.
That way pci-bridge.h doesn't need linux/of.h (Brought by asm/prom.h
via asm/pci.h)
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/3c88286b55413730d7784133993a46ef4a3607ce.1646767214.git.christophe.leroy@csgroup.eu
|
|
PPC64 does everything in C, gcc is able to skip calculation
when one of the operands in zero.
Move the constant folding in PPC32 part.
This helps GCC and reduces ppc64_defconfig by 170 bytes.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a4ca63dd4c4b09e1906d08fb814af5a41d0f3fcb.1644651363.git.christophe.leroy@csgroup.eu
|
|
emulate_step() instruction emulation including sc instruction emulation
initially appeared in xmon. It was then moved into sstep.c where kprobes
could use it too, and later hw_breakpoint and uprobes started to use it.
Until uprobes, the only instruction emulation users were for kernel
mode instructions.
- xmon only steps / breaks on kernel addresses.
- kprobes is kernel only.
- hw_breakpoint only emulates kernel instructions, single steps user.
At one point, there was support for the kernel to execute sc
instructions, although that is long removed and it's not clear whether
there were any in-tree users. So system call emulation is not required
by the above users.
uprobes uses emulate_step and it appears possible to emulate sc
instruction in userspace. Userspace system call emulation is broken and
it's not clear it ever worked well.
The big complication is that userspace takes an interrupt to the kernel
to emulate the instruction. The user->kernel interrupt sets up registers
and interrupt stack frame expecting to return to userspace, then system
call instruction emulation re-directs that stack frame to the kernel,
early in the system call interrupt handler. This means the interrupt
return code takes the kernel->kernel restore path, which does not
restore everything as the system call interrupt handler would expect
coming from userspace. regs->iamr appears to get lost for example,
because the kernel->kernel return does not restore the user iamr.
Accounting such as irqflags tracing and CPU accounting does not get
flipped back to user mode as the system call handler expects, so those
appear to enter the kernel twice without returning to userspace.
These things may be individually fixable with various complication, but
it is a big complexity for unclear real benefit.
Furthermore, it is not possible to single step a system call instruction
since it causes an interrupt. As such, a separate patch disables probing
on system call instructions.
This patch removes system call emulation and disables stepping system
calls.
Signed-off-by: Nicholas Piggin <[email protected]>
[minor commit log edit, and also get rid of '#ifdef CONFIG_PPC64']
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a412e3b3791ed83de18704c8d90f492e7a0049c0.1648648712.git.naveen.n.rao@linux.vnet.ibm.com
|
|
Per the ISA, a Trace interrupt is not generated for:
- [h|u]rfi[d]
- rfscv
- sc, scv, and Trap instructions that trap
- Power-Saving Mode instructions
- other instructions that cause interrupts (other than Trace interrupts)
- the first instructions of any interrupt handler (applies to Branch and Single Step tracing;
CIABR matches may still occur)
- instructions that are emulated by software
Add a helper to check for instructions belonging to the first four
categories above and to reject kprobes, uprobes and xmon breakpoints on
such instructions. We reject probing on instructions belonging to these
categories across all ISA versions and across both BookS and BookE.
For trap instructions, we can't know in advance if they can cause a
trap, and there is no good reason to allow probing on those. Also,
uprobes already refuses to probe trap instructions and kprobes does not
allow probes on trap instructions used for kernel warnings and bugs. As
such, stop allowing any type of probes/breakpoints on trap instruction
across uprobes, kprobes and xmon.
For some of the fp/altivec instructions that can generate an interrupt
and which we emulate in the kernel (altivec assist, for example), we
check and turn off single stepping in emulate_single_step().
Instructions generating a DSI are restarted and single stepping normally
completes once the instruction is completed.
In uprobes, if a single stepped instruction results in a non-fatal
signal to be delivered to the task, such signals are "delayed" until
after the instruction completes. For fatal signals, single stepping is
cancelled and the instruction restarted in-place so that core dump
captures proper addresses.
In kprobes, we do not allow probes on instructions having an extable
entry and we also do not allow probing interrupt vectors.
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/f56ee979d50b8711fae350fc97870f3ca34acd75.1648648712.git.naveen.n.rao@linux.vnet.ibm.com
|
|
Some of the primary opcodes are duplicated. Remove those, and sort the
rest of the primary opcodes to make it easy to read.
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a05edf638a2638d708fc2db0272f6317837b5eab.1648648712.git.naveen.n.rao@linux.vnet.ibm.com
|
|
Various spelling mistakes in comments.
Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
So far the RELACOUNT tag from the ELF header was containing the exact
number of R_PPC_RELATIVE/R_PPC64_RELATIVE relocations. However the LLVM's
recent change [1] make it equal-or-less than the actual number which
makes it useless.
This replaces RELACOUNT in zImage loader with a pair of RELASZ and RELAENT.
The vmlinux relocation code is fixed in commit d79976918852
("powerpc/64: Add UADDR64 relocation support").
To make it more future proof, this walks through the entire .rela.dyn
section instead of assuming that the section is sorter by a relocation
type. Unlike d79976918852, this does not add unaligned UADDR/UADDR64
relocations as we are likely not to see those in practice - the zImage
is small and very arch specific so there is a smaller chance that some
generic feature (such as PRINK_INDEX) triggers unaligned relocations.
[1] https://github.com/llvm/llvm-project/commit/da0e5b885b25cf4
Signed-off-by: Alexey Kardashevskiy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
arch_randomize_brk() is only needed for hash on book3s/64, for other
platforms the one provided by the default mmap layout is good enough.
Move it to hash_utils.c and use randomize_page() like the generic one.
And properly opt out the radix case instead of making an assumption
on mmu_highuser_ssize.
Also change to a 32M range like most other architectures instead of 8M.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/eafa4d18ec8ac7b98dd02b40181e61643707cc7c.1649523076.git.christophe.leroy@csgroup.eu
|
|
Select CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT and
remove arch/powerpc/mm/mmap.c
This change reuses the generic framework added by
commit 67f3977f805b ("arm64, mm: move generic mmap layout
functions to mm") without any functional change.
Comparison between powerpc implementation and the generic one:
- mmap_is_legacy() is identical.
- arch_mmap_rnd() does exactly the same allthough it's written
slightly differently.
- MIN_GAP and MAX_GAP are identical.
- mmap_base() does the same but uses STACK_RND_MASK which provides
the same values as stack_maxrandom_size().
- arch_pick_mmap_layout() is identical.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/518f9def87d3c889d5958103e7463cf45a2f673d.1649523076.git.christophe.leroy@csgroup.eu
|
|
Do like most other architectures and provide randomisation also to
"legacy" memory mappings, by adding the random factor to
mm->mmap_base in arch_pick_mmap_layout().
See commit 8b8addf891de ("x86/mm/32: Enable full randomization on
i386 and X86_32") for all explanations and benefits of that mmap
randomisation.
At the moment, slice_find_area_bottomup() doesn't use mm->mmap_base
but uses the fixed TASK_UNMAPPED_BASE instead.
slice_find_area_bottomup() being used as a fallback to
slice_find_area_topdown(), it can't use mm->mmap_base
directly.
Instead of always using TASK_UNMAPPED_BASE as base address, leave
it to the caller. When called from slice_find_area_topdown()
TASK_UNMAPPED_BASE is used. Otherwise mm->mmap_base is used.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/417fb10dde828534c73a03138b49621d74f4e5be.1649523076.git.christophe.leroy@csgroup.eu
|
|
hugetlb_get_unmapped_area() is now identical to the
generic version if only RADIX is enabled, so move it
to slice.c and let it fallback on the generic one
when HASH MMU is not compiled in.
Do the same with arch_get_unmapped_area() and
arch_get_unmapped_area_topdown().
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/b5d9c124e82889e0cb115c150915a0c0d84eb960.1649523076.git.christophe.leroy@csgroup.eu
|
|
Use the generic version of arch_hugetlb_get_unmapped_area()
which is now available at all time.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/05f77014c619061638ecc52a0a4136eb04cc2799.1649523076.git.christophe.leroy@csgroup.eu
|
|
arch_get_unmapped_area()
Use the generic version of arch_get_unmapped_area() which
is now available at all time instead of its copy
radix__arch_get_unmapped_area()
To allow that for PPC64, add arch_get_mmap_base() and
arch_get_mmap_end() macros.
Instead of setting mm->get_unmapped_area() to either
arch_get_unmapped_area() or generic_get_unmapped_area(),
always set it to arch_get_unmapped_area() and call
generic_get_unmapped_area() from there when radix is enabled.
Do the same with radix__arch_get_unmapped_area_topdown()
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/393be1fa386446443682fdb74544d733f68ef3bb.1649523076.git.christophe.leroy@csgroup.eu
|
|
CONFIG_PPC_MM_SLICES is always selected by hash book3s/64.
CONFIG_PPC_MM_SLICES is never selected by other platforms.
Remove it.
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/dc2cdc204de8978574bf7c02329b6cfc4db0bce7.1649523076.git.christophe.leroy@csgroup.eu
|
|
Since commit 555904d07eef ("powerpc/8xx: MM_SLICE is not needed
anymore") only book3s/64 selects CONFIG_PPC_MM_SLICES.
Move slice.c into mm/book3s64/
Move necessary stuff in asm/book3s/64/slice.h and
remove asm/slice.h
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/4a0d74ef1966a5902b5fd4ac4b513a760a6d675a.1649523076.git.christophe.leroy@csgroup.eu
|
|
vma_mmu_pagesize() is only required for slices,
otherwise there is a generic weak version doing the
exact same thing.
Move it to slice.c
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/1302e000d529c93d07208f1fae90f938e7a551b4.1649523076.git.christophe.leroy@csgroup.eu
|
|
Powerpc needs flags and len to make decision on arch_get_mmap_end().
So add them as parameters to arch_get_mmap_end().
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/b556daabe7d2bdb2361c4d6130280da7c1ba2c14.1649523076.git.christophe.leroy@csgroup.eu
|
|
get_unmapped_area functions
Unlike most architectures, powerpc can only define at runtime
if it is going to use the generic arch_get_unmapped_area() or not.
Today, powerpc has a copy of the generic arch_get_unmapped_area()
because when selection HAVE_ARCH_UNMAPPED_AREA the generic
arch_get_unmapped_area() is not available.
Rename it generic_get_unmapped_area() and make it independent of
HAVE_ARCH_UNMAPPED_AREA.
Do the same for arch_get_unmapped_area_topdown() versus
HAVE_ARCH_UNMAPPED_AREA_TOPDOWN.
Do the same for hugetlb_get_unmapped_area() versus
HAVE_ARCH_HUGETLB_UNMAPPED_AREA.
Signed-off-by: Christophe Leroy <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/77f9d3e592f1c8511df9381aa1c4e754651da4d1.1649523076.git.christophe.leroy@csgroup.eu
|
|
CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
Commit e7142bf5d231 ("arm64, mm: make randomization selected by
generic topdown mmap layout") introduced a default version of
arch_randomize_brk() provided when
CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT is selected.
powerpc could select CONFIG_ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
but needs to provide its own arch_randomize_brk().
In order to allow that, define generic version of arch_randomize_brk()
as a __weak symbol.
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Andrew Morton <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/b222f1ca06c850daf1b2f26afdb46c6dd97d21ba.1649523076.git.christophe.leroy@csgroup.eu
|
|
Merge master into next, to bring in commit 5f24d5a579d1 ("mm, hugetlb:
allow for "high" userspace addresses"), which is needed as a
prerequisite for the series converting powerpc to the generic mmap
logic.
|