path: root/arch/powerpc/mm
Age  Commit message  (Author, files changed, lines -removed/+added)
2017-10-25  powerpc/64s/radix: Fix preempt imbalance in TLB flush  (Nicholas Piggin, 1 file, -0/+2)
Fixes: 424de9c6e3f8 ("powerpc/mm/radix: Avoid flushing the PWC on every flush_tlb_range") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-22  powerpc/mm/radix: Drop unneeded NULL check  (Michael Ellerman, 1 file, -12/+10)
We call these functions with a non-NULL mm or vma. Hence we can skip the NULL check in these functions. We also remove the now-unused function __local_flush_hugetlb_page(). Signed-off-by: Aneesh Kumar K.V <[email protected]> [mpe: Drop the checks with is_vm_hugetlb_page() as noticed by Nick] Signed-off-by: Michael Ellerman <[email protected]>
2017-10-16  powerpc/vphn: Fix numa update end-loop bug  (Michael Bringmann, 1 file, -2/+8)
powerpc/vphn: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. This patch fixes an end-of-updates processing problem observed occasionally in numa_update_cpu_topology(). Signed-off-by: Michael Bringmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-16  powerpc/hotplug: Improve responsiveness of hotplug change  (Michael Bringmann, 1 file, -1/+22)
powerpc/hotplug: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. During hotplug CPU operations, this patch resets the timer on topology update work function to a small value to better ensure that the CPU topology is detected and configured sooner. Signed-off-by: Michael Bringmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-16  powerpc/vphn: Improve recognition of PRRN/VPHN  (Michael Bringmann, 1 file, -4/+4)
powerpc/vphn: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. This patch updates the initialization checks to independently recognize PRRN or VPHN support. Signed-off-by: Michael Bringmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-16  powerpc/vphn: Update CPU topology when VPHN enabled  (Michael Bringmann, 1 file, -1/+21)
powerpc/vphn: On Power systems with shared configurations of CPUs and memory, there are some issues with the association of additional CPUs and memory to nodes when hot-adding resources. This patch corrects the currently broken capability to set the topology for shared CPUs in LPARs. At boot time for shared-CPU LPARs, the topology for each CPU was being set to node zero. Now when numa_update_cpu_topology() is called appropriately, the Virtual Processor Home Node (VPHN) capabilities information provided by the pHyp allows the appropriate node in the shared configuration to be selected for the CPU. Signed-off-by: Michael Bringmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-10  powerpc: Don't call lockdep_assert_cpus_held() from arch_update_cpu_topology()  (Thiago Jung Bauermann, 1 file, -1/+0)
It turns out that not all paths calling arch_update_cpu_topology() hold cpu_hotplug_lock, but that's OK because those paths can't race with any concurrent hotplug events. Warnings were reported with the following trace:

  lockdep_assert_cpus_held
  arch_update_cpu_topology
  sched_init_domains
  sched_init_smp
  kernel_init_freeable
  kernel_init
  ret_from_kernel_thread

Which is safe because it's called early in boot when hotplug is not live yet. And also this trace:

  lockdep_assert_cpus_held
  arch_update_cpu_topology
  partition_sched_domains
  cpuset_update_active_cpus
  sched_cpu_deactivate
  cpuhp_invoke_callback
  cpuhp_down_callbacks
  cpuhp_thread_fun
  smpboot_thread_fn
  kthread
  ret_from_kernel_thread

Which is safe because it's called as part of CPU hotplug, so although we don't hold the CPU hotplug lock, there is another thread driving the CPU hotplug operation which does hold the lock, and there is no race. Thanks to tglx for deciphering it for us. Fixes: 3e401f7a2e51 ("powerpc: Only obtain cpu_hotplug_lock if called by rtasd") Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-10-05  timer: Remove init_timer_deferrable() in favor of timer_setup()  (Kees Cook, 1 file, -7/+5)
This refactors the only users of init_timer_deferrable() to use the new timer_setup() and from_timer(). Removes definition of init_timer_deferrable(). Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: David S. Miller <[email protected]> # for networking parts Acked-by: Sebastian Reichel <[email protected]> # for drivers/hsi parts Cc: [email protected] Cc: Petr Mladek <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Machek <[email protected]> Cc: [email protected] Cc: Chris Metcalf <[email protected]> Cc: [email protected] Cc: "James E.J. Bottomley" <[email protected]> Cc: Wim Van Sebroeck <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Ursula Braun <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Harish Patil <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Manish Chopra <[email protected]> Cc: Len Brown <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: [email protected] Cc: Heiko Carstens <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Julian Wiedmann <[email protected]> Cc: John Stultz <[email protected]> Cc: Mark Gross <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: "Martin K. Petersen" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Cc: Sebastian Reichel <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Stefan Richter <[email protected]> Cc: Michael Reed <[email protected]> Cc: [email protected] Cc: Martin Schwidefsky <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: Sudip Mukherjee <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
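For readers unfamiliar with the new API, here is a minimal before/after sketch of this kind of conversion; the struct and function names are invented for illustration and are not taken from the patch itself:

  #include <linux/timer.h>

  struct foo {
      struct timer_list timer;
  };

  /* New-style callback: gets the timer pointer, recovers its container. */
  static void foo_timer_fn(struct timer_list *t)
  {
      struct foo *f = from_timer(f, t, timer);
      /* ... handle expiry for f ... */
  }

  static void foo_init(struct foo *f)
  {
      /*
       * Old style, removed by this series:
       *   init_timer_deferrable(&f->timer);
       *   f->timer.function = legacy_fn;
       *   f->timer.data = (unsigned long)f;
       */
      timer_setup(&f->timer, foo_timer_fn, TIMER_DEFERRABLE);
  }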
2017-10-04  powerpc/mm: Call flush_tlb_kernel_range with interrupts enabled  (Guenter Roeck, 1 file, -1/+1)
flush_tlb_kernel_range() may call smp_call_function_many() which expects interrupts to be enabled. This results in a traceback:

  WARNING: CPU: 0 PID: 1 at kernel/smp.c:416 smp_call_function_many+0xcc/0x2fc
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-rc1-00009-g0666f56 #1
  task: cf830000 task.stack: cf82e000
  NIP: c00a93c8 LR: c00a9634 CTR: 00000001
  REGS: cf82fde0 TRAP: 0700 Not tainted (4.14.0-rc1-00009-g0666f56)
  MSR: 00021000 <CE,ME> CR: 24000082 XER: 00000000
  GPR00: c00a9634 cf82fe90 cf830000 c050ad3c c0015a54 00000000 00000001 00000001
  GPR08: 00000001 00000000 00000000 cf82e000 24000084 00000000 c0003150 00000000
  GPR16: 00000000 00000000 00000000 00000000 00000000 00000001 00000000 c0510000
  GPR24: 00000000 c0015a54 00000000 c050ad3c c051823c c050ad3c 00000025 00000000
  NIP [c00a93c8] smp_call_function_many+0xcc/0x2fc
  LR [c00a9634] smp_call_function+0x3c/0x50
  Call Trace:
  [cf82fe90] [00000010] 0x10 (unreliable)
  [cf82fed0] [c00a9634] smp_call_function+0x3c/0x50
  [cf82fee0] [c0015d2c] flush_tlb_kernel_range+0x20/0x38
  [cf82fef0] [c001524c] mark_initmem_nx+0x154/0x16c
  [cf82ff20] [c001484c] free_initmem+0x20/0x4c
  [cf82ff30] [c000316c] kernel_init+0x1c/0x108
  [cf82ff40] [c000f3a8] ret_from_kernel_thread+0x5c/0x64
  Instruction dump:
  7c0803a6 7d808120 38210040 4e800020 3d20c052 812981a0 2f890000 40beffac
  3d20c051 8929ac64 2f890000 40beff9c <0fe00000> 4bffff94 7fc3f378 7f64db78

Fixes: 3184cc4b6f6a ("powerpc/mm: Fix kernel RAM protection after freeing ...") Fixes: e611939fc8ec ("powerpc/mm: Ensure change_page_attr() doesn't ...") Cc: Christophe Leroy <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-09-28  cxl: Enable global TLBIs for cxl contexts  (Frederic Barrat, 1 file, -9/+0)
The PSL and nMMU need to see all TLB invalidations for the memory contexts used on the adapter. For the hash memory model, it is done by making all TLBIs global as soon as the cxl driver is in use. For radix, we need something similar, but we can refine and only convert to global the invalidations for contexts actually used by the device. The new mm_context_add_copro() API increments the 'active_cpus' count for the contexts attached to the cxl adapter. As soon as there's more than 1 active cpu, the TLBIs for the context become global. Active cpu count must be decremented when detaching to restore locality if possible and to avoid overflowing the counter. The hash memory model support is somewhat limited, as we can't decrement the active cpus count when mm_context_remove_copro() is called, because we can't flush the TLB for a mm on hash. So TLBIs remain global on hash. Signed-off-by: Frederic Barrat <[email protected]> Fixes: f24be42aab37 ("cxl: Add psl9 specific code") Tested-by: Alistair Popple <[email protected]> [mpe: Fold in updated comment on the barrier from Fred] Signed-off-by: Michael Ellerman <[email protected]>
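As a rough illustration of the new API (the cxl_attach_mm()/cxl_detach_mm() wrappers below are invented for the example, not the real cxl entry points):

  #include <asm/mmu_context.h>

  static void cxl_attach_mm(struct mm_struct *mm)
  {
      /* Count the adapter as an extra user so TLBIs for this mm go global. */
      mm_context_add_copro(mm);
  }

  static void cxl_detach_mm(struct mm_struct *mm)
  {
      /*
       * Drop the reference: on radix this lets invalidations become local
       * again once only one CPU uses the mm; on hash they remain global.
       */
      mm_context_remove_copro(mm);
  }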
2017-09-28  powerpc/mm: Export flush_all_mm()  (Frederic Barrat, 1 file, -2/+4)
With the optimizations introduced by commit a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes"), flush_tlb_mm() no longer flushes the page walk cache (PWC) with radix. This patch introduces flush_all_mm(), which flushes everything, TLB and PWC, for a given mm. Signed-off-by: Frederic Barrat <[email protected]> Reviewed-By: Alistair Popple <[email protected]> [mpe: Add a WARN_ON_ONCE() in the empty hash routines] Signed-off-by: Michael Ellerman <[email protected]>
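A minimal usage sketch, assuming a caller that needs both the TLB and the PWC flushed for a given mm (the surrounding function is illustrative only):

  #include <asm/tlbflush.h>

  static void invalidate_device_translations(struct mm_struct *mm)
  {
      /* flush_tlb_mm() would skip the PWC on radix; flush everything. */
      flush_all_mm(mm);
  }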
2017-09-01  powerpc/mm: Use seq_putc() in two functions  (Markus Elfring, 2 files, -2/+2)
Two single characters (line breaks) should be put into a sequence. Thus use the corresponding function "seq_putc". This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
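The transformation itself is a one-liner; a representative before/after sketch:

  #include <linux/seq_file.h>

  static void show_newline(struct seq_file *m)
  {
      /* before: seq_printf(m, "\n"); */
      seq_putc(m, '\n');  /* emit the single character directly */
  }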
2017-09-01  powerpc: fix location of two EXPORT_SYMBOL  (Christophe Leroy, 1 file, -1/+1)
Commit 9445aa1a3062a ("ppc: move exports to definitions") added EXPORT_SYMBOL() for memset() and flush_hash_pages() in the middle of the functions. This patch moves them to the end of the two functions. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-31  powerpc/mm/radix: Prettify mapped memory range print out  (Michael Ellerman, 1 file, -1/+6)
When we map memory at boot we print out the ranges of real addresses that we mapped and the page size that was used. Currently it's a bit ugly:

  Mapped range 0x0 - 0x2000000000 with 0x40000000
  Mapped range 0x200000000000 - 0x202000000000 with 0x40000000

Pad the addresses so they line up, and print the page size using actual units, eg:

  Mapped 0x0000000000000000-0x0000000001200000 with 64.0 KiB pages
  Mapped 0x0000000001200000-0x0000000040000000 with 2.00 MiB pages
  Mapped 0x0000000040000000-0x0000000100000000 with 1.00 GiB pages

Signed-off-by: Michael Ellerman <[email protected]>
2017-08-31  powerpc/mm/radix: Add pr_fmt() to pgtable-radix.c  (Michael Ellerman, 1 file, -0/+4)
Make the printks look a bit nicer by adding a prefix. Signed-off-by: Michael Ellerman <[email protected]>
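The usual pattern is a pr_fmt() definition placed before the includes; the exact prefix string used in pgtable-radix.c is assumed here, not quoted from the patch:

  /* Must come before any include that pulls in printk.h. */
  #define pr_fmt(fmt) "radix-mmu: " fmt

  #include <linux/kernel.h>

  /* Every pr_info()/pr_warn() in the file now prints with the "radix-mmu: " prefix. */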
2017-08-23  powerpc/mm: Make switch_mm_irqs_off() out of line  (Benjamin Herrenschmidt, 2 files, -1/+100)
It's too big to be inline; there is no reason to keep it that way. Signed-off-by: Benjamin Herrenschmidt <[email protected]> [mpe: Rework to incorporate the comment changes via fixes branch] Signed-off-by: Michael Ellerman <[email protected]>
2017-08-23  powerpc/mm: Optimize detection of thread local mm's  (Benjamin Herrenschmidt, 1 file, -0/+2)
Instead of comparing the whole CPU mask every time, let's keep a counter of how many bits are set in the mask. Thus testing for a local mm only requires testing if that counter is 1 and the current CPU bit is set in the mask. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
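A sketch of the resulting test, assuming the context carries an active_cpus counter as described; this is a simplified rendering, not a verbatim copy of the powerpc helper:

  static inline bool mm_is_thread_local(struct mm_struct *mm)
  {
      /* Exactly one CPU has ever used this mm, and that CPU is us. */
      return atomic_read(&mm->context.active_cpus) == 1 &&
             cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm));
  }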
2017-08-23  powerpc/mm: Use mm_is_thread_local() instead of open-coding  (Benjamin Herrenschmidt, 4 files, -14/+6)
We open-code testing for the mm being local to the current CPU in a few places. Use our existing helper instead. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-23  Merge branch 'fixes' into next  (Michael Ellerman, 4 files, -6/+80)
There's a non-trivial dependency between some commits we want to put in next and the KVM prefetch workaround that went into fixes. So merge fixes into next.
2017-08-17  powerpc/mm/cxl: Add the fault handling cpu to mm cpumask  (Aneesh Kumar K.V, 1 file, -9/+1)
We use the mm cpumask for serializing against lockless page table walks. Anybody doing a lockless page table walk is expected to disable irqs, and only CPUs in the mm cpumask are expected to do the lockless walk. This ensures that a THP split can send IPIs to only the CPUs in the mm cpumask, to make sure there is no parallel lockless page table walk. Add the CAPI fault handling CPU to the mm cpumask so that we can do the lockless page table walk while inserting hash page table entries. Reviewed-by: Frederic Barrat <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-17  powerpc/mm: Don't send IPI to all cpus on THP updates  (Aneesh Kumar K.V, 3 files, -9/+39)
Now that we have made sure the lockless walk of the Linux page table is mostly limited to the current task (current->mm->pgdir), we can update the THP update sequence to only send IPIs to CPUs on which this task has run. This helps in reducing the IPI overload on systems with a large number of CPUs. W.r.t. KVM, even though KVM walks the page table with vcpu->arch.pgdir, it is done only on secondary CPUs, and in that case we have the primary CPU added to the task's mm cpumask. Sending an IPI to the primary will force the secondary to do a VM exit, and hence this mm cpumask usage is safe here. W.r.t. CAPI, we still end up walking the Linux page table with the CAPI context MM. For now the PTE lookup serialization sends an IPI to all CPUs if CAPI is in use. We can further improve this by adding the CAPI interrupt handling CPU to the task's mm cpumask. That will be done in a later patch. Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-17  Merge branch 'topic/ppc-kvm' into next  (Michael Ellerman, 3 files, -14/+21)
Bring in the commit to rename find_linux_pte_or_hugepte() which touches arch and KVM code, and might need to be merged with the kvmppc tree to avoid conflicts.
2017-08-17  powerpc/mm: Rename find_linux_pte_or_hugepte()  (Aneesh Kumar K.V, 3 files, -14/+21)
Add newer helpers to make the function usage simpler. It is always recommended to use find_current_mm_pte() for walking the page table. If we cannot use find_current_mm_pte(), it should be documented why the said usage of __find_linux_pte() is safe against a parallel THP split. For now we have KVM code using __find_linux_pte(). This is because kvm code ends up calling __find_linux_pte() in real mode with MSR_EE=0 but with PACA soft_enabled = 1. We may want to fix that later and make sure we keep the MSR_EE and PACA soft_enabled in sync. When we do that we can switch kvm to use find_linux_pte(). Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
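A hedged sketch of the recommended usage; the header name and exact signature of find_current_mm_pte() are assumptions here, and the wrapper itself is purely illustrative:

  #include <asm/pte-walk.h>

  static pte_t *lookup_current_pte(unsigned long ea, unsigned int *hshift)
  {
      /*
       * The lockless walk is only safe with interrupts disabled, so that
       * a parallel THP split (which sends IPIs) cannot race with us.
       */
      VM_WARN_ON(!irqs_disabled());
      return find_current_mm_pte(current->mm->pgd, ea, NULL, hshift);
  }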
2017-08-16  powerpc/mm/hugetlb: Add support for reserving gigantic huge pages via kernel command line  (Aneesh Kumar K.V, 3 files, -160/+21)
With commit aa888a74977a8 ("hugetlb: support larger than MAX_ORDER") we added support for allocating gigantic hugepages via the kernel command line. Switch the ppc64 arch-specific code to use that. W.r.t. FSL support, we now limit our allocation range using BOOTMEM_ALLOC_ACCESSIBLE. We use the kernel command line to do reservation of hugetlb pages on powernv platforms. On pseries, in hash MMU mode, the supported gigantic huge page size is 16GB and that can only be allocated with hypervisor assist. For pseries the command line option doesn't do the allocation. Instead, pseries does gigantic hugepage allocation based on a hypervisor hint that is specified via the "ibm,expected#pages" property of the memory node. Cc: Scott Wood <[email protected]> Cc: Christophe Leroy <[email protected]> Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
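For illustration only (the size and count below are made up, not taken from the patch), a powernv system would reserve gigantic pages at boot with parameters along these lines:

  hugepagesz=1G hugepages=4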
2017-08-15  powerpc/hugetlb: fix page rights verification in gup_hugepte()  (Christophe Leroy, 1 file, -12/+3)
gup_hugepte() checks if pages are present and readable, and when 'write' is set, also checks if the pages are writable. Initially this was done by checking if _PAGE_PRESENT and _PAGE_READ were set. In addition, _PAGE_WRITE was verified for write accesses. The problem is that we have to handle the three following cases:

  1/ The target defines _PAGE_READ and _PAGE_WRITE
  2/ The target defines _PAGE_RW
  3/ The target defines _PAGE_RO

In case 1/, this is obvious. In case 2/, _PAGE_READ is defined as 0 and _PAGE_WRITE as _PAGE_RW, so it works as well. But in case 3/, _PAGE_RW is defined as 0, which means _PAGE_WRITE is 0 and then the test returns true (page writable) in all cases. A first correction was attempted in commit 6b8cb66a6a7cc ("powerpc: Fix usage of _PAGE_RO in hugepage"), but that fix is wrong: instead of checking that the page is writable when write is requested, it checks that the page is NOT writable when write is NOT requested. This patch adds a new pte_read() helper to check whether a page is readable or not. This avoids handling all possible cases in gup_hugepte(). Then gup_hugepte() is modified to use pte_present(), pte_read() and pte_write() instead of the raw flags. Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
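A simplified sketch of the resulting check, using the generic pte helpers; this condenses the gup_hugepte() logic rather than reproducing the actual hunk:

  static bool hugepte_access_ok(pte_t pte, int write)
  {
      if (!pte_present(pte) || !pte_read(pte))
          return false;           /* not mapped or not readable */
      if (write && !pte_write(pte))
          return false;           /* write requested but page is read-only */
      return true;
  }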
2017-08-15  powerpc/mm: Simplify __set_fixmap()  (Christophe Leroy, 1 file, -15/+0)
__set_fixmap() uses __fix_to_virt() and then does the boundary checks by itself. Instead, we can use fix_to_virt(), which does the verification at build time. For this, we need to use it inline so that GCC can see the real value of idx at build time. In the meantime, we remove the 'fixmaps' variable. This variable is set but has never been used since its introduction in commit 2c419bdeca1d9 ("[POWERPC] Port fixmap from x86 and use for kmap_atomic"). Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-15  powerpc/mm: declare some local functions static  (Christophe Leroy, 1 file, -2/+2)
get_pteptr() and __mapin_ram_chunk() are only used locally, so define them static Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-15  powerpc/mm: Implement STRICT_KERNEL_RWX on PPC32  (Christophe Leroy, 2 files, -0/+30)
This patch implements STRICT_KERNEL_RWX on PPC32. As for CONFIG_DEBUG_PAGEALLOC, it deactivates BAT and LTLB mappings in order to allow page protection setup at the level of each page. As BAT/LTLB mappings are deactivated, there might be a performance impact. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-15  powerpc/mm: Fix kernel RAM protection after freeing unused memory on PPC32  (Christophe Leroy, 1 file, -3/+10)
As seen below, although the init sections have been freed, the associated memory area is still marked as executable in the page tables:

  ~ dmesg
  [ 5.860093] Freeing unused kernel memory: 592K (c0570000 - c0604000)

  ~ cat /sys/kernel/debug/kernel_page_tables
  ---[ Start of kernel VM ]---
  0xc0000000-0xc0497fff    4704K  rw X present dirty accessed shared
  0xc0498000-0xc056ffff     864K  rw   present dirty accessed shared
  0xc0570000-0xc059ffff     192K  rw X present dirty accessed shared
  0xc05a0000-0xc7ffffff  125312K  rw   present dirty accessed shared
  ---[ vmalloc() Area ]---

This patch fixes that. The implementation is done by reusing the change_page_attr() function implemented for CONFIG_DEBUG_PAGEALLOC. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-15  powerpc/mm: Ensure change_page_attr() doesn't invalidate pinned TLBs  (Christophe Leroy, 1 file, -4/+6)
__change_page_attr() uses flush_tlb_page(). flush_tlb_page() uses tlbie instruction, which also invalidates pinned TLBs, which is not what we expect. This patch modifies the implementation to use flush_tlb_kernel_range() instead. This will make use of tlbia which will preserve pinned TLBs. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
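A minimal sketch of the substitution (simplified, not the exact hunk from the patch):

  static void flush_changed_kernel_page(unsigned long addr)
  {
      /* before: flush_tlb_page(...) -> tlbie, which drops pinned TLB entries */
      flush_tlb_kernel_range(addr, addr + PAGE_SIZE);   /* uses tlbia instead */
  }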
2017-08-15  powerpc/8xx: mark init functions with __init  (Christophe Leroy, 1 file, -4/+4)
setup_initial_memory_limit() is only called during init. mmu_patch_cmp_limit() is only called from 8xx_mmu.c Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-15  powerpc/8xx: Make pinning of ITLBs optional  (Christophe Leroy, 1 file, -1/+7)
As stated in a comment in head_8xx.S, today we "Always pin the first 8 MB ITLB to prevent ITLB misses while mucking around with SRR0/SRR1 in asm". This issue has just been cleared by the preceding patch, therefore we can make this pinning optional (on by default) and independent of DATA pinning. This patch also makes pinning of IMMR independent of pinning of DATA. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-15  powerpc/8xx: Ensures RAM mapped with LTLB is seen as block mapped on 8xx.  (Christophe Leroy, 1 file, -2/+11)
On the 8xx, the RAM mapped with LTLBs must be seen as block mapped, just like areas mapped with BATs on standard PPC32. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-10  powerpc/mm: Fix section mismatch warning in early_check_vec5()  (Michael Ellerman, 1 file, -1/+1)
early_check_vec5() is called from and calls __init routines, so should also be __init. Signed-off-by: Michael Ellerman <[email protected]>
2017-08-10  powerpc/8xx: Use symbolic names for DSISR bits in DSI  (Christophe Leroy, 1 file, -1/+1)
Use symbolic names for DSISR bits in DSI Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-10  powerpc/8xx: Getting rid of remaining use of CONFIG_8xx  (Christophe Leroy, 4 files, -8/+8)
Two config options exist to define powerpc MPC8xx: * CONFIG_PPC_8xx * CONFIG_8xx arch/powerpc/platforms/Kconfig.cputype has contained the following comment about CONFIG_8xx item for some years: "# this is temp to handle compat with arch=ppc" arch/powerpc is now the only place with remaining use of CONFIG_8xx: get rid of them. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-10  powerpc/mm: Properly invalidate when setting process table base  (Suraj Jitindar Singh, 1 file, -2/+6)
The host process table base is stored in the partition table by calling the function native_register_process_table(). Currently this just sets the entry in memory and is missing a subsequent cache invalidation instruction. Any update to the partition table should be followed by a cache invalidation instruction specifying invalidation of the caching of any partition table entries (RIC = 2, PRS = 0). We already have a function to update the partition table with the required cache invalidation instructions - mmu_partition_table_set_entry(). Update the native_register_process_table() function to call mmu_partition_table_set_entry(), this ensures all appropriate invalidation will be performed. Signed-off-by: Suraj Jitindar Singh <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> [mpe: Use a local for patb0 to clean it up slightly] Signed-off-by: Michael Ellerman <[email protected]>
2017-08-08  powerpc/mm/hash64: Make vmalloc 56T on hash  (Michael Ellerman, 1 file, -3/+15)
On 64-bit book3s, with the hash MMU, we currently define the kernel virtual space (vmalloc, ioremap etc.), to be 16T in size. This is a leftover from pre v3.7 when our user VM was also 16T. Of that 16T we split it 50/50, with half used for PCI IO and ioremap and the other 8T for vmalloc. We never bothered to make it any bigger because 8T of vmalloc ought to be enough for anybody. But it turns out that's not true, the per cpu allocator wants large amounts of vmalloc space, not to make large allocations, but to allow a large stride between allocations, because we use pcpu_embed_first_chunk(). With a bit of juggling we can increase the entire kernel virtual space to 64T. The only real complication is the check of the address in the SLB miss handler, see the comment in the code. Although we could continue to split virtual space 50/50 as we do now, no one seems to be running out of PCI IO or ioremap space. So instead keep that as 8T, and use the remaining 56T for vmalloc. In future we should be able to increase the kernel virtual space to 512T, the code already supports that, it just needs testing on older hardware. Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]>
2017-08-08  powerpc/mm/slb: Move comment next to the code it's referring to  (Michael Ellerman, 1 file, -3/+4)
There is a comment in slb_allocate() referring to the load of paca->vmalloc_sllp, but it's several lines prior in the assembly. We're about to change this code, and we want to add another comment, so move the comment immediately prior to the instruction it's talking about. Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-08  powerpc/mm/book3s64: Make KERN_IO_START a variable  (Michael Ellerman, 3 files, -0/+4)
Currently KERN_IO_START is defined as:

  #define KERN_IO_START (KERN_VIRT_START + (KERN_VIRT_SIZE >> 1))

Although it looks like a constant, both the components are actually variables, to allow us to have a different value between Radix and Hash with a single kernel. However that still requires both Radix and Hash to place the kernel IO region at the same location relative to the start and end of the kernel virtual region (namely 1/2 way through it), and we'd like to change that. So split KERN_IO_START out into its own variable, and initialise it for Radix and Hash. In the medium term we should be able to reconsolidate this, by doing a more involved rearrangement of the location of the regions. Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Acked-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc: Remove old unused icswx based coprocessor support  (Benjamin Herrenschmidt, 6 files, -482/+0)
We have a whole pile of unused code to maintain the ACOP register, allocate coprocessor PIDs and handle ACOP faults. This mechanism was used for the HFI adapter on POWER7 which is dead and gone and whose driver never went upstream. It was used on some A2 core based stuff that also never saw the light of day. Take out all that code. There is still some POWER8 coprocessor code that uses icswx but it's kernel only and thus doesn't use any of that infrastructure. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Cleanup check for stack expansion  (Benjamin Herrenschmidt, 1 file, -36/+48)
When hitting below a VM_GROWSDOWN vma (typically growing the stack), we check whether it's a valid stack-growing instruction and we check the distance to GPR1. This is largely open coded with lots of comments, so move it out to a helper. While at it, make store_update_sp a boolean. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Don't lose "major" fault indication on retry  (Benjamin Herrenschmidt, 1 file, -2/+3)
If the first iteration returns VM_FAULT_MAJOR but the second one doesn't, we fail to account the fault as a major fault. This fixes it and brings the code in line with x86. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
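A hedged sketch of the idea, with locking and error handling omitted and all names illustrative; it simply remembers VM_FAULT_MAJOR across retries instead of looking only at the last return value:

  #include <linux/mm.h>
  #include <linux/perf_event.h>

  static int fault_with_retry(struct vm_area_struct *vma, unsigned long address,
                              unsigned int flags, struct pt_regs *regs)
  {
      int fault;
      bool major = false;

  retry:
      fault = handle_mm_fault(vma, address, flags);
      major |= fault & VM_FAULT_MAJOR;    /* don't lose it on retry */

      if (fault & VM_FAULT_RETRY) {
          /* real code re-takes mmap_sem here before retrying */
          flags |= FAULT_FLAG_TRIED;
          goto retry;
      }

      if (major)
          perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
      else
          perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);

      return fault;
  }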
2017-08-03  powerpc/mm: Move page fault VMA access checks to a helper  (Benjamin Herrenschmidt, 1 file, -24/+33)
Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Set fault flags earlier  (Benjamin Herrenschmidt, 1 file, -1/+4)
Move out the code that sets FAULT_FLAG_WRITE so the block that checks access permissions can be extracted. While at it, also set FAULT_FLAG_INSTRUCTION, which will be used for protection keys. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Add a bunch of (un)likely annotations to do_page_fault  (Benjamin Herrenschmidt, 1 file, -10/+10)
Mostly for the failure cases. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Move/simplify faulthandler_disabled() and !mm check  (Benjamin Herrenschmidt, 1 file, -14/+13)
Do the check before we re-enable interrupts and clean the code up a bit. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Move the DSISR_PROTFAULT sanity check  (Benjamin Herrenschmidt, 1 file, -33/+42)
This has a page of comments explaining what's going on right in the middle of do_page_fault(), which makes things a bit hard to follow. Move it to a helper instead. Also do the test earlier, as there's no point waiting until after we've found the VMA. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Cosmetic fix to page fault accounting  (Benjamin Herrenschmidt, 1 file, -4/+2)
No need to break those lines; they aren't that long. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03  powerpc/mm: Move CMO accounting out of do_page_fault into a helper  (Benjamin Herrenschmidt, 1 file, -11/+18)
It makes do_page_fault() more readable. No functional change. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>