aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/mm/fault.c
AgeCommit message (Collapse)AuthorFilesLines
2018-08-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds1-3/+4
Merge updates from Andrew Morton: - a few misc things - a few Y2038 fixes - ntfs fixes - arch/sh tweaks - ocfs2 updates - most of MM * emailed patches from Andrew Morton <[email protected]>: (111 commits) mm/hmm.c: remove unused variables align_start and align_end fs/userfaultfd.c: remove redundant pointer uwq mm, vmacache: hash addresses based on pmd mm/list_lru: introduce list_lru_shrink_walk_irq() mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one() mm/list_lru.c: move locking from __list_lru_walk_one() to its caller mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node() mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP mm/sparse: delete old sparse_init and enable new one mm/sparse: add new sparse_init_nid() and sparse_init() mm/sparse: move buffer init/fini to the common place mm/sparse: use the new sparse buffer functions in non-vmemmap mm/sparse: abstract sparse buffer allocations mm/hugetlb.c: don't zero 1GiB bootmem pages mm, page_alloc: double zone's batchsize mm/oom_kill.c: document oom_lock mm/hugetlb: remove gigantic page support for HIGHMEM mm, oom: remove sleep from under oom_lock kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous() mm/cma: remove unsupported gfp_mask parameter from cma_alloc() ...
2018-08-17mm: convert return type of handle_mm_fault() caller to vm_fault_tSouptick Joarder1-3/+4
Use new return type vm_fault_t for fault handler. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t") In this patch all the caller of handle_mm_fault() are changed to return vm_fault_t type. Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC Signed-off-by: Souptick Joarder <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Tony Luck <[email protected]> Cc: Matt Turner <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Russell King <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Richard Kuo <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Michal Simek <[email protected]> Cc: James Hogan <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: James E.J. Bottomley <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: David S. Miller <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Levin, Alexander (Sasha Levin)" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-07-30powerpc: remove unnecessary inclusion of asm/tlbflush.hChristophe Leroy1-1/+0
asm/tlbflush.h is only needed for: - using functions xxx_flush_tlb_xxx() - using MMU_NO_CONTEXT - including asm-generic/pgtable.h Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-07Merge tag 'powerpc-4.18-1' of ↵Linus Torvalds1-29/+47
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - Support for split PMD page table lock on 64-bit Book3S (Power8/9). - Add support for HAVE_RELIABLE_STACKTRACE, so we properly support live patching again. - Add support for patching barrier_nospec in copy_from_user() and syscall entry. - A couple of fixes for our data breakpoints on Book3S. - A series from Nick optimising TLB/mm handling with the Radix MMU. - Numerous small cleanups to squash sparse/gcc warnings from Mathieu Malaterre. - Several series optimising various parts of the 32-bit code from Christophe Leroy. - Removal of support for two old machines, "SBC834xE" and "C2K" ("GEFanuc,C2K"), which is why the diffstat has so many deletions. And many other small improvements & fixes. There's a few out-of-area changes. Some minor ftrace changes OK'ed by Steve, and a fix to our powernv cpuidle driver. Then there's a series touching mm, x86 and fs/proc/task_mmu.c, which cleans up some details around pkey support. It was ack'ed/reviewed by Ingo & Dave and has been in next for several weeks. Thanks to: Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Al Viro, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Balbir Singh, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Dave Hansen, Fabio Estevam, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Haren Myneni, Hari Bathini, Ingo Molnar, Jonathan Neuschäfer, Josh Poimboeuf, Kamalesh Babulal, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Greer, Mathieu Malaterre, Matthew Wilcox, Michael Neuling, Michal Suchanek, Naveen N. Rao, Nicholas Piggin, Nicolai Stange, Olof Johansson, Paul Gortmaker, Paul Mackerras, Peter Rosin, Pridhiviraj Paidipeddi, Ram Pai, Rashmica Gupta, Ravi Bangoria, Russell Currey, Sam Bobroff, Samuel Mendoza-Jonas, Segher Boessenkool, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stewart Smith, Thiago Jung Bauermann, Torsten Duwe, Vaibhav Jain, Wei Yongjun, Wolfram Sang, Yisheng Xie, YueHaibing" * tag 'powerpc-4.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (251 commits) powerpc/64s/radix: Fix missing ptesync in flush_cache_vmap cpuidle: powernv: Fix promotion from snooze if next state disabled powerpc: fix build failure by disabling attribute-alias warning in pci_32 ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait() powerpc-opal: fix spelling mistake "Uniterrupted" -> "Uninterrupted" powerpc: fix spelling mistake: "Usupported" -> "Unsupported" powerpc/pkeys: Detach execute_only key on !PROT_EXEC powerpc/powernv: copy/paste - Mask SO bit in CR powerpc: Remove core support for Marvell mv64x60 hostbridges powerpc/boot: Remove core support for Marvell mv64x60 hostbridges powerpc/boot: Remove support for Marvell mv64x60 i2c controller powerpc/boot: Remove support for Marvell MPSC serial controller powerpc/embedded6xx: Remove C2K board support powerpc/lib: optimise PPC32 memcmp powerpc/lib: optimise 32 bits __clear_user() powerpc/time: inline arch_vtime_task_switch() powerpc/Makefile: set -mcpu=860 flag for the 8xx powerpc: Implement csum_ipv6_magic in assembly powerpc/32: Optimise __csum_partial() powerpc/lib: Adjust .balign inside string functions for PPC32 ...
2018-05-25powerpc/mm: Only read faulting instruction when necessary in do_page_fault()Christophe Leroy1-16/+34
Commit a7a9dcd882a67 ("powerpc: Avoid taking a data miss on every userspace instruction miss") has shown that limiting the read of faulting instruction to likely cases improves performance. This patch goes further into this direction by limiting the read of the faulting instruction to the only cases where it is likely needed. On an MPC885, with the same benchmark app as in the commit referred above, we see a reduction of about 3900 dTLB misses (approx 3%): Before the patch: Performance counter stats for './fault 500' (10 runs): 683033312 cpu-cycles ( +- 0.03% ) 134538 dTLB-load-misses ( +- 0.03% ) 46099 iTLB-load-misses ( +- 0.02% ) 19681 faults ( +- 0.02% ) 5.389747878 seconds time elapsed ( +- 0.06% ) With the patch: Performance counter stats for './fault 500' (10 runs): 682112862 cpu-cycles ( +- 0.03% ) 130619 dTLB-load-misses ( +- 0.03% ) 46073 iTLB-load-misses ( +- 0.05% ) 19681 faults ( +- 0.01% ) 5.381342641 seconds time elapsed ( +- 0.07% ) The proper work of the huge stack expansion was tested with the following app: int main(int argc, char **argv) { char buf[1024 * 1025]; sprintf(buf, "Hello world !\n"); printf(buf); exit(0); } Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> [mpe: Add include of pagemap.h to fix build errors] Signed-off-by: Michael Ellerman <[email protected]>
2018-05-25powerpc/mm: Use instruction symbolic names in store_updates_sp()Christophe Leroy1-13/+13
Use symbolic names defined in asm/ppc-opcode.h instead of hardcoded values. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-04-25signal: Ensure every siginfo we send has all bits initializedEric W. Biederman1-0/+1
Call clear_siginfo to ensure every stack allocated siginfo is properly initialized before being passed to the signal sending functions. Note: It is not safe to depend on C initializers to initialize struct siginfo on the stack because C is allowed to skip holes when initializing a structure. The initialization of struct siginfo in tracehook_report_syscall_exit was moved from the helper user_single_step_siginfo into tracehook_report_syscall_exit itself, to make it clear that the local variable siginfo gets fully initialized. In a few cases the scope of struct siginfo has been reduced to make it clear that siginfo siginfo is not used on other paths in the function in which it is declared. Instances of using memset to initialize siginfo have been replaced with calls clear_siginfo for clarity. Signed-off-by: "Eric W. Biederman" <[email protected]>
2018-04-04powerpc/mm/keys: Update documentation and remove unnecessary checkAneesh Kumar K.V1-16/+12
Adds more code comments. We also remove an unnecessary pkey check after we check for pkey error in this patch. Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-21Merge branch 'fixes' into nextMichael Ellerman1-1/+6
Merge our fixes branch from the 4.15 cycle. Unusually the fixes branch saw some significant features merged, notably the RFI flush patches, so we want the code in next to be tested against that, to avoid any surprises when the two are merged. There's also some other work on the panic handling that was reverted in fixes and we now want to do properly in next, which would conflict. And we also fix a few other minor merge conflicts.
2018-01-20powerpc: Deliver SEGV signal on pkey violationRam Pai1-12/+27
The value of the pkey, whose protection got violated, is made available in si_pkey field of the siginfo structure. Signed-off-by: Ram Pai <[email protected]> Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-20powerpc: Handle exceptions caused by pkey violationRam Pai1-0/+22
Handle Data and Instruction exceptions caused by memory protection-key. The CPU will detect the key fault if the HPTE is already programmed with the key. However if the HPTE is not hashed, a key fault will not be detected by the hardware. The software will detect pkey violation in such a case. Signed-off-by: Ram Pai <[email protected]> Signed-off-by: Thiago Jung Bauermann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-16powerpc: Use the TRAP macro whenever comparing a trap numberBenjamin Herrenschmidt1-1/+1
Trap numbers can have extra bits at the bottom that need to be filtered out. There are a few cases where we don't do that. It's possible that we got lucky but better safe than sorry. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-02powerpc/mm: Fix SEGV on mapped region to return SEGV_ACCERRJohn Sperbeck1-1/+6
The recent refactoring of the powerpc page fault handler in commit c3350602e876 ("powerpc/mm: Make bad_area* helper functions") caused access to protected memory regions to indicate SEGV_MAPERR instead of the traditional SEGV_ACCERR in the si_code field of a user-space signal handler. This can confuse debug libraries that temporarily change the protection of memory regions, and expect to use SEGV_ACCERR as an indication to restore access to a region. This commit restores the previous behavior. The following program exhibits the issue: $ ./repro read || echo "FAILED" $ ./repro write || echo "FAILED" $ ./repro exec || echo "FAILED" #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <signal.h> #include <sys/mman.h> #include <assert.h> static void segv_handler(int n, siginfo_t *info, void *arg) { _exit(info->si_code == SEGV_ACCERR ? 0 : 1); } int main(int argc, char **argv) { void *p = NULL; struct sigaction act = { .sa_sigaction = segv_handler, .sa_flags = SA_SIGINFO, }; assert(argc == 2); p = mmap(NULL, getpagesize(), (strcmp(argv[1], "write") == 0) ? PROT_READ : 0, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); assert(p != MAP_FAILED); assert(sigaction(SIGSEGV, &act, NULL) == 0); if (strcmp(argv[1], "read") == 0) printf("%c", *(unsigned char *)p); else if (strcmp(argv[1], "write") == 0) *(unsigned char *)p = 0; else if (strcmp(argv[1], "exec") == 0) ((void (*)(void))p)(); return 1; /* failed to generate SEGV */ } Fixes: c3350602e876 ("powerpc/mm: Make bad_area* helper functions") Cc: [email protected] # v4.14+ Signed-off-by: John Sperbeck <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> [mpe: Add commit references in change log] Signed-off-by: Michael Ellerman <[email protected]>
2017-08-10powerpc/8xx: Use symbolic names for DSISR bits in DSIChristophe Leroy1-1/+1
Use symbolic names for DSISR bits in DSI Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-10powerpc/8xx: Getting rid of remaining use of CONFIG_8xxChristophe Leroy1-1/+1
Two config options exist to define powerpc MPC8xx: * CONFIG_PPC_8xx * CONFIG_8xx arch/powerpc/platforms/Kconfig.cputype has contained the following comment about CONFIG_8xx item for some years: "# this is temp to handle compat with arch=ppc" arch/powerpc is now the only place with remaining use of CONFIG_8xx: get rid of them. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc: Remove old unused icswx based coprocessor supportBenjamin Herrenschmidt1-15/+0
We have a whole pile of unused code to maintain the ACOP register, allocate coprocessor PIDs and handle ACOP faults. This mechanism was used for the HFI adapter on POWER7 which is dead and gone and whose driver never went upstream. It was used on some A2 core based stuff that also never saw the light of day. Take out all that code. There is still some POWER8 coprocessor code that uses icswx but it's kernel only and thus doesn't use any of that infrastructure. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Cleanup check for stack expansionBenjamin Herrenschmidt1-36/+48
When hitting below a VM_GROWSDOWN vma (typically growing the stack), we check whether it's a valid stack-growing instruction and we check the distance to GPR1. This is largely open coded with lots of comments, so move it out to a helper. While at it, make store_update_sp a boolean. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Don't lose "major" fault indication on retryBenjamin Herrenschmidt1-2/+3
If the first iteration returns VM_FAULT_MAJOR but the second one doesn't, we fail to account the fault as a major fault. This fixes it and brings the code in line with x86. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move page fault VMA access checks to a helperBenjamin Herrenschmidt1-24/+33
Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Set fault flags earlierBenjamin Herrenschmidt1-1/+4
Move out the code that sets FAULT_FLAG_WRITE so the block that check access permissions can be extracted. While at it also set FAULT_FLAG_INSTRUCTION which will be used for protection keys. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Add a bunch of (un)likely annotations to do_page_faultBenjamin Herrenschmidt1-10/+10
Mostly for the failure cases Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move/simplify faulthandler_disabled() and !mm checkBenjamin Herrenschmidt1-14/+13
Do the check before we re-enable interrupts and clean the code up a bit. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move the DSISR_PROTFAULT sanity checkBenjamin Herrenschmidt1-33/+42
This has a page of comment explaining what's going on right in the middle of do_page_fault() which makes things a bit hard to follow. Move it to a helper instead. Also do the test earlier as there's no point waiting until after we found the VMA. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Cosmetic fix to page fault accountingBenjamin Herrenschmidt1-4/+2
No need to break those lines, they aren't that long Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move CMO accounting out of do_page_fault into a helperBenjamin Herrenschmidt1-11/+18
It makes do_page_fault() more readable. No functional change. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Rework mm_fault_error()Benjamin Herrenschmidt1-38/+28
First, handle the normal retry failure in do_page_fault itself, since it's a simple return statement. That allows us to remove the "continue" special return code from mm_fault_error(). Once that's done, we can have an implementation much closer to x86 where we only call mm_fault_error() if VM_FAULT_ERROR is set and directly return. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Make bad_area* helper functionsBenjamin Herrenschmidt1-28/+50
Instead of goto labels, instead call those functions and return. This gets us closer to x86 and allows us to shring do_page_fault() even more. The main difference with x86 is that those function return a value which we then return from do_page_fault(). That value is our return value from do_page_fault() which we use to generate kernel faults. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Fix reporting of kernel execute faultsBenjamin Herrenschmidt1-6/+15
We currently test for is_exec and DSISR_PROTFAULT but that doesn't make sense as this is the wrong error bit to test for an execute permission failure. In fact, we had code that would return early if we had an exec fault in kernel mode so I think that was just dead code anyway. Finally the location of that test is awkward and prevents further simplifications. So instead move that test into a helper along with the existing early test for kernel exec faults and out of range accesses, and put it all in a "bad_kernel_fault()" helper. While at it test the correct error bits. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Simplify returns from __do_page_faultBenjamin Herrenschmidt1-23/+16
Now that we moved the exception state handling to a wrapper, we can just directly return rather than "goto bail" Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move debugger check to notify_page_fault()Benjamin Herrenschmidt1-13/+8
unclutters the main path Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Overhaul handling of bad page faultsBenjamin Herrenschmidt1-18/+14
A bad page fault is when the HW signals an error such as a bad copy/paste, an AMO error, or some other type of error that will not be fixed by updating the PTE. Use a helper page_fault_is_bad() to check for bad page faults thus removing the per-processor family open-coding in __do_page_fault() and trigger a SIGBUS rather than a SIGSEGV which is more appropriate. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move error_code checks for bad faults earlierBenjamin Herrenschmidt1-15/+20
There's no point looking for the VMA etc.. when we already know we are going to fail. This adds some code to set "code" for the si_code but that will be gone in subsequent patches. Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/mm: Move out definition of CPU specific is_write bitsBenjamin Herrenschmidt1-7/+11
Define a common page_fault_is_write() helper and use it Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-03powerpc/6xx: Handle DABR match before calling do_page_faultBenjamin Herrenschmidt1-9/+0
On legacy 6xx 32-bit procesors, we checked for the DABR match bit in DSISR from do_page_fault(), in the middle of a pile of ifdef's because all other CPU types do it in assembly prior to calling do_page_fault. Fix that. Signed-off-by: Benjamin Herrenschmidt <[email protected]> [mpe: Add #ifdef CONFIG_6xx] Signed-off-by: Michael Ellerman <[email protected]>
2017-08-02powerpc/mm: Pre-filter SRR1 bits before do_page_fault()Benjamin Herrenschmidt1-12/+2
By filtering the relevant SRR1 bits in the assembly rather than in do_page_fault() itself, we avoid a conditional branch (since we already come from different path for data and instruction faults). This will allow more simplifications later Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-08-02powerpc/mm: Move exception_enter/exit to a do_page_fault wrapperBenjamin Herrenschmidt1-3/+11
This will allow simplifying the returns from do_page_fault Signed-off-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-02powerpc/mm: The 8xx doesn't call do_page_fault() for breakpointsChristophe Leroy1-1/+1
The 8xx has a dedicated exception for breakpoints, that directly calls do_break() Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-02powerpc/mm: Evaluate user_mode(regs) only once in do_page_fault()Christophe Leroy1-6/+7
Analysis of the assembly code shows that when using user_mode(regs), at least the 'andi.' is redone all the time, and also the 'lwz ,132(r31)' most of the time. With the new form, the 'is_user' is mapped to cr4, then all further use of is_user results in just things like 'beq cr4,218 <do_page_fault+0x218>' Without the patch: 50: 81 1e 00 84 lwz r8,132(r30) 54: 71 09 40 00 andi. r9,r8,16384 58: 40 82 00 0c bne 64 <do_page_fault+0x64> 84: 81 3e 00 84 lwz r9,132(r30) 8c: 71 2a 40 00 andi. r10,r9,16384 90: 41 a2 01 64 beq 1f4 <do_page_fault+0x1f4> d4: 81 3e 00 84 lwz r9,132(r30) dc: 71 28 40 00 andi. r8,r9,16384 e0: 41 82 02 08 beq 2e8 <do_page_fault+0x2e8> 108: 81 3e 00 84 lwz r9,132(r30) 110: 71 28 40 00 andi. r8,r9,16384 118: 41 82 02 28 beq 340 <do_page_fault+0x340> 1e4: 81 3e 00 84 lwz r9,132(r30) 1e8: 71 2a 40 00 andi. r10,r9,16384 1ec: 40 82 01 68 bne 354 <do_page_fault+0x354> 228: 81 3e 00 84 lwz r9,132(r30) 22c: 71 28 40 00 andi. r8,r9,16384 230: 41 82 ff c4 beq 1f4 <do_page_fault+0x1f4> 288: 71 2a 40 00 andi. r10,r9,16384 294: 41 a2 fe 60 beq f4 <do_page_fault+0xf4> 50c: 81 3e 00 84 lwz r9,132(r30) 514: 71 2a 40 00 andi. r10,r9,16384 518: 40 a2 fc e0 bne 1f8 <do_page_fault+0x1f8> 534: 81 3e 00 84 lwz r9,132(r30) 53c: 71 2a 40 00 andi. r10,r9,16384 540: 41 82 fc b8 beq 1f8 <do_page_fault+0x1f8> This patch creates a local var called 'is_user' which contains the result of user_mode(regs) With the patch: 20: 81 03 00 84 lwz r8,132(r3) 48: 55 09 97 fe rlwinm r9,r8,18,31,31 58: 2e 09 00 00 cmpwi cr4,r9,0 5c: 40 92 00 0c bne cr4,68 <do_page_fault+0x68> 88: 41 b2 01 90 beq cr4,218 <do_page_fault+0x218> d4: 40 92 01 d0 bne cr4,2a4 <do_page_fault+0x2a4> 120: 41 b2 00 f8 beq cr4,218 <do_page_fault+0x218> 138: 41 b2 ff a0 beq cr4,d8 <do_page_fault+0xd8> 1d4: 40 92 00 e0 bne cr4,2b4 <do_page_fault+0x2b4> Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-02powerpc/mm: Remove a redundant test in do_page_fault()Christophe Leroy1-1/+1
The result of (trap == 0x400) is already in is_exec. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-02powerpc/mm: Only call store_updates_sp() on stores in do_page_fault()Christophe Leroy1-1/+1
Function store_updates_sp() checks whether the faulting instruction is a store updating r1. Therefore we can limit its calls to store exceptions. This patch is an improvement of commit a7a9dcd882a67 ("powerpc: Avoid taking a data miss on every userspace instruction miss") With the same microbenchmark app, run with 500 as argument, on an MPC885 we get: Before this patch: 152000 DTLB misses After this patch: 147000 DTLB misses Signed-off-by: Christophe Leroy <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-03powerpc: Avoid taking a data miss on every userspace instruction missAnton Blanchard1-1/+1
Early on in do_page_fault() we call store_updates_sp(), regardless of the type of exception. For an instruction miss this doesn't make sense, because we only use this information to detect if a data miss is the result of a stack expansion instruction or not. Worse still, it results in a data miss within every userspace instruction miss handler, because we try and load the very instruction we are about to install a pte for! A simple exec microbenchmark runs 6% faster on POWER8 with this fix: #include <stdlib.h> #include <stdio.h> #include <unistd.h> int main(int argc, char *argv[]) { unsigned long left = atol(argv[1]); char leftstr[16]; if (left-- == 0) return 0; sprintf(leftstr, "%ld", left); execlp(argv[0], argv[0], leftstr, NULL); perror("exec failed\n"); return 0; } Pass the number of iterations on the command line (eg 10000) and time how long it takes to execute. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-21powerpc/mm: Move mmap_sem unlocking in do_page_fault()Laurent Dufour1-15/+4
Since the fault retry is now handled earlier, we can release the mmap_sem lock earlier too and remove later unlocking previously done in mm_fault_error(). Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Laurent Dufour <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-21powerpc/mm: Handle VM_FAULT_RETRY earlierLaurent Dufour1-29/+38
In do_page_fault() if handle_mm_fault() returns VM_FAULT_RETRY, retry the page fault handling before anything else. This would simplify the handling of the mmap_sem lock in this part of the code. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Laurent Dufour <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-21powerpc/mm: Move mmap_sem unlock up from do_sigbusLaurent Dufour1-3/+3
Move mmap_sem releasing in the do_sigbus()'s unique caller : mm_fault_error() No functional changes. Reviewed-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Laurent Dufour <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar1-0/+1
<linux/sched/task_stack.h> We are going to split <linux/sched/task_stack.h> out of <linux/sched.h>, which will have to be picked up from other headers and a couple of .c files. Create a trivial placeholder <linux/sched/task_stack.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-02-22Merge tag 'powerpc-4.11-1' of ↵Linus Torvalds1-10/+33
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Support for direct mapped LPC on POWER9, giving Linux direct access to devices that may be on there such as a UART. - Memory hotplug support for the Power9 Radix MMU. - Add new AUX vectors describing the processor's cache geometry, to be used by glibc. - The ability for a guest to ask the hypervisor to resize the guest's hash table, and in addition support for doing so automatically when memory is hotplugged into/out-of the guest. This allows the hash table to be sized based on the current memory usage of the guest, rather than the maximum possible memory usage. - Implementation of optprobes (kprobe optimisation) for powerpc. In addition there's the topic branch shared with the KVM tree, which includes support for guests to use the Radix MMU on Power9. Thanks to: Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T, Anton Blanchard, Benjamin Herrenschmidt, Chris Packham, Daniel Axtens, Daniel Borkmann, David Gibson, Finn Thain, Gautham R. Shenoy, Gavin Shan, Greg Kurz, Joel Stanley, John Allen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Neuling, Nathan Fontenot, Naveen N. Rao, Nicholas Piggin, Paul Mackerras, Ravi Bangoria, Reza Arbab, Shailendra Singh, Vaibhav Jain, Wei Yongjun" * tag 'powerpc-4.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (129 commits) powerpc/mm/radix: Skip ptesync in pte update helpers powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm powerpc/mm/radix: Update pte update sequence for pte clear case powerpc/mm: Update PROTFAULT handling in the page fault path powerpc/xmon: Fix data-breakpoint powerpc/mm: Fix build break with BOOK3S_64=n and MEMORY_HOTPLUG=y powerpc/mm: Fix build break when CMA=n && SPAPR_TCE_IOMMU=y powerpc/mm: Fix build break with RADIX=y & HUGETLBFS=n powerpc/pseries: Fix typo in parameter description powerpc/kprobes: Remove kprobe_exceptions_notify() kprobes: Introduce weak variant of kprobe_exceptions_notify() powerpc/ftrace: Fix confusing help text for DISABLE_MPROFILE_KERNEL powerpc/powernv: Fix opal_exit tracepoint opcode powerpc: Add a prototype for mcount() so it can be versioned powerpc: Drop GPL from of_node_to_nid() export to match other arches powerpc/kprobes: Optimize kprobe in kretprobe_trampoline() powerpc/kprobes: Implement Optprobes powerpc/kprobes: Fixes for kprobe_lookup_name() on BE powerpc: Add helper to check if offset is within relative branch range powerpc/bpf: Introduce __PPC_SH64() ...
2017-02-15powerpc/mm: Update PROTFAULT handling in the page fault pathAneesh Kumar K.V1-10/+33
With radix, we can get page fault with DSISR_PROTFAULT value set in case of PROT_NONE or autonuma mapping. The PROT_NONE case in handled by the vma check where we consider the access bad. For autonuma we should fall through and fixup the access mask correctly. Without this patch we trigger the WARN_ON() on radix. This code moves that WARN_ON() within a radix_enabled() check. I also moved the WARN_ON() outside the if condition making it apply for all type of faults (exec/write/read). It is also conditionalized for book3s, because BOOK3E can also get a PROTFAULT to handle the D/I cache sync. Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-02-08powerpc/mm: Fix spurrious segfaults on radix with autonumaBenjamin Herrenschmidt1-16/+5
When autonuma (Automatic NUMA balancing) marks a PTE inaccessible it clears all the protection bits but leave the PTE valid. With the Radix MMU, an attempt at executing from such a PTE will take a fault with bit 35 of SRR1 set "SRR1_ISI_N_OR_G". It is thus incorrect to treat all such faults as errors. We should pass them to handle_mm_fault() for autonuma to deal with. The case of pages that are really not executable is handled by the existing test for VM_EXEC further down. That leaves us with catching the kernel attempts at executing user pages. We can catch that earlier, even before we do find_vma. It is never valid on powerpc for the kernel to take an exec fault to begin with. So fold that test with the existing test for the kernel faulting on kernel addresses to bail out early. Fixes: 1d18ad026844 ("powerpc/mm: Detect instruction fetch denied and report") Signed-off-by: Benjamin Herrenschmidt <[email protected]> Reviewed-by: Aneesh Kumar K.V <[email protected]> Acked-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-30powerpc/mm: Fix no execute fault handling on pre-POWER5Balbir Singh1-1/+9
Aneesh/Ben reported that the change to do_page_fault() we made in commit 1d18ad026844 ("powerpc/mm: Detect instruction fetch denied and report") needs to handle the case where CPU_FTR_COHERENT_ICACHE is missing but we have CPU_FTR_NOEXECUTE. In those cases the check added for SRR1_ISI_N_OR_G might trigger a false positive. This patch adds a check for CPU_FTR_COHERENT_ICACHE in addition to the MSR value. Fixes: 1d18ad026844 ("powerpc/mm: Detect instruction fetch denied and report") Reported-by: Aneesh Kumar K.V <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2016-11-25powerpc/mm: Detect instruction fetch denied and reportBalbir Singh1-0/+7
ISA 3 allows for prevention of instruction fetch and execution of user mode pages. If such an error occurs, SRR1 bit 35 reports the error. We catch and report the error in do_page_fault(). Signed-off-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>