aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/mm/fault.c
AgeCommit message (Collapse)AuthorFilesLines
2016-11-14powerpc: Add support for relative exception tablesNicholas Piggin1-1/+1
This halves the exception table size on 64-bit builds, and it allows build-time sorting of exception tables to work on relocated kernels. Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Minor asm fixups and bits to keep the selftests working] Signed-off-by: Michael Ellerman <[email protected]>
2016-09-19powerpc: Use kprobe blacklist for exception handlersNicholas Piggin1-2/+2
Currently we mark the C implementations of some exception handlers as __kprobes. This has the effect of putting them in the ".kprobes.text" section, which separates them from the rest of the text. Instead we can use the blacklist macros to add the symbols to a blacklist which kprobes will check. This allows the linker to move exception handler functions close to callers and avoids trampolines in larger kernels. Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Reword change log a bit] Signed-off-by: Michael Ellerman <[email protected]>
2016-08-22powerpc: migrate exception table users off module.h and onto extable.hPaul Gortmaker1-1/+1
These files were only including module.h for exception table related functions. We've now separated that content out into its own file "extable.h" so now move over to that and avoid all the extra header content in module.h that we don't really need to compile these files. Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: [email protected] Signed-off-by: Paul Gortmaker <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2016-07-26mm: do not pass mm_struct into handle_mm_faultKirill A. Shutemov1-1/+1
We always have vma->vm_mm around. Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-07-06powerpc: Add plain English description for alignment exception oopsesAnton Blanchard1-0/+4
If we take an alignment exception which we cannot fix, the oops currently prints: Unable to handle kernel paging request for unknown fault Lets print something more useful: Unable to handle kernel paging request for unaligned access at address 0xc0000000f77bba8f Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2015-05-19mm/fault, arch: Use pagefault_disable() to check for disabled pagefaults in ↵David Hildenbrand1-4/+5
the handler Introduce faulthandler_disabled() and use it to check for irq context and disabled pagefaults (via pagefault_disable()) in the pagefault handlers. Please note that we keep the in_atomic() checks in place - to detect whether in irq context (in which case preemption is always properly disabled). In contrast, preempt_disable() should never be used to disable pagefaults. With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt counter, and therefore the result of in_atomic() differs. We validate that condition by using might_fault() checks when calling might_sleep(). Therefore, add a comment to faulthandler_disabled(), describing why this is needed. faulthandler_disabled() and pagefault_disable() are defined in linux/uaccess.h, so let's properly add that include to all relevant files. This patch is based on a patch from Thomas Gleixner. Reviewed-and-tested-by: Thomas Gleixner <[email protected]> Signed-off-by: David Hildenbrand <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-02-12ppc64: add paranoid warnings for unexpected DSISR_PROTFAULTMel Gorman1-11/+9
ppc64 should not be depending on DSISR_PROTFAULT and it's unexpected if they are triggered. This patch adds warnings just in case they are being accidentally depended upon. Signed-off-by: Mel Gorman <[email protected]> Acked-by: Aneesh Kumar K.V <[email protected]> Tested-by: Sasha Levin <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Dave Jones <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kirill Shutemov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-02-12mm: convert p[te|md]_numa users to p[te|md]_protnone_numaMel Gorman1-5/+0
Convert existing users of pte_numa and friends to the new helper. Note that the kernel is broken after this patch is applied until the other page table modifiers are also altered. This patch layout is to make review easier. Signed-off-by: Mel Gorman <[email protected]> Acked-by: Linus Torvalds <[email protected]> Acked-by: Aneesh Kumar <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Tested-by: Sasha Levin <[email protected]> Cc: Dave Jones <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kirill Shutemov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Sasha Levin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-01-29vm: add VM_FAULT_SIGSEGV handling supportLinus Torvalds1-0/+2
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a "you should SIGSEGV" error, because the SIGSEGV case was generally handled by the caller - usually the architecture fault handler. That results in lots of duplication - all the architecture fault handlers end up doing very similar "look up vma, check permissions, do retries etc" - but it generally works. However, there are cases where the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV. In particular, when accessing the stack guard page, libsigsegv expects a SIGSEGV. And it usually got one, because the stack growth is handled by that duplicated architecture fault handler. However, when the generic VM layer started propagating the error return from the stack expansion in commit fee7e49d4514 ("mm: propagate error from stack expansion even for guard page"), that now exposed the existing VM_FAULT_SIGBUS result to user space. And user space really expected SIGSEGV, not SIGBUS. To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those duplicate architecture fault handlers about it. They all already have the code to handle SIGSEGV, so it's about just tying that new return value to the existing code, but it's all a bit annoying. This is the mindless minimal patch to do this. A more extensive patch would be to try to gather up the mostly shared fault handling logic into one generic helper routine, and long-term we really should do that cleanup. Just from this patch, you can generally see that most architectures just copied (directly or indirectly) the old x86 way of doing things, but in the meantime that original x86 model has been improved to hold the VM semaphore for shorter times etc and to handle VM_FAULT_RETRY and other "newer" things, so it would be a good idea to bring all those improvements to the generic case and teach other architectures about them too. Reported-and-tested-by: Takashi Iwai <[email protected]> Tested-by: Jan Engelhardt <[email protected]> Acked-by: Heiko Carstens <[email protected]> # "s390 still compiles and boots" Cc: [email protected] Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
2014-11-07powerpc/8xx: Invalidate non present TLB as early as possibleLEROY Christophe1-7/+0
8xx sometimes need to load a invalid/non-present TLBs in it DTLB asm handler. These must be invalidated separaly as linux mm doesn't. Commit 5efab4a02c89c252fb4cce097aafde5f8208dbfe was invalidating them in arch/powerpc/mm/fault.c. This patch does the invalidation earlier in order to free the TLB as soon as possible. This also has the advantage of removing some 8xx specific code from fault.c Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Scott Wood <[email protected]>
2014-10-13Merge branch 'sched-core-for-linus' of ↵Linus Torvalds1-4/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main changes in this cycle were: - Optimized support for Intel "Cluster-on-Die" (CoD) topologies (Dave Hansen) - Various sched/idle refinements for better idle handling (Nicolas Pitre, Daniel Lezcano, Chuansheng Liu, Vincent Guittot) - sched/numa updates and optimizations (Rik van Riel) - sysbench speedup (Vincent Guittot) - capacity calculation cleanups/refactoring (Vincent Guittot) - Various cleanups to thread group iteration (Oleg Nesterov) - Double-rq-lock removal optimization and various refactorings (Kirill Tkhai) - various sched/deadline fixes ... and lots of other changes" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) sched/dl: Use dl_bw_of() under rcu_read_lock_sched() sched/fair: Delete resched_cpu() from idle_balance() sched, time: Fix build error with 64 bit cputime_t on 32 bit systems sched: Improve sysbench performance by fixing spurious active migration sched/x86: Fix up typo in topology detection x86, sched: Add new topology for multi-NUMA-node CPUs sched/rt: Use resched_curr() in task_tick_rt() sched: Use rq->rd in sched_setaffinity() under RCU read lock sched: cleanup: Rename 'out_unlock' to 'out_free_new_mask' sched: Use dl_bw_of() under RCU read lock sched/fair: Remove duplicate code from can_migrate_task() sched, mips, ia64: Remove __ARCH_WANT_UNLOCKED_CTXSW sched: print_rq(): Don't use tasklist_lock sched: normalize_rt_tasks(): Don't use _irqsave for tasklist_lock, use task_rq_lock() sched: Fix the task-group check in tg_has_rt_tasks() sched/fair: Leverage the idle state info when choosing the "idlest" cpu sched: Let the scheduler see CPU idle states sched/deadline: Fix inter- exclusive cpusets migrations sched/deadline: Clear dl_entity params when setscheduling to different class sched/numa: Kill the wrong/dead TASK_DEAD check in task_numa_fault() ...
2014-10-02powerpc: Fill in si_addr_lsb siginfo fieldAnton Blanchard1-0/+8
Fill in the si_addr_lsb siginfo field so the hwpoison code can pass to userspace the length of memory that has been corrupted. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2014-10-02powerpc: Add VM_FAULT_HWPOISON handling to powerpc page fault handlerAnton Blanchard1-6/+11
do_page_fault was missing knowledge of HWPOISON, and we would oops if userspace tried to access a poisoned page: kernel BUG at arch/powerpc/mm/fault.c:180! Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2014-10-02powerpc: Simplify do_sigbusAnton Blanchard1-10/+10
Exit out early for a kernel fault, avoiding indenting of most of the function. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2014-09-19sched: Add helper for task stack page overrun checkingAaron Tomlin1-3/+1
This facility is used in a few places so let's introduce a helper function to improve code readability. Signed-off-by: Aaron Tomlin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Andrew Morton <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Seiji Aguchi <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2014-09-19init/main.c: Give init_task a canaryAaron Tomlin1-2/+1
Tasks get their end of stack set to STACK_END_MAGIC with the aim to catch stack overruns. Currently this feature does not apply to init_task. This patch removes this restriction. Note that a similar patch was posted by Prarit Bhargava some time ago but was never merged: http://marc.info/?l=linux-kernel&m=127144305403241&w=2 Signed-off-by: Aaron Tomlin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Acked-by: Michael Ellerman <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Alex Thorlton <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Daeseok Youn <[email protected]> Cc: David Rientjes <[email protected]> Cc: Fabian Frederick <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Michael Opdenacker <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Seiji Aguchi <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-09-12arch: mm: pass userspace fault flag to generic fault handlerJohannes Weiner1-3/+4
Unlike global OOM handling, memory cgroup code will invoke the OOM killer in any OOM situation because it has no way of telling faults occuring in kernel context - which could be handled more gracefully - from user-triggered faults. Pass a flag that identifies faults originating in user space from the architecture-specific fault handlers to generic code so that memcg OOM handling can be improved. Signed-off-by: Johannes Weiner <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Cc: David Rientjes <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: azurIt <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-09-11powerpc: Fix possible deadlock on page faultAneesh Kumar K.V1-3/+10
stack_grow_into/14082 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<c000000000206d28>] .might_fault+0x78/0xe0 but task is already holding lock: (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&mm->mmap_sem); lock(&mm->mmap_sem); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by stack_grow_into/14082: #0: (&mm->mmap_sem){++++++}, at: [<c0000000007ffd8c>] .do_page_fault+0x24c/0x910 stack backtrace: CPU: 21 PID: 14082 Comm: stack_grow_into Not tainted 3.10.0-10.el7.ppc64.debug #1 Call Trace: [c0000003d396b850] [c000000000016e7c] .show_stack+0x7c/0x1f0 (unreliable) [c0000003d396b920] [c000000000813fc8] .dump_stack+0x28/0x3c [c0000003d396b990] [c000000000124b90] .__lock_acquire+0x1640/0x1800 [c0000003d396bab0] [c00000000012570c] .lock_acquire+0xac/0x250 [c0000003d396bb80] [c000000000206d54] .might_fault+0xa4/0xe0 [c0000003d396bbf0] [c0000000007ffe2c] .do_page_fault+0x2ec/0x910 [c0000003d396be30] [c0000000000092e8] handle_page_fault+0x10/0x30 Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2013-08-14powerpc: Fix little endian lppaca, slb_shadow and dtl_entryAnton Blanchard1-1/+5
The lppaca, slb_shadow and dtl_entry hypervisor structures are big endian, so we have to byte swap them in little endian builds. LE KVM hosts will also need to be fixed but for now add an #error to remind us. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2013-05-14powerpc: Exception hooks for context tracking subsystemLi Zhong1-14/+27
This is the exception hooks for context tracking subsystem, including data access, program check, single step, instruction breakpoint, machine check, alignment, fp unavailable, altivec assist, unknown exception, whose handlers might use RCU. This patch corresponds to [PATCH] x86: Exception hooks for userspace RCU extended QS commit 6ba3c97a38803883c2eee489505796cb0a727122 But after the exception handling moved to generic code, and some changes in following two commits: 56dd9470d7c8734f055da2a6bac553caf4a468eb context_tracking: Move exception handling to generic code 6c1e0256fad84a843d915414e4b5973b7443d48d context_tracking: Restore correct previous context state on exception exit it is able for exception hooks to use the generic code above instead of a redundant arch implementation. Signed-off-by: Li Zhong <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2013-01-10powerpc: Hardware breakpoints rewrite to handle non DABR breakpoint registersMichael Neuling1-2/+2
This is a rewrite so that we don't assume we are using the DABR throughout the code. We now use the arch_hw_breakpoint to store the breakpoint in a generic manner in the thread_struct, rather than storing the raw DABR value. The ptrace GET/SET_DEBUGREG interface currently passes the raw DABR in from userspace. We keep this functionality, so that future changes (like the POWER8 DAWR), will still fake the DABR to userspace. Signed-off-by: Michael Neuling <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2012-12-12mm, oom: remove statically defined arch functions of same nameDavid Rientjes1-15/+12
out_of_memory() is a globally defined function to call the oom killer. x86, sh, and powerpc all use a function of the same name within file scope in their respective fault.c unnecessarily. Inline the functions into the pagefault handlers to clean the code up. Signed-off-by: David Rientjes <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-09readahead: fault retry breaks mmap file read random detectionShaohua Li1-0/+1
.fault now can retry. The retry can break state machine of .fault. In filemap_fault, if page is miss, ra->mmap_miss is increased. In the second try, since the page is in page cache now, ra->mmap_miss is decreased. And these are done in one fault, so we can't detect random mmap file access. Add a new flag to indicate .fault is tried once. In the second try, skip ra->mmap_miss decreasing. The filemap_fault state machine is ok with it. I only tested x86, didn't test other archs, but looks the change for other archs is obvious, but who knows :) Signed-off-by: Shaohua Li <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Wu Fengguang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2012-10-06Merge branch 'next' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Benjamin Herrenschmidt: "Some highlights in addition to the usual batch of fixes: - 64TB address space support for 64-bit processes by Aneesh Kumar - Gavin Shan did a major cleanup & re-organization of our EEH support code (IBM fancy PCI error handling & recovery infrastructure) which paves the way for supporting different platform backends, along with some rework of the PCIe code for the PowerNV platform in order to remove home made resource allocations and instead use the generic code (which is possible after some small improvements to it done by Gavin). - Uprobes support by Ananth N Mavinakayanahalli - A pile of embedded updates from Freescale folks, including new SoC and board supports, more KVM stuff including preparing for 64-bit BookE KVM support, ePAPR 1.1 updates, etc..." Fixup trivial conflicts in drivers/scsi/ipr.c * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (146 commits) powerpc/iommu: Fix multiple issues with IOMMU pools code powerpc: Fix VMX fix for memcpy case driver/mtd:IFC NAND:Initialise internal SRAM before any write powerpc/fsl-pci: use 'Header Type' to identify PCIE mode powerpc/eeh: Don't release eeh_mutex in eeh_phb_pe_get powerpc: Remove tlb batching hack for nighthawk powerpc: Set paca->data_offset = 0 for boot cpu powerpc/perf: Sample only if SIAR-Valid bit is set in P7+ powerpc/fsl-pci: fix warning when CONFIG_SWIOTLB is disabled powerpc/mpc85xx: Update interrupt handling for IFC controller powerpc/85xx: Enable USB support in p1023rds_defconfig powerpc/smp: Do not disable IPI interrupts during suspend powerpc/eeh: Fix crash on converting OF node to edev powerpc/eeh: Lock module while handling EEH event powerpc/kprobe: Don't emulate store when kprobe stwu r1 powerpc/kprobe: Complete kprobe and migrate exception frame powerpc/kprobe: Introduce a new thread flag powerpc: Remove unused __get_user64() and __put_user64() powerpc/eeh: Global mutex to protect PE tree powerpc/eeh: Remove EEH PE for normal PCI hotplug ...
2012-09-21userns: On ppc convert current_uid from a kuid before printing.Eric W. Biederman1-1/+1
Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2012-09-05powerpc: Add trap_nr to thread_structAnanth N Mavinakayanahalli1-0/+1
Add thread_struct.trap_nr and use it to store the last exception the thread experienced. In this patch, we populate the field at various places where we force_sig_info() to the process. This is also used in uprobes to determine if the probed instruction caused an exception. Signed-off-by: Ananth N Mavinakayanahalli <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2012-03-28Disintegrate asm/system.h for PowerPCDavid Howells1-1/+1
Disintegrate asm/system.h for PowerPC. Signed-off-by: David Howells <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> cc: [email protected]
2012-03-09powerpc: Add support for page fault retry and fatal signalsBenjamin Herrenschmidt1-50/+120
Other architectures such as x86 and ARM have been growing new support for features like retrying page faults after dropping the mm semaphore to break contention, or being able to return from a stuck page fault when a SIGKILL is pending. This refactors our implementation of do_page_fault() to move the error handling out of line in a way similar to x86 and adds support for those two features. Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2012-03-09powerpc: Call do_page_fault() with interrupts offBenjamin Herrenschmidt1-0/+11
We currently turn interrupts back to their previous state before calling do_page_fault(). This can be annoying when debugging as a bad fault will potentially have lost some processor state before getting into the debugger. We also end up calling some generic code with interrupts enabled such as notify_page_fault() with interrupts enabled, which could be unexpected. This changes our code to behave more like other architectures, and make the assembly entry code call into do_page_faults() with interrupts disabled. They are conditionally re-enabled from within do_page_fault() in the same spot x86 does it. While there, add the might_sleep() test in the case of a successful trylock of the mmap semaphore, again like x86. Also fix a bug in the existing assembly where r12 (_MSR) could get clobbered by C calls (the DTL accounting in the exception common macro and DISABLE_INTS) in some cases. Signed-off-by: Benjamin Herrenschmidt <[email protected]> --- v2. Add the r12 clobber fix
2011-11-25powerpc/icswx: Simple ACOP fault handlerJimi Xenidis1-0/+17
This patch adds a fault handler that responds to illegal Coprocessor types. Currently all CTs are treated and illegal. There are two ways to report the fault back to the application. If the application used the record form ("icswx.") then the architected "reject" is emulated. If the application did not used the record form ("icswx") then it is selectable by config whether the failure is silent (as architected) or a SIGILL is generated. In all cases pr_warn() is used to log the bad CT. Signed-off-by: Jimi Xenidis <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2011-07-22Merge branch 'perf-core-for-linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits) perf: Remove the nmi parameter from the oprofile_perf backend x86, perf: Make copy_from_user_nmi() a library function perf: Remove perf_event_attr::type check x86, perf: P4 PMU - Fix typos in comments and style cleanup perf tools: Make test use the preset debugfs path perf tools: Add automated tests for events parsing perf tools: De-opt the parse_events function perf script: Fix display of IP address for non-callchain path perf tools: Fix endian conversion reading event attr from file header perf tools: Add missing 'node' alias to the hw_cache[] array perf probe: Support adding probes on offline kernel modules perf probe: Add probed module in front of function perf probe: Introduce debuginfo to encapsulate dwarf information perf-probe: Move dwarf library routines to dwarf-aux.{c, h} perf probe: Remove redundant dwarf functions perf probe: Move strtailcmp to string.c perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END tracing/kprobe: Update symbol reference when loading module tracing/kprobes: Support module init function probing kprobes: Return -ENOENT if probe point doesn't exist ...
2011-07-01perf: Remove the nmi parameter from the swevent and overflow interfacePeter Zijlstra1-3/+3
The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Michael Cree <[email protected]> Cc: Will Deacon <[email protected]> Cc: Deng-Cheng Zhu <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: Eric B Munson <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Paul Mundt <[email protected]> Cc: David S. Miller <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Don Zickus <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-06-29arch/powerpc: use printk_ratelimited instead of printk_ratelimitChristian Dietrich1-5/+5
Since printk_ratelimit() shouldn't be used anymore (see comment in include/linux/printk.h), replace it with printk_ratelimited. Signed-off-by: Christian Dietrich <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-09-02powerpc: Check end of stack canary at oops timeAnton Blanchard1-0/+6
Add a check for the stack canary when we oops, similar to x86. This should make it clear that we overran our stack: Unable to handle kernel paging request for data at address 0x24652f63700ac689 Faulting instruction address: 0xc000000000063d24 Thread overran stack, or stack corrupted Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-05-06powerpc: Invoke oom-killer from page faultBenjamin Herrenschmidt1-10/+4
As explained in commit 1c0fe6e3bd, we want to call the architecture independent oom killer when getting an unexplained OOM from handle_mm_fault, rather than simply killing current. Cc: [email protected] Cc: Benjamin Herrenschmidt <[email protected]> Cc: [email protected] Signed-off-by: Nick Piggin <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2010-04-07powerpc: Disable interrupts for data breakpoint exceptionsK.Prasad1-2/+3
Data address breakpoint exceptions are currently handled along with page-faults which require interrupts to remain in enabled state. Since exception handling for data breakpoints aren't pre-empt safe, we handle them separately. Signed-off-by: K.Prasad <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2009-12-09powerpc/8xx: Invalidate non present TLBsJoakim Tjernlund1-1/+7
8xx sometimes need to load a invalid/non-present TLBs in it DTLB asm handler. These must be invalidated separaly as linux mm don't. Signed-off-by: Joakim Tjernlund <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2009-09-21perf: Do the big rename: Performance Counters -> Performance EventsIngo Molnar1-4/+4
Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f | grep -vE 'oprofile|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N | sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Paul Mackerras <[email protected]> Reviewed-by: Arjan van de Ven <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Howells <[email protected]> Cc: Kyle McMartin <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-21Move FAULT_FLAG_xyz into handle_mm_fault() callersLinus Torvalds1-1/+1
This allows the callers to now pass down the full set of FAULT_FLAG_xyz flags to handle_mm_fault(). All callers have been (mechanically) converted to the new calling convention, there's almost certainly room for architectures to clean up their code and then add FAULT_FLAG_RETRY when that support is added. Signed-off-by: Linus Torvalds <[email protected]>
2009-06-11perf_counter: Standardize event namesPeter Zijlstra1-3/+3
Pure renames only, to PERF_COUNT_HW_* and PERF_COUNT_SW_*. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-04-08perf_counter: allow for data addresses to be recordedPeter Zijlstra1-3/+5
Paul suggested we allow for data addresses to be recorded along with the traditional IPs as power can provide these. For now, only the software pagefault events provide data addresses, but in the future power might as well for some events. x86 doesn't seem capable of providing this atm. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Corey Ashford <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-04-06perf_counter: provide major/minor page fault software eventsPeter Zijlstra1-1/+4
Provide separate sw counters for major and minor page faults. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-04-06perf_counter: provide pagefault software eventsPeter Zijlstra1-0/+3
We use the generic software counter infrastructure to provide page fault events. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-02-11powerpc/mm: Rework I$/D$ coherency (v3)Benjamin Herrenschmidt1-29/+17
This patch reworks the way we do I and D cache coherency on PowerPC. The "old" way was split in 3 different parts depending on the processor type: - Hash with per-page exec support (64-bit and >= POWER4 only) does it at hashing time, by preventing exec on unclean pages and cleaning pages on exec faults. - Everything without per-page exec support (32-bit hash, 8xx, and 64-bit < POWER4) does it for all page going to user space in update_mmu_cache(). - Embedded with per-page exec support does it from do_page_fault() on exec faults, in a way similar to what the hash code does. That leads to confusion, and bugs. For example, the method using update_mmu_cache() is racy on SMP where another processor can see the new PTE and hash it in before we have cleaned the cache, and then blow trying to execute. This is hard to hit but I think it has bitten us in the past. Also, it's inefficient for embedded where we always end up having to do at least one more page fault. This reworks the whole thing by moving the cache sync into two main call sites, though we keep different behaviours depending on the HW capability. The call sites are set_pte_at() which is now made out of line, and ptep_set_access_flags() which joins the former in pgtable.c The base idea for Embedded with per-page exec support, is that we now do the flush at set_pte_at() time when coming from an exec fault, which allows us to avoid the double fault problem completely (we can even improve the situation more by implementing TLB preload in update_mmu_cache() but that's for later). If for some reason we didn't do it there and we try to execute, we'll hit the page fault, which will do a minor fault, which will hit ptep_set_access_flags() to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed, we just make this guys also perform the I/D cache sync for exec faults now. This second path is the catch all for things that weren't cleaned at set_pte_at() time. For cpus without per-pag exec support, we always do the sync at set_pte_at(), thus guaranteeing that when the PTE is visible to other processors, the cache is clean. For the 64-bit hash with per-page exec support case, we keep the old mechanism for now. I'll look into changing it later, once I've reworked a bit how we use _PAGE_EXEC. This is also a first step for adding _PAGE_EXEC support for embedded platforms Signed-off-by: Benjamin Herrenschmidt <[email protected]>
2008-12-28Merge branch 'next' of ↵Linus Torvalds1-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (144 commits) powerpc/44x: Support 16K/64K base page sizes on 44x powerpc: Force memory size to be a multiple of PAGE_SIZE powerpc/32: Wire up the trampoline code for kdump powerpc/32: Add the ability for a classic ppc kernel to be loaded at 32M powerpc/32: Allow __ioremap on RAM addresses for kdump kernel powerpc/32: Setup OF properties for kdump powerpc/32/kdump: Implement crash_setup_regs() using ppc_save_regs() powerpc: Prepare xmon_save_regs for use with kdump powerpc: Remove default kexec/crash_kernel ops assignments powerpc: Make default kexec/crash_kernel ops implicit powerpc: Setup OF properties for ppc32 kexec powerpc/pseries: Fix cpu hotplug powerpc: Fix KVM build on ppc440 powerpc/cell: add QPACE as a separate Cell platform powerpc/cell: fix build breakage with CONFIG_SPUFS disabled powerpc/mpc5200: fix error paths in PSC UART probe function powerpc/mpc5200: add rts/cts handling in PSC UART driver powerpc/mpc5200: Make PSC UART driver update serial errors counters powerpc/mpc5200: Remove obsolete code from mpc5200 MDIO driver powerpc/mpc5200: Add MDMA/UDMA support to MPC5200 ATA driver ... Fix trivial conflict in drivers/char/Makefile as per Paul's directions
2008-12-21powerpc/mm: Add SMP support to no-hash TLB handlingBenjamin Herrenschmidt1-1/+1
This commit moves the whole no-hash TLB handling out of line into a new tlb_nohash.c file, and implements some basic SMP support using IPIs and/or broadcast tlbivax instructions. Note that I'm using local invalidations for D->I cache coherency. At worst, if another processor is trying to execute the same and has the old entry in its TLB, it will just take a fault and re-do the TLB flush locally (it won't re-do the cache flush in any case). Signed-off-by: Benjamin Herrenschmidt <[email protected]> Acked-by: Kumar Gala <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2008-11-19powerpc: Correct page-in counter for CMM with 64k pagesRobert Jennings1-1/+1
Linux will report the number of page-ins so that the hypervisor can better determine partition memory pressure. The hardware page size and the OS page size can be different. In the case where the hardware page size is 4k and the OS is running with 64k pages the code in commit 409001948d9f221c94a61c3ee96de112755fc04d ("powerpc: Update page-in counter for CMM") would under-report the number of pages. This corrects the reporting to the hypervisor by incrementing the page_in count by 1 << PAGE_FACTOR each time. Reported-by: Andrew Theurer <[email protected]> Signed-off-by: Robert Jennings <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2008-11-14CRED: Wrap task credential accesses in the PowerPC archDavid Howells1-1/+1
Wrap access to task credentials so that they can be separated more easily from the task_struct during the introduction of COW creds. Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id(). Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more sense to use RCU directly rather than a convenient wrapper; these will be addressed by later patches. Signed-off-by: David Howells <[email protected]> Reviewed-by: James Morris <[email protected]> Acked-by: Serge Hallyn <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: [email protected] Signed-off-by: James Morris <[email protected]>
2008-11-05powerpc: Update page-in counter for CMMBrian King1-2/+10
A new field has been added to the VPA as a method for the client OS to communicate to firmware the number of page-ins it is performing when running collaborative memory overcommit. The hypervisor will use this information to better determine if a partition is experiencing memory pressure and needs more memory allocated to it. Signed-off-by: Brian King <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2008-07-25powerpc: BookE hardware watchpoint supportLuis Machado1-25/+0
This patch implements support for HW based watchpoint via the DBSR_DAC (Data Address Compare) facility of the BookE processors. It does so by interfacing with the existing DABR breakpoint code and adding the necessary bits and pieces for the new bits to be properly set or cleared Signed-off-by: Luis Machado <[email protected]> Signed-off-by: Benjamin Herrenschmidt <[email protected]>