Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull arch/sh updates from Rich Felker:
"Cleanup, SECCOMP_FILTER support, message printing fixes, and other
changes to arch/sh"
* tag 'sh-for-5.9' of git://git.libc.org/linux-sh: (34 commits)
sh: landisk: Add missing initialization of sh_io_port_base
sh: bring syscall_set_return_value in line with other architectures
sh: Add SECCOMP_FILTER
sh: Rearrange blocks in entry-common.S
sh: switch to copy_thread_tls()
sh: use the generic dma coherent remap allocator
sh: don't allow non-coherent DMA for NOMMU
dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig
sh: unexport register_trapped_io and match_trapped_io_handler
sh: don't include <asm/io_trapped.h> in <asm/io.h>
sh: move the ioremap implementation out of line
sh: move ioremap_fixed details out of <asm/io.h>
sh: remove __KERNEL__ ifdefs from non-UAPI headers
sh: sort the selects for SUPERH alphabetically
sh: remove -Werror from Makefiles
sh: Replace HTTP links with HTTPS ones
arch/sh/configs: remove obsolete CONFIG_SOC_CAMERA*
sh: stacktrace: Remove stacktrace_ops.stack()
sh: machvec: Modernize printing of kernel messages
sh: pci: Modernize printing of kernel messages
...
|
|
This reverts commit 2deebe4d56d638269a4a728086d64de5734b460a.
printk_address() is always used as a continuation of the previous
logging, hence it should not include a log level.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Rich Felker <[email protected]>
|
|
Somewhere along the patch handling path, both the old "printk(KERN_ALERT
....)" and the new "pr_alert(...)" were retained, leading to the
duplicate printing of "PC:".
Drop the old one.
Fixes: eaabf98b0932a540 ("sh: fault: modernize printing of kernel messages")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Rich Felker <[email protected]>
|
|
Use the general page fault accounting by passing regs into
handle_mm_fault(). It naturally solve the issue of multiple page fault
accounting when page fault retry happened.
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm: Page fault accounting cleanups", v5.
This is v5 of the pf accounting cleanup series. It originates from Gerald
Schaefer's report on an issue a week ago regarding to incorrect page fault
accountings for retried page fault after commit 4064b9827063 ("mm: allow
VM_FAULT_RETRY for multiple times"):
https://lore.kernel.org/lkml/20200610174811.44b94525@thinkpad/
What this series did:
- Correct page fault accounting: we do accounting for a page fault
(no matter whether it's from #PF handling, or gup, or anything else)
only with the one that completed the fault. For example, page fault
retries should not be counted in page fault counters. Same to the
perf events.
- Unify definition of PERF_COUNT_SW_PAGE_FAULTS: currently this perf
event is used in an adhoc way across different archs.
Case (1): for many archs it's done at the entry of a page fault
handler, so that it will also cover e.g. errornous faults.
Case (2): for some other archs, it is only accounted when the page
fault is resolved successfully.
Case (3): there're still quite some archs that have not enabled
this perf event.
Since this series will touch merely all the archs, we unify this
perf event to always follow case (1), which is the one that makes most
sense. And since we moved the accounting into handle_mm_fault, the
other two MAJ/MIN perf events are well taken care of naturally.
- Unify definition of "major faults": the definition of "major
fault" is slightly changed when used in accounting (not
VM_FAULT_MAJOR). More information in patch 1.
- Always account the page fault onto the one that triggered the page
fault. This does not matter much for #PF handlings, but mostly for
gup. More information on this in patch 25.
Patchset layout:
Patch 1: Introduced the accounting in handle_mm_fault(), not enabled.
Patch 2-23: Enable the new accounting for arch #PF handlers one by one.
Patch 24: Enable the new accounting for the rest outliers (gup, iommu, etc.)
Patch 25: Cleanup GUP task_struct pointer since it's not needed any more
This patch (of 25):
This is a preparation patch to move page fault accountings into the
general code in handle_mm_fault(). This includes both the per task
flt_maj/flt_min counters, and the major/minor page fault perf events. To
do this, the pt_regs pointer is passed into handle_mm_fault().
PERF_COUNT_SW_PAGE_FAULTS should still be kept in per-arch page fault
handlers.
So far, all the pt_regs pointer that passed into handle_mm_fault() is
NULL, which means this patch should have no intented functional change.
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Convert comments that reference mmap_sem to reference mmap_lock instead.
[[email protected]: fix up linux-next leftovers]
[[email protected]: s/lockaphore/lock/, per Vlastimil]
[[email protected]: more linux-next fixups, per Michel]
Signed-off-by: Michel Lespinasse <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ying Han <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Convert comments that reference old mmap_sem APIs to reference
corresponding new mmap locking APIs instead.
Signed-off-by: Michel Lespinasse <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ying Han <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.
The change is generated using coccinelle with the following rule:
// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .
@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)
Signed-off-by: Michel Lespinasse <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Reviewed-by: Laurent Dufour <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ying Han <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently, the log-level of show_stack() depends on a platform
realization. It creates situations where the headers are printed with
lower log level or higher than the stacktrace (depending on a platform or
user).
Furthermore, it forces the logic decision from user to an architecture
side. In result, some users as sysrq/kdb/etc are doing tricks with
temporary rising console_loglevel while printing their messages. And in
result it not only may print unwanted messages from other CPUs, but also
omit printing at all in the unlucky case where the printk() was deferred.
Introducing log-level parameter and KERN_UNSUPPRESSED [1] seems an easier
approach than introducing more printk buffers. Also, it will consolidate
printings with headers.
Add log level argument to printk_address() as a preparation to introduce
show_stack_loglvl().
As a good side-effect show_fault_oops() now prints the address with
KERN_EMREG as the rest of output, making sure there won't be situation
where "PC: " is printed without actual address.
[1]: https://lore.kernel.org/lkml/[email protected]/T/#u
Signed-off-by: Dmitry Safonov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Rich Felker <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate and remove usage of __ARCH_USE_5LEVEL_HACK.
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: James Morse <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Julien Thierry <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
- Convert from printk() to pr_*(),
- Add missing continuations,
- Use "%llx" to format u64,
- Join multiple prints in show_fault_oops() into a single print.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: James Morse <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Julien Thierry <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Suzuki K Poulose <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Lets move vma_is_accessible() helper to include/linux/mm.h which makes it
available for general use. While here, this replaces all remaining open
encodings for VMA access check with vma_is_accessible().
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Acked-by: Guo Ren <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The idea comes from a discussion between Linus and Andrea [1].
Before this patch we only allow a page fault to retry once. We achieved
this by clearing the FAULT_FLAG_ALLOW_RETRY flag when doing
handle_mm_fault() the second time. This was majorly used to avoid
unexpected starvation of the system by looping over forever to handle the
page fault on a single page. However that should hardly happen, and after
all for each code path to return a VM_FAULT_RETRY we'll first wait for a
condition (during which time we should possibly yield the cpu) to happen
before VM_FAULT_RETRY is really returned.
This patch removes the restriction by keeping the FAULT_FLAG_ALLOW_RETRY
flag when we receive VM_FAULT_RETRY. It means that the page fault handler
now can retry the page fault for multiple times if necessary without the
need to generate another page fault event. Meanwhile we still keep the
FAULT_FLAG_TRIED flag so page fault handler can still identify whether a
page fault is the first attempt or not.
Then we'll have these combinations of fault flags (only considering
ALLOW_RETRY flag and TRIED flag):
- ALLOW_RETRY and !TRIED: this means the page fault allows to
retry, and this is the first try
- ALLOW_RETRY and TRIED: this means the page fault allows to
retry, and this is not the first try
- !ALLOW_RETRY and !TRIED: this means the page fault does not allow
to retry at all
- !ALLOW_RETRY and TRIED: this is forbidden and should never be used
In existing code we have multiple places that has taken special care of
the first condition above by checking against (fault_flags &
FAULT_FLAG_ALLOW_RETRY). This patch introduces a simple helper to detect
the first retry of a page fault by checking against both (fault_flags &
FAULT_FLAG_ALLOW_RETRY) and !(fault_flag & FAULT_FLAG_TRIED) because now
even the 2nd try will have the ALLOW_RETRY set, then use that helper in
all existing special paths. One example is in __lock_page_or_retry(), now
we'll drop the mmap_sem only in the first attempt of page fault and we'll
keep it in follow up retries, so old locking behavior will be retained.
This will be a nice enhancement for current code [2] at the same time a
supporting material for the future userfaultfd-writeprotect work, since in
that work there will always be an explicit userfault writeprotect retry
for protected pages, and if that cannot resolve the page fault (e.g., when
userfaultfd-writeprotect is used in conjunction with swapped pages) then
we'll possibly need a 3rd retry of the page fault. It might also benefit
other potential users who will have similar requirement like userfault
write-protection.
GUP code is not touched yet and will be covered in follow up patch.
Please read the thread below for more information.
[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/lkml/[email protected]/
Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Andrea Arcangeli <[email protected]>
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Brian Geffon <[email protected]>
Cc: Bobby Powers <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Denis Plotnikov <[email protected]>
Cc: "Dr . David Alan Gilbert" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Martin Cracauer <[email protected]>
Cc: Marty McFadden <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Maya Gokhale <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Although there're tons of arch-specific page fault handlers, most of them
are still sharing the same initial value of the page fault flags. Say,
merely all of the page fault handlers would allow the fault to be retried,
and they also allow the fault to respond to SIGKILL.
Let's define a default value for the fault flags to replace those initial
page fault flags that were copied over. With this, it'll be far easier to
introduce new fault flag that can be used by all the architectures instead
of touching all the archs.
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Brian Geffon <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Bobby Powers <[email protected]>
Cc: Denis Plotnikov <[email protected]>
Cc: "Dr . David Alan Gilbert" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Martin Cracauer <[email protected]>
Cc: Marty McFadden <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Maya Gokhale <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Let SH to use the new fault_signal_pending() helper. Here we'll need to
move the up_read() out because that's actually needed as long as !RETRY
cases. At the meantime we can drop all the rest of up_read()s now (which
seems to be cleaner).
Signed-off-by: Peter Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Brian Geffon <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Bobby Powers <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Denis Plotnikov <[email protected]>
Cc: "Dr . David Alan Gilbert" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Martin Cracauer <[email protected]>
Cc: Marty McFadden <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Maya Gokhale <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Architectures which support kprobes have very similar boilerplate around
calling kprobe_fault_handler(). Use a helper function in kprobes.h to
unify them, based on the x86 code.
This changes the behaviour for other architectures when preemption is
enabled. Previously, they would have disabled preemption while calling
the kprobe handler. However, preemption would be disabled if this fault
was due to a kprobe, so we know the fault was not due to a kprobe
handler and can simply return failure.
This behaviour was introduced in commit a980c0ef9f6d ("x86/kprobes:
Refactor kprobes_fault() like kprobe_exceptions_notify()")
[[email protected]: export kprobe_fault_handler()]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Anshuman Khandual <[email protected]>
Reviewed-by: Dave Hansen <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
As synchronous exceptions really only make sense against the current
task (otherwise how are you synchronous) remove the task parameter
from from force_sig_fault to make it explicit that is what is going
on.
The two known exceptions that deliver a synchronous exception to a
stopped ptraced task have already been changed to
force_sig_fault_to_task.
The callers have been changed with the following emacs regular expression
(with obvious variations on the architectures that take more arguments)
to avoid typos:
force_sig_fault[(]\([^,]+\)[,]\([^,]+\)[,]\([^,]+\)[,]\W+current[)]
->
force_sig_fault(\1,\2,\3)
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
The force_sig_info_fault function is always called with tsk == current.
Make that explicit by removing the tsk parameter.
This also makes it clear that the sh force_sig_info_fault passes
current into force_sig_fault.
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")
In this patch all the caller of handle_mm_fault() are changed to return
vm_fault_t type.
Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Russell King <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Richard Kuo <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: James E.J. Bottomley <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: "Levin, Alexander (Sasha Levin)" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Filling in struct siginfo before calling force_sig_info a tedious and
error prone process, where once in a great while the wrong fields
are filled out, and siginfo has been inconsistently cleared.
Simplify this process by using the helper force_sig_fault. Which
takes as a parameters all of the information it needs, ensures
all of the fiddly bits of filling in struct siginfo are done properly
and then calls force_sig_info.
In short about a 5 line reduction in code for every time force_sig_info
is called, which makes the calling function clearer.
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: [email protected]
Acked-by: Rich Felker <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
Call clear_siginfo to ensure every stack allocated siginfo is properly
initialized before being passed to the signal sending functions.
Note: It is not safe to depend on C initializers to initialize struct
siginfo on the stack because C is allowed to skip holes when
initializing a structure.
The initialization of struct siginfo in tracehook_report_syscall_exit
was moved from the helper user_single_step_siginfo into
tracehook_report_syscall_exit itself, to make it clear that the local
variable siginfo gets fully initialized.
In a few cases the scope of struct siginfo has been reduced to make it
clear that siginfo siginfo is not used on other paths in the function
in which it is declared.
Instances of using memset to initialize siginfo have been replaced
with calls clear_siginfo for clarity.
Signed-off-by: "Eric W. Biederman" <[email protected]>
|
|
<linux/sched/signal.h>
We are going to split <linux/sched/signal.h> out of <linux/sched.h>, which
will have to be picked up from other headers and a couple of .c files.
Create a trivial placeholder <linux/sched/signal.h> file that just
maps to <linux/sched.h> to make this patch obviously correct and
bisectable.
Include the new header in the files that are going to need it.
Acked-by: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We always have vma->vm_mm around.
Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
the handler
Introduce faulthandler_disabled() and use it to check for irq context and
disabled pagefaults (via pagefault_disable()) in the pagefault handlers.
Please note that we keep the in_atomic() checks in place - to detect
whether in irq context (in which case preemption is always properly
disabled).
In contrast, preempt_disable() should never be used to disable pagefaults.
With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt
counter, and therefore the result of in_atomic() differs.
We validate that condition by using might_fault() checks when calling
might_sleep().
Therefore, add a comment to faulthandler_disabled(), describing why this
is needed.
faulthandler_disabled() and pagefault_disable() are defined in
linux/uaccess.h, so let's properly add that include to all relevant files.
This patch is based on a patch from Thomas Gleixner.
Reviewed-and-tested-by: Thomas Gleixner <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.
That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works. However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV. And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.
However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space. And user space really
expected SIGSEGV, not SIGBUS.
To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it. They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.
This is the mindless minimal patch to do this. A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.
Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Reported-and-tested-by: Takashi Iwai <[email protected]>
Tested-by: Jan Engelhardt <[email protected]>
Acked-by: Heiko Carstens <[email protected]> # "s390 still compiles and boots"
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: azurIt <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
out_of_memory() is a globally defined function to call the oom killer.
x86, sh, and powerpc all use a function of the same name within file scope
in their respective fault.c unnecessarily. Inline the functions into the
pagefault handlers to clean the code up.
Signed-off-by: David Rientjes <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mundt <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Reviewed-by: KAMEZAWA Hiroyuki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
.fault now can retry. The retry can break state machine of .fault. In
filemap_fault, if page is miss, ra->mmap_miss is increased. In the second
try, since the page is in page cache now, ra->mmap_miss is decreased. And
these are done in one fault, so we can't detect random mmap file access.
Add a new flag to indicate .fault is tried once. In the second try, skip
ra->mmap_miss decreasing. The filemap_fault state machine is ok with it.
I only tested x86, didn't test other archs, but looks the change for other
archs is obvious, but who knows :)
Signed-off-by: Shaohua Li <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Wu Fengguang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Presently the oops code looks for the pgd either from the mm context or
the cached TTB value. There are presently cases where the TTB can be
unset or otherwise cleared by hardware, which we weren't handling,
resulting in recursive faults on the NULL pgd. In these cases we can
simply reload from swapper_pg_dir and continue on as normal.
Cc: [email protected]
Signed-off-by: Paul Mundt <[email protected]>
|
|
With the previous attempt reverted this switches to conditionalizing the
end address. Nominally VMALLOC_END, but extended for P3_ADDR_MAX in the
store queue case.
Signed-off-by: Paul Mundt <[email protected]>
|
|
This reverts commit 20e7c297efeca0861adcca073a0d283da659834b.
With store queues enabled the area above P4SEG has special properties
from the MMU's point of view, which was causing fixmap failure. We'll
have to do something else to satisfy the vmalloc range check.
Signed-off-by: Paul Mundt <[email protected]>
|
|
This moves the now generic _32 page fault handling code to a shared place
and adapts the _64 implementation to make use of it.
Signed-off-by: Paul Mundt <[email protected]>
|
|
Signed-off-by: Paul Mundt <[email protected]>
|
|
With the refactored update_mmu_cache() introduced in older kernels,
there's no longer any need to take the page_table_lock in this path,
so simply drop it completely.
Without this, performance degradation is seen on SMP on heavily
threaded workloads that don't use the split ptlock, and ultimately
we have no reason to contend for the lock in the first place.
Signed-off-by: Paul Mundt <[email protected]>
|
|
The __do_page_fault() fast-path contains a UTLB flush in order to
force an ITLB reload, this isn't needed in practice as the ITLB is
auto-reloaded from the UTLB anyways, which is already displaced by
the manual 'ldtlb' in the update_mmu_cache() path.
This provides a measurable speed up in the TLB miss fast-path.
Signed-off-by: Paul Mundt <[email protected]>
|
|
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().
A cgroup init has it's tsk->pid == 1.
A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().
2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.
[[email protected]: fix comment]
[[email protected]: Use is_global_init() in arch/m32r/mm/fault.c]
[[email protected]: kernel/pid.c: remove unused exports]
[[email protected]: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <[email protected]>
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Acked-by: Pavel Emelianov <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Cedric Le Goater <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Herbert Poetzel <[email protected]>
Cc: Kirill Korotaev <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We have had complaints where a threaded application is left in a bad state
after one of it's threads is killed when we hit a VM: out_of_memory
condition.
Killing just one of the process threads can leave the application in a bad
state, whereas killing the entire process group would allow for the
application to restart, or be otherwise handled, and makes it very obvious
that something has gone wrong.
This change allows the entire process group to be taken down, rather
than just the one thread.
Signed-off-by: Will Schmidt <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Russell King <[email protected]>
Cc: Ian Molton <[email protected]>
Cc: Haavard Skinnemoen <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: David Howells <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Roman Zippel <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Kyle McMartin <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Kazumoto Kojima <[email protected]>
Cc: Richard Curnow <[email protected]>
Cc: William Lee Irwin III <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Chris Zankel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When reading the cached TTB value and extracting the pgd, we
accidentally applied a __va() to it and bumped it off in to bogus
space which ended up causing multiple faults in the error path.
Fix it up so unhandled faults don't do strange and highly unorthodox
things when oopsing.
Signed-off-by: Paul Mundt <[email protected]>
|
|
This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer. This requires requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).
[[email protected]: fix alpha build]
[[email protected]: fix s390 build]
[[email protected]: fix sparc build]
[[email protected]: fix sparc64 build]
[[email protected]: fix ia64 build]
Signed-off-by: Nick Piggin <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Russell King <[email protected]>
Cc: Ian Molton <[email protected]>
Cc: Bryan Wu <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: David Howells <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Roman Zippel <[email protected]>
Cc: Greg Ungerer <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Kazumoto Kojima <[email protected]>
Cc: Richard Curnow <[email protected]>
Cc: William Lee Irwin III <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jeff Dike <[email protected]>
Cc: Paolo 'Blaisorblade' Giarrusso <[email protected]>
Cc: Miles Bader <[email protected]>
Cc: Chris Zankel <[email protected]>
Acked-by: Kyle McMartin <[email protected]>
Acked-by: Haavard Skinnemoen <[email protected]>
Acked-by: Ralf Baechle <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Only print out pgd/pte data in the oops path if oops_may_print()
holds true. Follows the i386 implementation.
Signed-off-by: Paul Mundt <[email protected]>
|
|
Just at the time you added them on sh we're removing them from other
architectures. As there's no user yet this patch just removes them
completely. Once you actually have a kprobes patch it should follow
the direct call to kprobes_fault_handler model that powerpc, s390 and
sparc64 employ in 2.6.22-rc1 and that I'm updating other architectures
to.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Paul Mundt <[email protected]>
|
|
GCC doesn't seem to be able to figure this one out for
itself, so just shut it up..
CC arch/sh/mm/fault.o
arch/sh/mm/fault.c: In function '__do_page_fault':
arch/sh/mm/fault.c:288: warning: 'ptl' may be used uninitialized in this function
Signed-off-by: Paul Mundt <[email protected]>
|
|
This went in immediately after SH added the die chain notifiers,
so move over to that instead..
Signed-off-by: Paul Mundt <[email protected]>
|
|
Add the atomic die chains in, kprobes needs these.
Signed-off-by: Paul Mundt <[email protected]>
|
|
This ended up causing problems for older parts (particularly ones
using PTEA). Revert this for now, it can be added back in once it's
had some more testing.
Signed-off-by: Paul Mundt <[email protected]>
|
|
Wire up all of the essentials for lockdep..
Signed-off-by: Paul Mundt <[email protected]>
|
|
Previously this was using a static pgd shift in the reporting
code, simply flip this to PGDIR_SHIFT which does the right
thing depending on varying PTE magnitudes on the SH-X2 MMU.
While we're at it, and since it's been recently added, use
get_TTB() for fetching the TTB, rather than the open coded
instructions.
Signed-off-by: Paul Mundt <[email protected]>
|
|
Handle simple TLB miss faults which can be resolved completely
from the page table in assembler.
Signed-off-by: Stuart Menefy <[email protected]>
Signed-off-by: Paul Mundt <[email protected]>
|
|
Remove extra bits from the pmd structure and store a kernel logical
address rather than a physical address. This allows it to be directly
dereferenced. Another piece of wierdness inherited from x86.
Signed-off-by: Stuart Menefy <[email protected]>
Signed-off-by: Paul Mundt <[email protected]>
|
|
Remove the previous saving of fault codes into the thread_struct
as they are never used, and appeared to be inherited from x86.
Signed-off-by: Stuart Menefy <[email protected]>
Signed-off-by: Paul Mundt <[email protected]>
|