Age | Commit message (Collapse) | Author | Files | Lines |
|
the handler
Introduce faulthandler_disabled() and use it to check for irq context and
disabled pagefaults (via pagefault_disable()) in the pagefault handlers.
Please note that we keep the in_atomic() checks in place - to detect
whether in irq context (in which case preemption is always properly
disabled).
In contrast, preempt_disable() should never be used to disable pagefaults.
With !CONFIG_PREEMPT_COUNT, preempt_disable() doesn't modify the preempt
counter, and therefore the result of in_atomic() differs.
We validate that condition by using might_fault() checks when calling
might_sleep().
Therefore, add a comment to faulthandler_disabled(), describing why this
is needed.
faulthandler_disabled() and pagefault_disable() are defined in
linux/uaccess.h, so let's properly add that include to all relevant files.
This patch is based on a patch from Thomas Gleixner.
Reviewed-and-tested-by: Thomas Gleixner <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.
That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works. However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV. And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.
However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space. And user space really
expected SIGSEGV, not SIGBUS.
To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it. They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.
This is the mindless minimal patch to do this. A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.
Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Reported-and-tested-by: Takashi Iwai <[email protected]>
Tested-by: Jan Engelhardt <[email protected]>
Acked-by: Heiko Carstens <[email protected]> # "s390 still compiles and boots"
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: KAMEZAWA Hiroyuki <[email protected]>
Cc: azurIt <[email protected]>
Cc: KOSAKI Motohiro <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Disintegrate asm/system.h for M32R.
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
|
|
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <[email protected]>
|
|
As explained in commit 1c0fe6e3bd ("mm: invoke oom-killer from page
fault") , we want to call the architecture independent oom killer when
getting an unexplained OOM from handle_mm_fault, rather than simply
killing current.
Signed-off-by: Nick Piggin <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
On VIVT ARM, when we have multiple shared mappings of the same file
in the same MM, we need to ensure that we have coherency across all
copies. We do this via make_coherent() by making the pages
uncacheable.
This used to work fine, until we allowed highmem with highpte - we
now have a page table which is mapped as required, and is not available
for modification via update_mmu_cache().
Ralf Beache suggested getting rid of the PTE value passed to
update_mmu_cache():
On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
to construct a pointer to the pte again. Passing a pte_t * is much
more elegant. Maybe we might even replace the pte argument with the
pte_t?
Ben Herrenschmidt would also like the pte pointer for PowerPC:
Passing the ptep in there is exactly what I want. I want that
-instead- of the PTE value, because I have issue on some ppc cases,
for I$/D$ coherency, where set_pte_at() may decide to mask out the
_PAGE_EXEC.
So, pass in the mapped page table pointer into update_mmu_cache(), and
remove the PTE value, updating all implementations and call sites to
suit.
Includes a fix from Stephen Rothwell:
sparc: fix fallout from update_mmu_cache API change
Signed-off-by: Stephen Rothwell <[email protected]>
Acked-by: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Russell King <[email protected]>
|
|
This allows the callers to now pass down the full set of FAULT_FLAG_xyz
flags to handle_mm_fault(). All callers have been (mechanically)
converted to the new calling convention, there's almost certainly room
for architectures to clean up their code and then add FAULT_FLAG_RETRY
when that support is added.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
is_init() is an ambiguous name for the pid==1 check. Split it into
is_global_init() and is_container_init().
A cgroup init has it's tsk->pid == 1.
A global init also has it's tsk->pid == 1 and it's active pid namespace
is the init_pid_ns. But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.
Changelog:
2.6.22-rc4-mm2-pidns1:
- Use 'init_pid_ns.child_reaper' to determine if a given task is the
global init (/sbin/init) process. This would improve performance
and remove dependence on the task_pid().
2.6.21-mm2-pidns2:
- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
ppc,avr32}/traps.c for the _exception() call to is_global_init().
This way, we kill only the cgroup if the cgroup's init has a
bug rather than force a kernel panic.
[[email protected]: fix comment]
[[email protected]: Use is_global_init() in arch/m32r/mm/fault.c]
[[email protected]: kernel/pid.c: remove unused exports]
[[email protected]: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn <[email protected]>
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Acked-by: Pavel Emelianov <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Cedric Le Goater <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Herbert Poetzel <[email protected]>
Cc: Kirill Korotaev <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We have had complaints where a threaded application is left in a bad state
after one of it's threads is killed when we hit a VM: out_of_memory
condition.
Killing just one of the process threads can leave the application in a bad
state, whereas killing the entire process group would allow for the
application to restart, or be otherwise handled, and makes it very obvious
that something has gone wrong.
This change allows the entire process group to be taken down, rather
than just the one thread.
Signed-off-by: Will Schmidt <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Russell King <[email protected]>
Cc: Ian Molton <[email protected]>
Cc: Haavard Skinnemoen <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: David Howells <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Roman Zippel <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Kyle McMartin <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Kazumoto Kojima <[email protected]>
Cc: Richard Curnow <[email protected]>
Cc: William Lee Irwin III <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Chris Zankel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer. This requires requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).
[[email protected]: fix alpha build]
[[email protected]: fix s390 build]
[[email protected]: fix sparc build]
[[email protected]: fix sparc64 build]
[[email protected]: fix ia64 build]
Signed-off-by: Nick Piggin <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Russell King <[email protected]>
Cc: Ian Molton <[email protected]>
Cc: Bryan Wu <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: David Howells <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Roman Zippel <[email protected]>
Cc: Greg Ungerer <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Paul Mundt <[email protected]>
Cc: Kazumoto Kojima <[email protected]>
Cc: Richard Curnow <[email protected]>
Cc: William Lee Irwin III <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jeff Dike <[email protected]>
Cc: Paolo 'Blaisorblade' Giarrusso <[email protected]>
Cc: Miles Bader <[email protected]>
Cc: Chris Zankel <[email protected]>
Acked-by: Kyle McMartin <[email protected]>
Acked-by: Haavard Skinnemoen <[email protected]>
Acked-by: Ralf Baechle <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.
Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Part of long forgotten patch
http://groups.google.com/group/fa.linux.kernel/msg/e98e941ce1cf29f6?dmode=source
Since then, m32r grabbed two copies.
Leave s390 copy because of important absence of CONFIG_VT, but remove
references to non-existent timerlist_lock. ia64 also loses timerlist_lock.
Signed-off-by: Alexey Dobriyan <[email protected]>
Acked-by: Martin Schwidefsky <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: "Luck, Tony" <[email protected]>
Cc: Hirokazu Takata <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix do_page_fault and update_mmu_cache.
* Fix do_page_fault (vmalloc_fault:) to pass error_code correctly
to update_mmu_cache by using a thread-fault code for all m32r chips.
* Fix update_mmu_cache for OPSP chip
- #ifdef CONFIG_CHIP_OPSP portion is a workaround of OPSP;
Add a notfound-case operation to update_mmu_cache for OPSP
like other m32r chip.
- Fix pte_data that was not initialized if no entry found.
Signed-off-by: Kazuhiro Inaoka <[email protected]>
Signed-off-by: Hirokazu Takata <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Don't mask the lower 12-bit of the page fault address.
In the current m32r kernel implementation, we use an access exception
to detect page faults.
This patch fixes ace_handler (access exception handler) for m32r. In order to
check userspace address in do_page_fault, we have to pass full 32-bit address
to do_page_fault.
Signed-off-by: Hirokazu Takata <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is an updated version of Eric Biederman's is_init() patch.
(http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and
replaces a few more instances of ->pid == 1 with is_init().
Further, is_init() checks pid and thus removes dependency on Eric's other
patches for now.
Eric's original description:
There are a lot of places in the kernel where we test for init
because we give it special properties. Most significantly init
must not die. This results in code all over the kernel test
->pid == 1.
Introduce is_init to capture this case.
With multiple pid spaces for all of the cases affected we are
looking for only the first process on the system, not some other
process that has pid == 1.
Signed-off-by: Eric W. Biederman <[email protected]>
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: Cedric Le Goater <[email protected]>
Cc: <[email protected]>
Acked-by: Paul Mackerras <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Signed-off-by: Jörn Engel <[email protected]>
Signed-off-by: Adrian Bunk <[email protected]>
|
|
Signed-off-by: Adrian Bunk <[email protected]>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|