aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/entry
AgeCommit message (Collapse)AuthorFilesLines
2015-10-09x86/entry: Add C code for fast system call entriesAndy Lutomirski1-0/+43
This handles both SYSENTER and SYSCALL. The asm glue will take care of the differences. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/6041a58a9b8ef6d2522ab4350deb1a1945eb563f.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry/64/compat: Migrate the body of the syscall entry to CAndy Lutomirski2-39/+19
Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/a2f0fce68feeba798a24339b5a7ec1ec2dd9eaf7.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry: Add do_syscall_32(), a C function to do 32-bit syscallsAndy Lutomirski1-0/+43
System calls are really quite simple. Add a helper to call a 32-bit system call. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/a77ed179834c27da436fb4a7fb23c8ee77abc11c.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/syscalls: Give sys_call_ptr_t a useful typeAndy Lutomirski2-4/+4
Syscalls are asmlinkage functions (on 32-bit kernels), take six args of type unsigned long, and return long. Note that uml could probably be slightly cleaned up on top of this patch. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/4d3ecc4a169388d47009175408b2961961744e6f.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry/syscalls: Move syscall table declarations into asm/syscalls.hAndy Lutomirski1-4/+1
The header was missing some compat declarations. Also make sys_call_ptr_t have a consistent type. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/3166aaff0fb43897998fcb6ef92991533f8c5c6c.1444091585.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry/64/compat: Set up full pt_regs for all compat syscallsAndy Lutomirski3-40/+20
This is conceptually simpler. More importantly, it eliminates the PTREGSCALL and execve stubs, which were not compatible with the C ABI. This means that C code can call through the compat syscall table. The execve stubs are a bit subtle. They did two things: they cleared some registers and they forced slow-path return. Neither is necessary any more: elf_common_init clears the extra registers and start_thread calls force_iret(). Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/f95b7f7dfaacf88a8cae85bb06226cae53769287.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry/64/compat: Remove most of the fast system call machineryAndy Lutomirski1-242/+4
We now have only one code path that calls through the compat syscall table. This will make it much more pleasant to change the pt_regs vs register calling convention, which we need to do to move the call into C. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/320cda5573cefdc601b955d23fbe8f36c085432d.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry/64/compat: Remove audit optimizationsAndy Lutomirski1-96/+2
These audit optimizations are messy and hard to maintain. We'll get a similar effect from opportunistic sysret when fast compat system calls are re-implemented. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/0bcca79ac7ff835d0e5a38725298865b01347a82.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/entry/64/compat: Disable SYSENTER and SYSCALL32 entriesAndy Lutomirski1-0/+13
We've disabled the vDSO helpers to call them, so turn off the entries entirely (temporarily) in preparation for cleaning them up. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/8d6e84bf651519289dc532dcc230adfabbd2a3eb.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso/32: Save extra registers in the INT80 vsyscall pathAndy Lutomirski2-1/+25
The goal is to integrate the SYSENTER and SYSCALL32 entry paths with the INT80 path. SYSENTER clobbers ESP and EIP. SYSCALL32 clobbers ECX (and, invisibly, R11). SYSRETL (long mode to compat mode) clobbers ECX and, invisibly, R11. SYSEXIT (which we only need for native 32-bit) clobbers ECX and EDX. This means that we'll need to provide ESP to the kernel in a register (I chose ECX, since it's only needed for SYSENTER) and we need to provide the args that normally live in ECX and EDX in memory. The epilogue needs to restore ECX and EDX, since user code relies on regs being preserved. We don't need to do anything special about EIP, since the kernel already knows where we are. The kernel will eventually need to know where int $0x80 lands, so add a vdso_image entry for it. The only user-visible effect of this code is that ptrace-induced changes to ECX and EDX during fast syscalls will be lost. This is already the case for the SYSENTER path. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/b860925adbee2d2627a0671fbfe23a7fd04127f8.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso: Replace hex int80 CFI annotations with GAS directivesAndy Lutomirski1-40/+8
Maintaining the current CFI annotations written in R'lyehian is difficult for most of us. Translate them to something a little closer to English. This will remove the CFI data for kernels built with extremely old versions of binutils. I think this is a fair tradeoff for the ability for mortals to edit the asm. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/ae3ff4ff5278b4bfc1e1dab368823469866d4b71.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-09x86/vdso: Define BUILD_VDSO while building and emit .eh_frame in asmAndy Lutomirski1-2/+2
For the vDSO, user code wants runtime unwind info. Make sure that, if we use .cfi directives, we generate it. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/16e29ad8855e6508197000d8c41f56adb00d7580.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-07x86/vdso: Remove runtime 32-bit vDSO selectionAndy Lutomirski7-255/+13
32-bit userspace will now always see the same vDSO, which is exactly what used to be the int80 vDSO. Subsequent patches will clean it up and make it support SYSENTER and SYSCALL using alternatives. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/e7e6b3526fa442502e6125fe69486aab50813c32.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-07x86/entry/64/compat: After SYSENTER, move STI after the NT fixupAndy Lutomirski1-8/+22
We eventually want to make it all the way into C code before enabling interrupts. We need to rework our flags handling slightly to delay enabling interrupts. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/35d24d2a9305da3182eab7b2cdfd32902e90962c.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-07x86/entry, locking/lockdep: Move lockdep_sys_exit() to ↵Andy Lutomirski3-3/+2
prepare_exit_to_usermode() Rather than worrying about exactly where LOCKDEP_SYS_EXIT should go in the asm code, add it to prepare_exit_from_usermode() and remove all of the asm calls that are followed by prepare_exit_to_usermode(). LOCKDEP_SYS_EXIT now appears only in the syscall fast paths. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/1736ebe948b845e68120b86b89091f3ec27f5e8e.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-07x86/entry/64/compat: Fix SYSENTER's NT flag before user memory accessAndy Lutomirski1-9/+9
Clearing NT is part of the prologue, whereas loading up arg6 makes more sense to think about as part of syscall processing. Reorder them. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/19eb235828b2d2a52c53459e09f2974e15e65a35.1444091584.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-10-07Merge branch 'linus' into x86/asm, to pick up fixes before applying new changesIngo Molnar1-1/+15
Signed-off-by: Ingo Molnar <[email protected]>
2015-10-06Merge tag 'v4.3-rc3' into x86/mm, to pick up fixes before applying new changesIngo Molnar1-1/+15
Signed-off-by: Ingo Molnar <[email protected]>
2015-09-22x86/nmi/64: Fix a paravirt stack-clobbering bug in the NMI codeAndy Lutomirski1-1/+4
The NMI entry code that switches to the normal kernel stack needs to be very careful not to clobber any extra stack slots on the NMI stack. The code is fine under the assumption that SWAPGS is just a normal instruction, but that assumption isn't really true. Use SWAPGS_UNSAFE_STACK instead. This is part of a fix for some random crashes that Sasha saw. Fixes: 9b6e6a8334d5 ("x86/nmi/64: Switch stacks on userspace NMI entry") Reported-and-tested-by: Sasha Levin <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/974bc40edffdb5c2950a5c4977f821a446b76178.1442791737.git.luto@kernel.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-09-22x86/paravirt: Replace the paravirt nop with a bona fide empty functionAndy Lutomirski1-0/+11
PARAVIRT_ADJUST_EXCEPTION_FRAME generates this code (using nmi as an example, trimmed for readability): ff 15 00 00 00 00 callq *0x0(%rip) # 2796 <nmi+0x6> 2792: R_X86_64_PC32 pv_irq_ops+0x2c That's a call through a function pointer to regular C function that does nothing on native boots, but that function isn't protected against kprobes, isn't marked notrace, and is certainly not guaranteed to preserve any registers if the compiler is feeling perverse. This is bad news for a CLBR_NONE operation. Of course, if everything works correctly, once paravirt ops are patched, it gets nopped out, but what if we hit this code before paravirt ops are patched in? This can potentially cause breakage that is very difficult to debug. A more subtle failure is possible here, too: if _paravirt_nop uses the stack at all (even just to push RBP), it will overwrite the "NMI executing" variable if it's called in the NMI prologue. The Xen case, perhaps surprisingly, is fine, because it's already written in asm. Fix all of the cases that default to paravirt_nop (including adjust_exception_frame) with a big hammer: replace paravirt_nop with an asm function that is just a ret instruction. The Xen case may have other problems, so document them. This is part of a fix for some random crashes that Sasha saw. Reported-and-tested-by: Sasha Levin <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/8f5d2ba295f9d73751c33d97fda03e0495d9ade0.1442791737.git.luto@kernel.org Signed-off-by: Thomas Gleixner <[email protected]>
2015-09-22x86/vdso32: Define PGTABLE_LEVELS to 32bit VDSOToshi Kani1-0/+2
In case of CONFIG_X86_64, vdso32/vclock_gettime.c fakes a 32-bit non-PAE kernel configuration by re-defining it to CONFIG_X86_32. However, it does not re-define CONFIG_PGTABLE_LEVELS leaving it as 4 levels. This mismatch leads <asm/pgtable_type.h> to NOT include <asm-generic/ pgtable-nopud.h> and <asm-generic/pgtable-nopmd.h>, which will cause compile errors when a later patch enhances <asm/pgtable_type.h> to use PUD_SHIFT and PMD_SHIFT. These -nopud & -nopmd headers define these SHIFTs for the 32-bit non-PAE kernel. Fix it by re-defining CONFIG_PGTABLE_LEVELS to 2 levels. Signed-off-by: Toshi Kani <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Juergen Gross <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Konrad Wilk <[email protected]> Cc: Robert Elliot <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-09-21x86/entry/vsyscall: Fix undefined symbol warningBorislav Petkov1-2/+2
Commit: 3dc33bd30f3e1 ("x86/entry/vsyscall: Add CONFIG to control default") did the ifdef/elif thing but GCC doesn't like that: arch/x86/entry/vsyscall/vsyscall_64.c:44:7: warning: "CONFIG_LEGACY_VSYSCALL_NONE" is not defined [-Wundef] #elif CONFIG_LEGACY_VSYSCALL_NONE ^ Use defined() instead. Signed-off-by: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-20x86/entry/vsyscall: Add CONFIG to control defaultKees Cook1-1/+8
Most modern systems can run with vsyscall=none. In an effort to provide a way for build-time defaults to lack legacy settings, this adds a new CONFIG to select the type of vsyscall mapping to use, similar to the existing "vsyscall" command line parameter. Signed-off-by: Kees Cook <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-11sys_membarrier(): system-wide memory barrier (generic, x86)Mathieu Desnoyers2-0/+2
Here is an implementation of a new system call, sys_membarrier(), which executes a memory barrier on all threads running on the system. It is implemented by calling synchronize_sched(). It can be used to distribute the cost of user-space memory barriers asymmetrically by transforming pairs of memory barriers into pairs consisting of sys_membarrier() and a compiler barrier. For synchronization primitives that distinguish between read-side and write-side (e.g. userspace RCU [1], rwlocks), the read-side can be accelerated significantly by moving the bulk of the memory barrier overhead to the write-side. The existing applications of which I am aware that would be improved by this system call are as follows: * Through Userspace RCU library (http://urcu.so) - DNS server (Knot DNS) https://www.knot-dns.cz/ - Network sniffer (http://netsniff-ng.org/) - Distributed object storage (https://sheepdog.github.io/sheepdog/) - User-space tracing (http://lttng.org) - Network storage system (https://www.gluster.org/) - Virtual routers (https://events.linuxfoundation.org/sites/events/files/slides/DPDK_RCU_0MQ.pdf) - Financial software (https://lkml.org/lkml/2015/3/23/189) Those projects use RCU in userspace to increase read-side speed and scalability compared to locking. Especially in the case of RCU used by libraries, sys_membarrier can speed up the read-side by moving the bulk of the memory barrier cost to synchronize_rcu(). * Direct users of sys_membarrier - core dotnet garbage collector (https://github.com/dotnet/coreclr/issues/198) Microsoft core dotnet GC developers are planning to use the mprotect() side-effect of issuing memory barriers through IPIs as a way to implement Windows FlushProcessWriteBuffers() on Linux. They are referring to sys_membarrier in their github thread, specifically stating that sys_membarrier() is what they are looking for. To explain the benefit of this scheme, let's introduce two example threads: Thread A (non-frequent, e.g. executing liburcu synchronize_rcu()) Thread B (frequent, e.g. executing liburcu rcu_read_lock()/rcu_read_unlock()) In a scheme where all smp_mb() in thread A are ordering memory accesses with respect to smp_mb() present in Thread B, we can change each smp_mb() within Thread A into calls to sys_membarrier() and each smp_mb() within Thread B into compiler barriers "barrier()". Before the change, we had, for each smp_mb() pairs: Thread A Thread B previous mem accesses previous mem accesses smp_mb() smp_mb() following mem accesses following mem accesses After the change, these pairs become: Thread A Thread B prev mem accesses prev mem accesses sys_membarrier() barrier() follow mem accesses follow mem accesses As we can see, there are two possible scenarios: either Thread B memory accesses do not happen concurrently with Thread A accesses (1), or they do (2). 1) Non-concurrent Thread A vs Thread B accesses: Thread A Thread B prev mem accesses sys_membarrier() follow mem accesses prev mem accesses barrier() follow mem accesses In this case, thread B accesses will be weakly ordered. This is OK, because at that point, thread A is not particularly interested in ordering them with respect to its own accesses. 2) Concurrent Thread A vs Thread B accesses Thread A Thread B prev mem accesses prev mem accesses sys_membarrier() barrier() follow mem accesses follow mem accesses In this case, thread B accesses, which are ensured to be in program order thanks to the compiler barrier, will be "upgraded" to full smp_mb() by synchronize_sched(). * Benchmarks On Intel Xeon E5405 (8 cores) (one thread is calling sys_membarrier, the other 7 threads are busy looping) 1000 non-expedited sys_membarrier calls in 33s =3D 33 milliseconds/call. * User-space user of this system call: Userspace RCU library Both the signal-based and the sys_membarrier userspace RCU schemes permit us to remove the memory barrier from the userspace RCU rcu_read_lock() and rcu_read_unlock() primitives, thus significantly accelerating them. These memory barriers are replaced by compiler barriers on the read-side, and all matching memory barriers on the write-side are turned into an invocation of a memory barrier on all active threads in the process. By letting the kernel perform this synchronization rather than dumbly sending a signal to every process threads (as we currently do), we diminish the number of unnecessary wake ups and only issue the memory barriers on active threads. Non-running threads do not need to execute such barrier anyway, because these are implied by the scheduler context switches. Results in liburcu: Operations in 10s, 6 readers, 2 writers: memory barriers in reader: 1701557485 reads, 2202847 writes signal-based scheme: 9830061167 reads, 6700 writes sys_membarrier: 9952759104 reads, 425 writes sys_membarrier (dyn. check): 7970328887 reads, 425 writes The dynamic sys_membarrier availability check adds some overhead to the read-side compared to the signal-based scheme, but besides that, sys_membarrier slightly outperforms the signal-based scheme. However, this non-expedited sys_membarrier implementation has a much slower grace period than signal and memory barrier schemes. Besides diminishing the number of wake-ups, one major advantage of the membarrier system call over the signal-based scheme is that it does not need to reserve a signal. This plays much more nicely with libraries, and with processes injected into for tracing purposes, for which we cannot expect that signals will be unused by the application. An expedited version of this system call can be added later on to speed up the grace period. Its implementation will likely depend on reading the cpu_curr()->mm without holding each CPU's rq lock. This patch adds the system call to x86 and to asm-generic. [1] http://urcu.so membarrier(2) man page: MEMBARRIER(2) Linux Programmer's Manual MEMBARRIER(2) NAME membarrier - issue memory barriers on a set of threads SYNOPSIS #include <linux/membarrier.h> int membarrier(int cmd, int flags); DESCRIPTION The cmd argument is one of the following: MEMBARRIER_CMD_QUERY Query the set of supported commands. It returns a bitmask of supported commands. MEMBARRIER_CMD_SHARED Execute a memory barrier on all threads running on the system. Upon return from system call, the caller thread is ensured that all running threads have passed through a state where all memory accesses to user-space addresses match program order between entry to and return from the system call (non-running threads are de facto in such a state). This covers threads from all pro=E2=80=90 cesses running on the system. This command returns 0. The flags argument needs to be 0. For future extensions. All memory accesses performed in program order from each targeted thread is guaranteed to be ordered with respect to sys_membarrier(). If we use the semantic "barrier()" to represent a compiler barrier forcing memory accesses to be performed in program order across the barrier, and smp_mb() to represent explicit memory barriers forcing full memory ordering across the barrier, we have the following ordering table for each pair of barrier(), sys_membarrier() and smp_mb(): The pair ordering is detailed as (O: ordered, X: not ordered): barrier() smp_mb() sys_membarrier() barrier() X X O smp_mb() X O O sys_membarrier() O O O RETURN VALUE On success, these system calls return zero. On error, -1 is returned, and errno is set appropriately. For a given command, with flags argument set to 0, this system call is guaranteed to always return the same value until reboot. ERRORS ENOSYS System call is not implemented. EINVAL Invalid arguments. Linux 2015-04-15 MEMBARRIER(2) Signed-off-by: Mathieu Desnoyers <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Nicholas Miell <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Alan Cox <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: David Howells <[email protected]> Cc: Pranith Kumar <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Shuah Khan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-10mm: mark most vm_operations_struct constKirill A. Shutemov1-1/+1
With two exceptions (drm/qxl and drm/radeon) all vm_operations_struct structs should be constant. Signed-off-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Oleg Nesterov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04userfaultfd: activate syscallAndrea Arcangeli2-0/+2
This activates the userfaultfd syscall. [[email protected]: activate syscall fix] [[email protected]: don't enable userfaultfd on powerpc] Signed-off-by: Andrea Arcangeli <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Cc: Sanidhya Kashyap <[email protected]> Cc: [email protected] Cc: "Kirill A. Shutemov" <[email protected]> Cc: Andres Lagar-Cavilla <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Peter Feiner <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "Huangpeng (Peter)" <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-08-18Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixesIngo Molnar1-1/+2
Conflicts: arch/x86/entry/entry_64_compat.S arch/x86/math-emu/get_address.c Signed-off-by: Ingo Molnar <[email protected]>
2015-08-13x86: fix error handling for 32-bit compat out-of-range system call numbersLinus Torvalds1-1/+2
Commit 3f5159a9221f ("x86/asm/entry/32: Update -ENOSYS handling to match the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case. The proper error return value was never loaded into %rax, except if things just happened to go through the audit paths, which ended up reloading the return value. This moves the loading or %rax into the normal system call path, just to make sure the error case triggers it. It's kind of sad, since it adds a useless instruction to reload the register to the fast path, but it's not like that single load from the stack is going to be noticeable. Reported-by: David Drysdale <[email protected]> Tested-by: Kees Cook <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-08-08x86/vdso: Emit a GNU hashAndy Lutomirski1-1/+1
Some dynamic loaders may be slightly faster if a GNU hash is available. Strangely, this seems to have no effect at all on the vdso size. This is unlikely to have any measurable effect on the time it takes to resolve vdso symbols (since there are so few of them). In some contexts, it can be a win for a different reason: if every DSO has a GNU hash section, then libc can avoid calculating SysV hashes at all. Both musl and glibc appear to have this optimization. It's plausible that this breaks some ancient glibc version. If so, then, depending on what glibc versions break, we could either require COMPAT_VDSO for them or consider reverting. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Isaac Dunham <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nathan Lynch <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rich Felker <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] <[email protected]> Link: http://lkml.kernel.org/r/fd56cc057a2d62ab31c56a48d04fccb435b3fd4f.1438897382.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-08-05x86/entry: Remove do_notify_resume(), syscall_trace_leave(), and their TIF masksAndy Lutomirski1-57/+0
They are no longer used. Good riddance! Deleting the TIF_ macros is really nice. It was never clear why there were so many variants. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Eric Paris <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/22c61682f446628573dde0f1d573ab821677e06da.1438378274.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-08-05x86/entry/32: Migrate to C exit pathAndy Lutomirski1-51/+11
This removes the hybrid asm-and-C implementation of exit work. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Eric Paris <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/2baa438619ea6c027b40ec9fceacca52f09c74d09.1438378274.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-08-05x86/entry/32: Remove 32-bit syscall audit optimizationsAndy Lutomirski1-46/+2
The asm audit optimizations are ugly and obfuscate the code too much. Remove them. This will regress performance if syscall auditing is enabled on 32-bit kernels and SYSENTER is in use. If this becomes a problem, interested parties are encouraged to implement the equivalent of the 64-bit opportunistic SYSRET optimization. Alternatively, a case could be made that, on 32-bit kernels, a less messy asm audit optimization could be done. 32-bit kernels don't have the complicated partial register saving tricks that 64-bit kernels have, so the SYSENTER post-syscall path could just call the audit hooks directly. Any reimplementation of this ought to demonstrate that it only calls the audit hook once per syscall, though, which does not currently appear to be true. Someone would have to make the case that doing so would be better than implementing opportunistic SYSEXIT, though. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Eric Paris <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/212be39dd8c90b44c4b7bbc678128d6b88bdb9912.1438378274.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-31x86/vm86: Use the normal pt_regs area for vm86Brian Gerst1-23/+1
Change to use the normal pt_regs area to enter and exit vm86 mode. This is done by increasing the padding at the top of the stack to make room for the extra vm86 segment slots in the IRET frame. It then saves the 32-bit regs in the off-stack vm86 data, and copies in the vm86 regs. Exiting back to 32-bit mode does the reverse. This allows removing the hacks to jump directly into the exit asm code due to having to change the stack pointer. Returning normally from the vm86 syscall and the exception handlers allows things like ptrace and auditing to work properly. Signed-off-by: Brian Gerst <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-31Merge branch 'x86/urgent' into x86/asm, before applying dependent patchesIngo Molnar2-105/+208
Signed-off-by: Ingo Molnar <[email protected]>
2015-07-24x86/asm/entry/32: Revert 'Do not use R9 in SYSCALL32' commitDenys Vlasenko1-5/+9
This change reverts most of commit 53e9accf0f 'Do not use R9 in SYSCALL32'. I don't yet understand how, but code in that commit sometimes fails to preserve EBP. See https://bugzilla.kernel.org/show_bug.cgi?id=101061 "Problems while executing 32-bit code on AMD64" Reported-and-tested-by: Krzysztof A. Sobiecki <[email protected]> Signed-off-by: Denys Vlasenko <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Will Drewry <[email protected]> Cc: Kees Cook <[email protected]> CC: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-07-21x86/entry/syscalls: Wire up 32-bit direct socket callsAndy Lutomirski1-0/+15
On x86_64, there's no socketcall syscall; instead all of the socket calls are real syscalls. For 32-bit programs, we're stuck offering the socketcall syscall, but it would be nice to expose the direct calls as well. This will enable seccomp to filter socket calls (for new userspace only, but that's fine for some applications) and it will provide a tiny performance boost. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Alexander Larsson <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Cosimo Cecchi <[email protected]> Cc: Dan Nicholson <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rajalakshmi Srinivasaraghavan <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tulio Magno Quites Machado Filho <[email protected]> Cc: libc-alpha <[email protected]> Link: http://lkml.kernel.org/r/cb5138299d37d5800e2d135b01a7667fa6115854.1436912629.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/entry: Fix _TIF_USER_RETURN_NOTIFY check in prepare_exit_to_usermodeAndy Lutomirski1-1/+2
Linus noticed that the early return check was missing _TIF_USER_RETURN_NOTIFY. If the only work flag was _TIF_USER_RETURN_NOTIFY, we'd skip user return notifiers. Fix it. (This is the only missing bit.) This fixes double faults on a KVM host. It's the same issue as last time, except that this time it's very easy to trigger. Apparently no one uses -next as a KVM host. ( I'm still not quite sure what it is that KVM does that blows up so badly if we miss a user return notifier. My best guess is that KVM lets KERNEL_GS_BASE (i.e. the user's gs base) be negative and fixes it up in a user return notifier. If we actually end up in user mode with a negative gs base, we blow up pretty badly. ) Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: c5c46f59e4e7 ("x86/entry: Add new, comprehensible entry and exit handlers written in C") Link: http://lkml.kernel.org/r/3f801104d24ee7a6bb1446408d9950777aa63277.1436995419.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/entry/64, x86/nmi/64: Add CONFIG_DEBUG_ENTRY NMI testing codeAndy Lutomirski1-0/+15
It turns out to be rather tedious to test the NMI nesting code. Make it easier: add a new CONFIG_DEBUG_ENTRY option that causes the NMI handler to pre-emptively unmask NMIs. With this option set, errors in the repeat_nmi logic or failures to detect that we're in a nested NMI will result in quick panics under perf (especially if multiple counters are running at high frequency) instead of requiring an unusual workload that generates page faults or breakpoints inside NMIs. I called it CONFIG_DEBUG_ENTRY instead of CONFIG_DEBUG_NMI_ENTRY because I want to add new non-NMI checks elsewhere in the entry code in the future, and I'd rather not add too many new config options or add this option and then immediately rename it. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Make the "NMI executing" variable more consistentAndy Lutomirski1-6/+5
Currently, "NMI executing" is one the first time an outermost NMI hits repeat_nmi and zero thereafter. Change it to be zero each time for consistency. This is intended to help NMI handling fail harder if it's buggy. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Minor asm simplificationAndy Lutomirski1-2/+1
Replace LEA; MOV with an equivalent SUB. This saves one instruction. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Use DF to avoid userspace RSP confusing nested NMI detectionAndy Lutomirski1-4/+25
We have a tricky bug in the nested NMI code: if we see RSP pointing to the NMI stack on NMI entry from kernel mode, we assume that we are executing a nested NMI. This isn't quite true. A malicious userspace program can point RSP at the NMI stack, issue SYSCALL, and arrange for an NMI to happen while RSP is still pointing at the NMI stack. Fix it with a sneaky trick. Set DF in the region of code that the RSP check is intended to detect. IRET will clear DF atomically. ( Note: other than paravirt, there's little need for all this complexity. We could check RIP instead of RSP. ) Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Reorder nested NMI checksAndy Lutomirski1-16/+18
Check the repeat_nmi .. end_repeat_nmi special case first. The next patch will rework the RSP check and, as a side effect, the RSP check will no longer detect repeat_nmi .. end_repeat_nmi, so we'll need this ordering of the checks. Note: this is more subtle than it appears. The check for repeat_nmi .. end_repeat_nmi jumps straight out of the NMI code instead of adjusting the "iret" frame to force a repeat. This is necessary, because the code between repeat_nmi and end_repeat_nmi sets "NMI executing" and then writes to the "iret" frame itself. If a nested NMI comes in and modifies the "iret" frame while repeat_nmi is also modifying it, we'll end up with garbage. The old code got this right, as does the new code, but the new code is a bit more explicit. If we were to move the check right after the "NMI executing" check, then we'd get it wrong and have random crashes. ( Because the "NMI executing" check would jump to the code that would modify the "iret" frame without checking if the interrupted NMI was currently modifying it. ) Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Improve nested NMI commentsAndy Lutomirski1-66/+92
I found the nested NMI documentation to be difficult to follow. Improve the comments. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Switch stacks on userspace NMI entryAndy Lutomirski1-4/+58
Returning to userspace is tricky: IRET can fail, and ESPFIX can rearrange the stack prior to IRET. The NMI nesting fixup relies on a precise stack layout and atomic IRET. Rather than trying to teach the NMI nesting fixup to handle ESPFIX and failed IRET, punt: run NMIs that came from user mode on the normal kernel stack. This will make some nested NMIs visible to C code, but the C code is okay with that. As a side effect, this should speed up perf: it eliminates an RDMSR when NMIs come from user mode. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-17x86/nmi/64: Remove asm code that saves CR2Andy Lutomirski1-17/+0
Now that do_nmi saves CR2, we don't need to save it in asm. Signed-off-by: Andy Lutomirski <[email protected]> Reviewed-by: Steven Rostedt <[email protected]> Acked-by: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-07-08x86/entry/64: Fix IRQ state confusion and related warning on compat syscalls ↵Andy Lutomirski1-2/+6
with CONFIG_AUDITSYSCALL=n int_ret_from_sys_call now expects IRQs to be enabled. I got this right in the real sysexit_audit and sysretl_audit asm paths, but I missed it in the #defined-away versions when CONFIG_AUDITSYSCALL=n. This is a straightforward fix for CONFIG_AUDITSYSCALL=n Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 29ea1b258b98 ("x86/entry/64: Migrate 64-bit and compat syscalls to the new exit handlers and remove old assembly code") Link: http://lkml.kernel.org/r/25cf0a01e01c6008118dd8f8d9f043020416700c.1436291493.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-07x86/entry: Remove SCHEDULE_USER and asm/context-tracking.hAndy Lutomirski1-1/+0
SCHEDULE_USER is no longer used, and asm/context-tracking.h contained nothing else. Remove the header entirely. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/854e9b45f69af20e26c47099eb236321563ebcee.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-07x86/asm/entry/64: Migrate error and IRQ exit work to C and remove old ↵Andy Lutomirski2-46/+23
assembly code Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/60e90901eee611e59e958bfdbbe39969b4f88fe5.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-07x86/asm/entry/64: Simplify IRQ stack pt_regs handlingAndy Lutomirski1-5/+3
There's no need for both RSI and RDI to point to the original stack. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/3a0481f809dd340c7d3f54ce3fd6d66ef2a578cd.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>
2015-07-07x86/asm/entry/64: Save all regs on interrupt entryAndy Lutomirski2-23/+9
To prepare for the big rewrite of the error and interrupt exit paths, we will need pt_regs completely filled in. It's already completely filled in when error_exit runs, so rearrange interrupt handling to match it. This will slow down interrupt handling very slightly (eight instructions), but the simplification it enables will be more than worth it. Signed-off-by: Andy Lutomirski <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/d8a766a7f558b30e6e01352854628a2d9943460c.1435952415.git.luto@kernel.org Signed-off-by: Ingo Molnar <[email protected]>