Age | Commit message (Collapse) | Author | Files | Lines |
|
This gets rid of the last user of the old RESTORE_..._REGS infrastructure.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/652a260f17a160789bc6a41d997f98249b73e2ab.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
They did almost the same thing. Remove a bunch of pointless
instructions (mostly hidden in macros) and reduce cognitive load by
merging them.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/1204e20233fcab9130a1ba80b3b1879b5db3fc1f.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Saves 64 bytes.
Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/6609b7f74ab31c36604ad746e019ea8495aec76c.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
paranoid_exit_restore was a copy of restore_regs_and_return_to_kernel.
Merge them and make the paranoid_exit internal labels local.
Keeping .Lparanoid_exit makes the code a bit shorter because it
allows a 2-byte jnz instead of a 5-byte jnz.
Saves 96 bytes of text.
( This is still a bit suboptimal in a non-CONFIG_TRACE_IRQFLAGS
kernel, but fixing that would make the code rather messy. )
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/510d66a1895cda9473c84b1086f0bb974f22de6a.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The old code restored all the registers with movq instead of pop.
In theory, this was done because some CPUs have higher movq
throughput, but any gain there would be tiny and is almost certainly
outweighed by the higher text size.
This saves 96 bytes of text.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/ad82520a207ccd851b04ba613f4f752b33ac05f7.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
All of the code paths that ended up doing IRET to usermode did
SWAPGS immediately beforehand. Move the SWAPGS into the common
code.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
These code paths will diverge soon.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The only user was the 64-bit opportunistic SYSRET failure path, and
that path didn't really need it. This change makes the
opportunistic SYSRET code a bit more straightforward and gets rid of
the label.
Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/be3006a7ad3326e3458cf1cc55d416252cbe1986.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This makes the build log look nicer.
Before:
SYSTBL arch/x86/entry/syscalls/../../include/generated/asm/syscalls_32.h
SYSHDR arch/x86/entry/syscalls/../../include/generated/asm/unistd_32_ia32.h
SYSHDR arch/x86/entry/syscalls/../../include/generated/asm/unistd_64_x32.h
SYSTBL arch/x86/entry/syscalls/../../include/generated/asm/syscalls_64.h
SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_32.h
SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_64.h
SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_x32.h
After:
SYSTBL arch/x86/include/generated/asm/syscalls_32.h
SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h
SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h
SYSTBL arch/x86/include/generated/asm/syscalls_64.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_32.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_64.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_x32.h
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.
However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:
----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()
// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
virtual patch
@ depends on patch @
expression E1, E2;
@@
- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)
@ depends on patch @
expression E;
@@
- ACCESS_ONCE(E)
+ READ_ONCE(E)
----
Signed-off-by: Mark Rutland <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
I find the '.ifeq <expression>' directive to be confusing. Reading it
quickly seems to suggest its opposite meaning, or that it's missing an
argument.
Improve readability by replacing all of its x86 uses with
'.if <expression> == 0'.
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/757da028e802c7e98d23fbab8d234b1063e161cf.1508516398.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This fixes the following ORC warning in the 'int3' entry code:
WARNING: can't dereference iret registers at ffff8801c5f17fe0 for ip ffffffff95f0d94b
The ORC metadata had the wrong stack offset for the iret registers.
Their location on the stack is dependent on whether the exception has an
error code.
Reported-and-tested-by: Andrei Vagin <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 8c1f75587a18 ("x86/entry/64: Add unwind hint annotations")
Link: http://lkml.kernel.org/r/931d57f0551ed7979d5e7e05370d445c8e5137f8.1508516398.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Using the ARRAY_SIZE macro improves the readability of the code.
Found with Coccinelle with the following semantic patch:
@r depends on (org || report)@
type T;
T[] E;
position p;
@@
(
(sizeof(E)@p /sizeof(*E))
|
(sizeof(E)@p /sizeof(E[...]))
|
(sizeof(E)@p /sizeof(T))
)
Signed-off-by: Jérémy Lefaure <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Martin Mares <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings
like:
WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000
And also there were some stack dumps with a bunch of unreliable '?'
symbols after an apic_timer_interrupt symbol, meaning the unwinder got
confused when it tried to read the regs.
The cause of those issues is that, with GCC 4.8 (and possibly older),
there are cases where GCC misaligns the stack pointer in a leaf function
for no apparent reason:
c124a388 <acpi_rs_move_data>:
c124a388: 55 push %ebp
c124a389: 89 e5 mov %esp,%ebp
c124a38b: 57 push %edi
c124a38c: 56 push %esi
c124a38d: 89 d6 mov %edx,%esi
c124a38f: 53 push %ebx
c124a390: 31 db xor %ebx,%ebx
c124a392: 83 ec 03 sub $0x3,%esp
...
c124a3e3: 83 c4 03 add $0x3,%esp
c124a3e6: 5b pop %ebx
c124a3e7: 5e pop %esi
c124a3e8: 5f pop %edi
c124a3e9: 5d pop %ebp
c124a3ea: c3 ret
If an interrupt occurs in such a function, the regs on the stack will be
unaligned, which breaks the frame pointer encoding assumption. So on
32-bit, use the MSB instead of the LSB to encode the regs.
This isn't an issue on 64-bit, because interrupts align the stack before
writing to it.
Reported-and-tested-by: Tetsuo Handa <[email protected]>
Reported-and-tested-by: Fengguang Wu <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Byungchul Park <[email protected]>
Cc: LKP <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner:
"This update provides:
- Cleanup of the IDT management including the removal of the extra
tracing IDT. A first step to cleanup the vector management code.
- The removal of the paravirt op adjust_exception_frame. This is a
XEN specific issue, but merged through this branch to avoid nasty
merge collisions
- Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has
disabled the TSC DEADLINE timer in CPUID.
- Adjust a debug message in the ioapic code to print out the
information correctly"
* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
x86/idt: Fix the X86_TRAP_BP gate
x86/xen: Get rid of paravirt op adjust_exception_frame
x86/eisa: Add missing include
x86/idt: Remove superfluous ALIGNment
x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature
x86/idt: Remove the tracing IDT leftovers
x86/idt: Hide set_intr_gate()
x86/idt: Simplify alloc_intr_gate()
x86/idt: Deinline setup functions
x86/idt: Remove unused functions/inlines
x86/idt: Move interrupt gate initialization to IDT code
x86/idt: Move APIC gate initialization to tables
x86/idt: Move regular trap init to tables
x86/idt: Move IST stack based traps to table init
x86/idt: Move debug stack init to table based
x86/idt: Switch early trap init to IDT tables
x86/idt: Prepare for table based init
x86/idt: Move early IDT setup out of 32-bit asm
x86/idt: Move early IDT handler setup to IDT code
x86/idt: Consolidate IDT invalidation
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull syscall updates from Ingo Molnar:
"Improve the security of set_fs(): we now check the address limit on a
number of key platforms (x86, arm, arm64) before returning to
user-space - without adding overhead to the typical system call fast
path"
* 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/syscalls: Check address limit on user-mode return
arm/syscalls: Check address limit on user-mode return
x86/syscalls: Check address limit on user-mode return
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
- Introduce the ORC unwinder, which can be enabled via
CONFIG_ORC_UNWINDER=y.
The ORC unwinder is a lightweight, Linux kernel specific debuginfo
implementation, which aims to be DWARF done right for unwinding.
Objtool is used to generate the ORC unwinder tables during build, so
the data format is flexible and kernel internal: there's no
dependency on debuginfo created by an external toolchain.
The ORC unwinder is almost two orders of magnitude faster than the
(out of tree) DWARF unwinder - which is important for perf call graph
profiling. It is also significantly simpler and is coded defensively:
there has not been a single ORC related kernel crash so far, even
with early versions. (knock on wood!)
But the main advantage is that enabling the ORC unwinder allows
CONFIG_FRAME_POINTERS to be turned off - which speeds up the kernel
measurably:
With frame pointers disabled, GCC does not have to add frame pointer
instrumentation code to every function in the kernel. The kernel's
.text size decreases by about 3.2%, resulting in better cache
utilization and fewer instructions executed, resulting in a broad
kernel-wide speedup. Average speedup of system calls should be
roughly in the 1-3% range - measurements by Mel Gorman [1] have shown
a speedup of 5-10% for some function execution intense workloads.
The main cost of the unwinder is that the unwinder data has to be
stored in RAM: the memory cost is 2-4MB of RAM, depending on kernel
config - which is a modest cost on modern x86 systems.
Given how young the ORC unwinder code is it's not enabled by default
- but given the performance advantages the plan is to eventually make
it the default unwinder on x86.
See Documentation/x86/orc-unwinder.txt for more details.
- Remove lguest support: its intended role was that of a temporary
proof of concept for virtualization, plus its removal will enable the
reduction (removal) of the paravirt API as well, so Rusty agreed to
its removal. (Juergen Gross)
- Clean up and fix FSGS related functionality (Andy Lutomirski)
- Clean up IO access APIs (Andy Shevchenko)
- Enhance the symbol namespace (Jiri Slaby)
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
objtool: Handle GCC stack pointer adjustment bug
x86/entry/64: Use ENTRY() instead of ALIGN+GLOBAL for stub32_clone()
x86/fpu/math-emu: Add ENDPROC to functions
x86/boot/64: Extract efi_pe_entry() from startup_64()
x86/boot/32: Extract efi_pe_entry() from startup_32()
x86/lguest: Remove lguest support
x86/paravirt/xen: Remove xen_patch()
objtool: Fix objtool fallthrough detection with function padding
x86/xen/64: Fix the reported SS and CS in SYSCALL
objtool: Track DRAP separately from callee-saved registers
objtool: Fix validate_branch() return codes
x86: Clarify/fix no-op barriers for text_poke_bp()
x86/switch_to/64: Rewrite FS/GS switching yet again to fix AMD CPUs
selftests/x86/fsgsbase: Test selectors 1, 2, and 3
x86/fsgsbase/64: Report FSBASE and GSBASE correctly in core dumps
x86/fsgsbase/64: Fully initialize FS and GS state in start_thread_common
x86/asm: Fix UNWIND_HINT_REGS macro for older binutils
x86/asm/32: Fix regs_get_register() on segment registers
x86/xen/64: Rearrange the SYSCALL entries
x86/asm/32: Remove a bunch of '& 0xffff' from pt_regs segment reads
...
|
|
When running as Xen pv-guest the exception frame on the stack contains
%r11 and %rcx additional to the other data pushed by the processor.
Instead of having a paravirt op being called for each exception type
prepend the Xen specific code to each exception entry. When running as
Xen pv-guest just use the exception entry with prepended instructions,
otherwise use the entry without the Xen specific code.
[ tglx: Merged through tip to avoid ugly merge conflict ]
Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
|
|
ALIGN+GLOBAL is effectively what ENTRY() does, so use ENTRY() which is
dedicated for exactly this purpose -- global functions.
Note that stub32_clone() is a C-like leaf function -- it has a standard
call frame -- it only switches one argument and continues by jumping
into C. Since each ENTRY() should be balanced by some END*() marker, we
add a corresponding ENDPROC() to stub32_clone() too.
Besides that, x86's custom GLOBAL macro is going to die very soon.
Signed-off-by: Jiri Slaby <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The GDT entry related code uses two ways to access entries via
union fields:
- bitfields
- macros which initialize the two 16-bit parts of the entry
by magic shift and mask operations.
Clean it up and only use the bitfields to initialize and access entries.
( The old access patterns were partly done due to GCC optimizing bitfield
accesses in a horrible way - that's mostly fixed these days and clarity
of code in such low level accessors is very important. )
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
No more users of the tracing IDT. All exception tracepoints have been moved
into the regular handlers. Get rid of the mess which shouldn't have been
created in the first place.
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Make use of the new irqvector tracing static key and remove the duplicated
trace_do_pagefault() implementation.
If irq vector tracing is disabled, then the overhead of this is a single
NOP5, which is a reasonable tradeoff to avoid duplicated code and the
unholy macro mess.
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Generate irqentry and softirqentry text sections without
any Kconfig dependencies. This will add extra sections, but
there should be no performace impact.
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: David S . Miller <[email protected]>
Cc: Francis Deslauriers <[email protected]>
Cc: Jesper Nilsson <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Mikael Starvik <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/150172789110.27216.3955739126693102122.stgit@devbox
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Xen's raw SYSCALL entries are much less weird than native. Rather
than fudging them to look like native entries, use the Xen-provided
stack frame directly.
This lets us eliminate entry_SYSCALL_64_after_swapgs and two uses of
the SWAPGS_UNSAFE_STACK paravirt hook. The SYSENTER code would
benefit from similar treatment.
This makes one change to the native code path: the compat
instruction that clears the high 32 bits of %rax is moved slightly
later. I'd be surprised if this affects performance at all.
Tested-by: Juergen Gross <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/7c88ed36805d36841ab03ec3b48b4122c4418d71.1502164668.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This closes a hole in our SMAP implementation.
This patch comes from grsecurity. Good catch!
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/314cc9f294e8f14ed85485727556ad4f15bb1659.1502159503.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We are using the same vector for nested/non-nested posted
interrupts delivery, this may cause interrupts latency in
L1 since we can't kick the L2 vcpu out of vmx-nonroot mode.
This patch introduces a new vector which is only for nested
posted interrupts to solve the problems above.
Signed-off-by: Wincy Van <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Add unwind hint annotations to entry_64.S. This will enable the ORC
unwinder to unwind through any location in the entry code including
syscalls, interrupts, and exceptions.
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/b9f6d478aadf68ba57c739dcfac34ec0dc021c4c.1499786555.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The OOPS unwinder wants the word at the top of the IRQ stack to
point back to the previous stack at all times when the IRQ stack
is in use. There's currently a one-instruction window in ENTER_IRQ_STACK
during which this isn't the case. Fix it by writing the old RSP to the
top of the IRQ stack before jumping.
This currently writes the pointer to the stack twice, which is a bit
ugly. We could get rid of this by replacing irq_stack_ptr with
irq_stack_ptr_minus_eight (better name welcome). OTOH, there may be
all kinds of odd microarchitectural considerations in play that
affect performance by a few cycles here.
Reported-by: Mike Galbraith <[email protected]>
Reported-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/aae7e79e49914808440ad5310ace138ced2179ca.1499786555.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This will allow IRQ stacks to nest inside NMIs or similar entries
that can happen during IRQ stack setup or teardown.
The new macros won't work correctly if they're invoked with IRQs on.
Add a check under CONFIG_DEBUG_ENTRY to detect that.
Signed-off-by: Andy Lutomirski <[email protected]>
[ Use %r10 instead of %r11 in xen_do_hypervisor_callback to make objtool
and ORC unwinder's lives a little easier. ]
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/b0b2ff5fb97d2da2e1d7e1f380190c92545c8bb5.1499786555.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Pull ARM updates from Russell King:
- add support for ftrace-with-registers, which is needed for kgraft and
other ftrace tools
- support for mremap() for the sigpage/vDSO so that checkpoint/restore
can work
- add timestamps to each line of the register dump output
- remove the unused KTHREAD_SIZE from nommu
- align the ARM bitops APIs with the generic API (using unsigned long
pointers rather than void pointers)
- make the configuration of userspace Thumb support an expert option so
that we can default it on, and avoid some hard to debug userspace
crashes
* 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8684/1: NOMMU: Remove unused KTHREAD_SIZE definition
ARM: 8683/1: ARM32: Support mremap() for sigpage/vDSO
ARM: 8679/1: bitops: Align prototypes to generic API
ARM: 8678/1: ftrace: Adds support for CONFIG_DYNAMIC_FTRACE_WITH_REGS
ARM: make configuration of userspace Thumb support an expert option
ARM: 8673/1: Fix __show_regs output timestamps
|
|
Ensure the address limit is a user-mode segment before returning to
user-mode. Otherwise a process can corrupt kernel-mode memory and elevate
privileges [1].
The set_fs function sets the TIF_SETFS flag to force a slow path on
return. In the slow path, the address limit is checked to be USER_DS if
needed.
The addr_limit_user_check function is added as a cross-architecture
function to check the address limit.
[1] https://bugs.chromium.org/p/project-zero/issues/detail?id=990
Signed-off-by: Thomas Garnier <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: [email protected]
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: David Howells <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Miroslav Benes <[email protected]>
Cc: Chris Metcalf <[email protected]>
Cc: Pratyush Anand <[email protected]>
Cc: Russell King <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: Will Drewry <[email protected]>
Cc: [email protected]
Cc: Oleg Nesterov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
|
|
CRIU restores application mappings on the same place where they
were before Checkpoint. That means, that we need to move vDSO
and sigpage during restore on exactly the same place where
they were before C/R.
Make mremap() code update mm->context.{sigpage,vdso} pointers
during VMA move. Sigpage is used for landing after handling
a signal - if the pointer is not updated during moving, the
application might crash on any signal after mremap().
vDSO pointer on ARM32 is used only for setting auxv at this moment,
update it during mremap() in case of future usage.
Without those updates, current work of CRIU on ARM32 is not reliable.
Historically, we error Checkpointing if we find vDSO page on ARM32
and suggest user to disable CONFIG_VDSO.
But that's not correct - it goes from x86 where signal processing
is ended in vDSO blob. For arm32 it's sigpage, which is not disabled
with `CONFIG_VDSO=n'.
Looks like C/R was working by luck - because userspace on ARM32 at
this moment always sets SA_RESTORER.
Signed-off-by: Dmitry Safonov <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: Will Deacon <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Christopher Covington <[email protected]>
Signed-off-by: Russell King <[email protected]>
|
|
On x86-64 __VIRTUAL_MASK_SHIFT depends on paging mode now.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Petr Mladek reported the following warning when loading the livepatch
sample module:
WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0
...
Call Trace:
__schedule+0x273/0x820
schedule+0x36/0x80
kthreadd+0x305/0x310
? kthread_create_on_cpu+0x80/0x80
? icmp_echo.part.32+0x50/0x50
ret_from_fork+0x2c/0x40
That warning means the end of the stack is no longer recognized as such
for newly forked tasks. The problem was introduced with the following
commit:
ff3f7e2475bb ("x86/entry: Fix the end of the stack for newly forked tasks")
... which was completely misguided. It only partially fixed the
reported issue, and it introduced another bug in the process. None of
the other entry code saves the frame pointer before calling into C code,
so it doesn't make sense for ret_from_fork to do so either.
Contrary to what I originally thought, the original issue wasn't related
to newly forked tasks. It was actually related to ftrace. When entry
code calls into a function which then calls into an ftrace handler, the
stack frame looks different than normal.
The original issue will be fixed in the unwinder, in a subsequent patch.
Reported-by: Petr Mladek <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Fixes: ff3f7e2475bb ("x86/entry: Fix the end of the stack for newly forked tasks")
Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching
Pull livepatch updates from Jiri Kosina:
- a per-task consistency model is being added for architectures that
support reliable stack dumping (extending this, currently rather
trivial set, is currently in the works).
This extends the nature of the types of patches that can be applied
by live patching infrastructure. The code stems from the design
proposal made [1] back in November 2014. It's a hybrid of SUSE's
kGraft and RH's kpatch, combining advantages of both: it uses
kGraft's per-task consistency and syscall barrier switching combined
with kpatch's stack trace switching. There are also a number of
fallback options which make it quite flexible.
Most of the heavy lifting done by Josh Poimboeuf with help from
Miroslav Benes and Petr Mladek
[1] https://lkml.kernel.org/r/[email protected]
- module load time patch optimization from Zhou Chengming
- a few assorted small fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: add missing printk newlines
livepatch: Cancel transition a safe way for immediate patches
livepatch: Reduce the time of finding module symbols
livepatch: make klp_mutex proper part of API
livepatch: allow removal of a disabled patch
livepatch: add /proc/<pid>/patch_state
livepatch: change to a per-task consistency model
livepatch: store function sizes
livepatch: use kstrtobool() in enabled_store()
livepatch: move patching functions into patch.c
livepatch: remove unnecessary object loaded check
livepatch: separate enabled and patched states
livepatch/s390: add TIF_PATCH_PENDING thread flag
livepatch/s390: reorganize TIF thread flag bits
livepatch/powerpc: add TIF_PATCH_PENDING thread flag
livepatch/x86: add TIF_PATCH_PENDING thread flag
livepatch: create temporary klp_update_patch_state() stub
x86/entry: define _TIF_ALLWORK_MASK flags explicitly
stacktrace/x86: add function for detecting reliable stack traces
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fs/compat.c cleanups from Al Viro:
"More moving of compat syscalls from fs/compat.c to fs/*.c where the
native counterparts live.
And death to compat_sys_getdents64() - the only architecture that used
to need it was ia64, and _that_ has lost biarch support quite a few
years ago"
* 'work.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/compat.c: trim unused includes
move compat_rw_copy_check_uvector() over to fs/read_write.c
fhandle: move compat syscalls from compat.c
open: move compat syscalls from compat.c
stat: move compat syscalls from compat.c
fcntl: move compat syscalls from compat.c
readdir: move compat syscalls from compat.c
statfs: move compat syscalls from compat.c
utimes: move compat syscalls from compat.c
move compat select-related syscalls to fs/select.c
Remove compat_sys_getdents64()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
"The main x86 MM changes in this cycle were:
- continued native kernel PCID support preparation patches to the TLB
flushing code (Andy Lutomirski)
- various fixes related to 32-bit compat syscall returning address
over 4Gb in applications, launched from 64-bit binaries - motivated
by C/R frameworks such as Virtuozzo. (Dmitry Safonov)
- continued Intel 5-level paging enablement: in particular the
conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov)
- x86/mpx ABI corner case fixes/enhancements (Joerg Roedel)
- ... plus misc updates, fixes and cleanups"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash
x86/mm: Fix flush_tlb_page() on Xen
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
x86/mm/64: Fix crash in remove_pagetable()
Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"
x86/boot/e820: Remove a redundant self assignment
x86/mm: Fix dump pagetables for 4 levels of page tables
x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow
x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()"
x86/espfix: Add support for 5-level paging
x86/kasan: Extend KASAN to support 5-level paging
x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y
x86/paravirt: Add 5-level support to the paravirt code
x86/mm: Define virtual memory map for 5-level paging
x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert
x86/boot: Detect 5-level paging support
x86/mm/numa: Remove numa_nodemask_from_meminfo()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vdso updates from Ingo Molnar:
"Add support for vDSO acceleration of the "Hyper-V TSC page", to speed
up clock reading on Hyper-V guests"
* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Add VCLOCK_HVCLOCK vDSO clock read method
x86/hyperv: Move TSC reading method to asm/mshyperv.h
x86/hyperv: Implement hv_get_tsc_page()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 asm updates from Ingo Molnar:
"The main changes in this cycle were:
- unwinder fixes and enhancements
- improve ftrace interaction with the unwinder
- optimize the code footprint of WARN() and related debugging
constructs
- ... plus misc updates, cleanups and fixes"
* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/unwind: Dump all stacks in unwind_dump()
x86/unwind: Silence more entry-code related warnings
x86/ftrace: Fix ebp in ftrace_regs_caller that screws up unwinder
x86/unwind: Remove unused 'sp' parameter in unwind_dump()
x86/unwind: Prepend hex mask value with '0x' in unwind_dump()
x86/unwind: Properly zero-pad 32-bit values in unwind_dump()
x86/unwind: Ensure stack pointer is aligned
debug: Avoid setting BUGFLAG_WARNING twice
x86/unwind: Silence entry-related warnings
x86/unwind: Read stack return address in update_stack_state()
x86/unwind: Move common code into update_stack_state()
debug: Fix __bug_table[] in arch linker scripts
debug: Add _ONCE() logic to report_bug()
x86/debug: Define BUG() again for !CONFIG_BUG
x86/debug: Implement __WARN() using UD0
x86/ftrace: Use Makefile logic instead of #ifdef for compiling ftrace_*.o
x86/ftrace: Add -mfentry support to x86_32 with DYNAMIC_FTRACE set
x86/ftrace: Clean up ftrace_regs_caller
x86/ftrace: Add stack frame pointer to ftrace_caller
x86/ftrace: Move the ftrace specific code out of entry_32.S
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pul x86/process updates from Ingo Molnar:
"The main change in this cycle was to add the ARCH_[GET|SET]_CPUID
prctl() ABI extension to control the availability of the CPUID
instruction, analogously to the existing PR_GET|SET_TSC ABI that
controls RDTSC.
Motivation: the 'rr' user-space record-and-replay execution debugger
would like to trap and emulate the CPUID instruction - which
instruction is normally unprivileged.
Trapping CPUID is possible on IvyBridge and later Intel CPUs - expose
this hardware capability"
* 'x86-process-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/syscalls/32: Ignore arch_prctl for other architectures
um/arch_prctl: Fix fallout from x86 arch_prctl() rework
x86/arch_prctl: Add ARCH_[GET|SET]_CPUID
x86/cpufeature: Detect CPUID faulting support
x86/syscalls/32: Wire up arch_prctl on x86-32
x86/arch_prctl: Add do_arch_prctl_common()
x86/arch_prctl/64: Rename do_arch_prctl() to do_arch_prctl_64()
x86/arch_prctl/64: Use SYSCALL_DEFINE2 to define sys_arch_prctl()
x86/arch_prctl: Rename 'code' argument to 'option'
x86/msr: Rename MISC_FEATURE_ENABLES to MISC_FEATURES_ENABLES
x86/process: Optimize TIF_NOTSC switch
x86/process: Correct and optimize TIF_BLOCKSTEP switch
x86/process: Optimize TIF checks in __switch_to_xtra()
|
|
Unlike normal compat syscall variants, it is needed only for
biarch architectures that have different alignement requirements for
u64 in 32bit and 64bit ABI *and* have __put_user() that won't handle
a store of 64bit value at 32bit-aligned address. We used to have one
such (ia64), but its biarch support has been gone since 2010 (after
being broken in 2008, which went unnoticed since nobody had been using
it).
It had escaped removal at the same time only because back in 2004
a patch that switched several syscalls on amd64 from private wrappers to
generic compat ones had switched to use of compat_sys_getdents64(), which
hadn't needed (or used) a compat wrapper on amd64.
Let's bury it - it's at least 7 years overdue.
Signed-off-by: Al Viro <[email protected]>
|
|
vdso_enabled can be set to arbitrary integer values via the kernel command
line 'vdso32=' parameter or via 'sysctl abi.vsyscall32'.
load_vdso32() only maps VDSO if vdso_enabled == 1, but ARCH_DLINFO_IA32
merily checks for vdso_enabled != 0. As a consequence the AT_SYSINFO_EHDR
auxiliary vector for the VDSO_ENTRY is emitted with a NULL pointer which
causes a segfault when the application tries to use the VDSO.
Restrict the valid arguments on the command line and the sysctl to 0 and 1.
Fixes: b0b49f2673f0 ("x86, vdso: Remove compat vdso support")
Signed-off-by: Mathias Krause <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: Roland McGrath <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
We don't need the assert anymore, as:
17be0aec74fb ("x86/asm/entry/64: Implement better check for canonical addresses")
made canonical address checks generic wrt. address width.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The function tracing hook code for ftrace is not an entry point from
userspace and does not belong in the entry_*.S files. It has already been
moved out of entry_64.S.
Move it out of entry_32.S into its own ftrace_32.S file.
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Hook up arch_prctl to call do_arch_prctl() on x86-32, and in 32 bit compat
mode on x86-64. This allows to have arch_prctls that are not specific to 64
bits.
On UML, simply stub out this syscall.
Signed-off-by: Kyle Huey <[email protected]>
Cc: Grzegorz Andrejczuk <[email protected]>
Cc: [email protected]
Cc: Radim Krčmář <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Nadav Amit <[email protected]>
Cc: Robert O'Callahan <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: [email protected]
Cc: Jeff Dike <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: [email protected]
Cc: David Matlack <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: [email protected]
Cc: Paolo Bonzini <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Each processor holds a GDT in its per-cpu structure. The sgdt
instruction gives the base address of the current GDT. This address can
be used to bypass KASLR memory randomization. With another bug, an
attacker could target other per-cpu structures or deduce the base of
the main memory section (PAGE_OFFSET).
This patch relocates the GDT table for each processor inside the
fixmap section. The space is reserved based on number of supported
processors.
For consistency, the remapping is done by default on 32 and 64-bit.
Each processor switches to its remapped GDT at the end of
initialization. For hibernation, the main processor returns with the
original GDT and switches back to the remapping at completion.
This patch was tested on both architectures. Hibernation and KVM were
both tested specially for their usage of the GDT.
Thanks to Boris Ostrovsky <[email protected]> for testing and
recommending changes for Xen support.
Signed-off-by: Thomas Garnier <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Luis R . Rodriguez <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Rafael J . Wysocki <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Stanislaw Gruszka <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: zijun_hu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|