Age | Commit message (Collapse) | Author | Files | Lines |
|
pud_large() is always defined as pud_leaf(). Merge their usages. Chose
pud_leaf() because pud_leaf() is a global API, while pud_large() is not.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: "Naveen N. Rao" <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
pmd_large() is always defined as pmd_leaf(). Merge their usages. Chose
pmd_leaf() because pmd_leaf() is a global API, while pmd_large() is not.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: "Naveen N. Rao" <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Popuplate the shadow for the shared portion of the CPU entry area, i.e.
the read-only IDT mapping, during KASAN initialization. A recent change
modified KASAN to map the per-CPU areas on-demand, but forgot to keep a
shadow for the common area that is shared amongst all CPUs.
Map the common area in KASAN init instead of letting idt_map_in_cea() do
the dirty work so that it Just Works in the unlikely event more shared
data is shoved into the CPU entry area.
The bug manifests as a not-present #PF when software attempts to lookup
an IDT entry, e.g. when KVM is handling IRQs on Intel CPUs (KVM performs
direct CALL to the IRQ handler to avoid the overhead of INTn):
BUG: unable to handle page fault for address: fffffbc0000001d8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 16c03a067 P4D 16c03a067 PUD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 5 PID: 901 Comm: repro Tainted: G W 6.1.0-rc3+ #410
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kasan_check_range+0xdf/0x190
vmx_handle_exit_irqoff+0x152/0x290 [kvm_intel]
vcpu_run+0x1d89/0x2bd0 [kvm]
kvm_arch_vcpu_ioctl_run+0x3ce/0xa70 [kvm]
kvm_vcpu_ioctl+0x349/0x900 [kvm]
__x64_sys_ioctl+0xb8/0xf0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Fixes: 9fd429c28073 ("x86/kasan: Map shadow for percpu pages on demand")
Reported-by: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Add helpers to dedup code for aligning shadow address up/down to page
boundaries when translating an address to its shadow.
No functional change intended.
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Andrey Ryabinin <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Rename the CPU entry area variables in kasan_init() to shorten their
names, a future fix will reference the beginning of the per-CPU portion
of the CPU entry area, and shadow_cpu_entry_per_cpu_begin is a bit much.
No functional change intended.
Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Andrey Ryabinin <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
KASAN maps shadow for the entire CPU-entry-area:
[CPU_ENTRY_AREA_BASE, CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_MAP_SIZE]
This will explode once the per-cpu entry areas are randomized since it
will increase CPU_ENTRY_AREA_MAP_SIZE to 512 GB and KASAN fails to
allocate shadow for such big area.
Fix this by allocating KASAN shadow only for really used cpu entry area
addresses mapped by cea_map_percpu_pages()
Thanks to the 0day folks for finding and reporting this to be an issue.
[ dhansen: tweak changelog since this will get committed before peterz's
actual cpu-entry-area randomization ]
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: Yujie Liu <[email protected]>
Cc: kernel test robot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()
The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.
@@
identifier vaddr;
expression size;
@@
(
- memblock_phys_free(__pa(vaddr), size);
+ memblock_free(vaddr, size);
|
- memblock_free_ptr(vaddr, size);
+ memblock_free(vaddr, size);
)
[[email protected]: fixup]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Shahab Vahedi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The boot-time allocation interface for memblock is a mess, with
'memblock_alloc()' returning a virtual pointer, but then you are
supposed to free it with 'memblock_free()' that takes a _physical_
address.
Not only is that all kinds of strange and illogical, but it actually
causes bugs, when people then use it like a normal allocation function,
and it fails spectacularly on a NULL pointer:
https://lore.kernel.org/all/20210912140820.GD25450@xsang-OptiPlex-9020/
or just random memory corruption if the debug checks don't catch it:
https://lore.kernel.org/all/[email protected]/
I really don't want to apply patches that treat the symptoms, when the
fundamental cause is this horribly confusing interface.
I started out looking at just automating a sane replacement sequence,
but because of this mix or virtual and physical addresses, and because
people have used the "__pa()" macro that can take either a regular
kernel pointer, or just the raw "unsigned long" address, it's all quite
messy.
So this just introduces a new saner interface for freeing a virtual
address that was allocated using 'memblock_alloc()', and that was kept
as a regular kernel pointer. And then it converts a couple of users
that are obvious and easy to test, including the 'xbc_nodes' case in
lib/bootconfig.c that caused problems.
Reported-by: kernel test robot <[email protected]>
Fixes: 40caa127f3c7 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Steven Rostedt <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm: consolidate definitions of page table accessors", v2.
The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once. For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.
Most of these definitions are actually identical and typically it boils
down to, e.g.
static inline unsigned long pmd_index(unsigned long address)
{
return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}
static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}
These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.
These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.
This patch (of 12):
The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g. pte_alloc() and
pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.
The include statements in such cases are remove with a simple loop:
for f in $(git grep -l "include <linux/mm.h>") ; do
sed -i -e '/include <asm\/pgtable.h>/ d' $f
done
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Greg Ungerer <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Nick Hu <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vincent Chen <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Make #GP exceptions caused by out-of-bounds KASAN shadow accesses easier
to understand by computing the address of the original access and
printing that. More details are in the comments in the patch.
This turns an error like this:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault, probably for non-canonical address
0xe017577ddf75b7dd: 0000 [#1] PREEMPT SMP KASAN PTI
into this:
general protection fault, probably for non-canonical address
0xe017577ddf75b7dd: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: maybe wild-memory-access in range
[0x00badbeefbadbee8-0x00badbeefbadbeef]
The hook is placed in architecture-independent code, but is currently
only wired up to the X86 exception handler because I'm not sufficiently
familiar with the address space layout and exception handling mechanisms
on other architectures.
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: linux-mm <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
In the case where KASAN directly allocates memory to back vmalloc space,
don't map the early shadow page over it.
We prepopulate pgds/p4ds for the range that would otherwise be empty.
This is required to get it synced to hardware on boot, allowing the
lower levels of the page tables to be filled dynamically.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Daniel Axtens <[email protected]>
Acked-by: Dmitry Vyukov <[email protected]>
Reviewed-by: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Since commit d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on
5-level paging") kernel doesn't boot with KASAN on 5-level paging machines.
The bug is actually in early_p4d_offset() and introduced by commit
12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging")
early_p4d_offset() tries to convert pgd_val(*pgd) value to a physical
address. This doesn't make sense because pgd_val() already contains the
physical address.
It did work prior to commit d52888aa2753 because the result of
"__pa_nodebug(pgd_val(*pgd)) & PTE_PFN_MASK" was the same as "pgd_val(*pgd)
& PTE_PFN_MASK". __pa_nodebug() just set some high bits which were masked
out by applying PTE_PFN_MASK.
After the change of the PAGE_OFFSET offset in commit d52888aa2753
__pa_nodebug(pgd_val(*pgd)) started to return a value with more high bits
set and PTE_PFN_MASK wasn't enough to mask out all of them. So it returns a
wrong not even canonical address and crashes on the attempt to dereference
it.
Switch back to pgd_val() & PTE_PFN_MASK to cure the issue.
Fixes: 12a8cc7fcf54 ("x86/kasan: Use the same shadow offset for 4- and 5-level paging")
Reported-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
As all the memblock allocation functions return NULL in case of error
rather than panic(), the duplicates with _nopanic suffix can be removed.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Petr Mladek <[email protected]> [printk]
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dennis Zhou <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Guo Ren <[email protected]> [c-sky]
Cc: Heiko Carstens <[email protected]>
Cc: Juergen Gross <[email protected]> [Xen]
Cc: Mark Salter <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Russell King <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With tag based KASAN mode the early shadow value is 0xff and not 0x00, so
this patch renames kasan_zero_(page|pte|pmd|pud|p4d) to
kasan_early_shadow_(page|pte|pmd|pud|p4d) to avoid confusion.
Link: http://lkml.kernel.org/r/3fed313280ebf4f88645f5b89ccbc066d320e177.1544099024.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <[email protected]>
Suggested-by: Mark Rutland <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.
The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>
@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>
[[email protected]: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Richard Kuo <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Serge Semin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Drop BOOTMEM_ALLOC_ACCESSIBLE and BOOTMEM_ALLOC_ANYWHERE in favor of
identical MEMBLOCK definitions.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Richard Kuo <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Serge Semin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The conversion is done using
sed -i 's@memblock_virt_alloc@memblock_alloc@g' \
$(git grep -l memblock_virt_alloc)
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Greentime Hu <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Guan Xuetao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Ley Foon Tan <[email protected]>
Cc: Mark Salter <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Richard Kuo <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Russell King <[email protected]>
Cc: Serge Semin <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
pgtable_l5_enabled is defined using cpu_feature_enabled() but we refer
to it as a variable. This is misleading.
Make pgtable_l5_enabled() a function.
We cannot literally define it as a function due to circular dependencies
between header files. Function-alike macros is close enough.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Usually pgtable_l5_enabled is defined using cpu_feature_enabled().
cpu_feature_enabled() is not available in early boot code. We use
several different preprocessor tricks to get around it. It's messy.
Unify them all.
If cpu_feature_enabled() is not yet available, USE_EARLY_PGTABLE_L5 can
be defined before all includes. It makes pgtable_l5_enabled rely on
__pgtable_l5_enabled variable instead. This approach fits all early
users.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
A PTE is constructed from a physical address and a pgprotval_t.
__PAGE_KERNEL, for instance, is a pgprot_t and must be converted
into a pgprotval_t before it can be used to create a PTE. This is
done implicitly within functions like pfn_pte() by massage_pgprot().
However, this makes it very challenging to set bits (and keep them
set) if your bit is being filtered out by massage_pgprot().
This moves the bit filtering out of pfn_pte() and friends. For
users of PAGE_KERNEL*, filtering will be done automatically inside
those macros but for users of __PAGE_KERNEL*, they need to do their
own filtering now.
Note that we also just move pfn_pte/pmd/pud() over to check_pgprot()
instead of massage_pgprot(). This way, we still *look* for
unsupported bits and properly warn about them if we find them. This
might happen if an unfiltered __PAGE_KERNEL* value was passed in,
for instance.
- printk format warning fix from: Arnd Bergmann <[email protected]>
- boot crash fix from: Tom Lendacky <[email protected]>
- crash bisected by: Mike Galbraith <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reported-and-fixed-by: Arnd Bergmann <[email protected]>
Fixed-by: Tom Lendacky <[email protected]>
Bisected-by: Mike Galbraith <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
By this point we have functioning boot-time switching between 4- and
5-level paging mode. But naive approach comes with cost.
Numbers below are for kernel build, allmodconfig, 5 times.
CONFIG_X86_5LEVEL=n:
Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
17308719.892691 task-clock:u (msec) # 26.772 CPUs utilized ( +- 0.11% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
331,993,164 page-faults:u # 0.019 M/sec ( +- 0.01% )
43,614,978,867,455 cycles:u # 2.520 GHz ( +- 0.01% )
39,371,534,575,126 stalled-cycles-frontend:u # 90.27% frontend cycles idle ( +- 0.09% )
28,363,350,152,428 instructions:u # 0.65 insn per cycle
# 1.39 stalled cycles per insn ( +- 0.00% )
6,316,784,066,413 branches:u # 364.948 M/sec ( +- 0.00% )
250,808,144,781 branch-misses:u # 3.97% of all branches ( +- 0.01% )
646.531974142 seconds time elapsed ( +- 1.15% )
CONFIG_X86_5LEVEL=y:
Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
17411536.780625 task-clock:u (msec) # 26.426 CPUs utilized ( +- 0.10% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
331,868,663 page-faults:u # 0.019 M/sec ( +- 0.01% )
43,865,909,056,301 cycles:u # 2.519 GHz ( +- 0.01% )
39,740,130,365,581 stalled-cycles-frontend:u # 90.59% frontend cycles idle ( +- 0.05% )
28,363,358,997,959 instructions:u # 0.65 insn per cycle
# 1.40 stalled cycles per insn ( +- 0.00% )
6,316,784,937,460 branches:u # 362.793 M/sec ( +- 0.00% )
251,531,919,485 branch-misses:u # 3.98% of all branches ( +- 0.00% )
658.886307752 seconds time elapsed ( +- 0.92% )
The patch tries to fix the performance regression by using
cpu_feature_enabled(X86_FEATURE_LA57) instead of pgtable_l5_enabled in
all hot code paths. These will statically patch the target code for
additional performance.
CONFIG_X86_5LEVEL=y + the patch:
Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
17381990.268506 task-clock:u (msec) # 26.907 CPUs utilized ( +- 0.19% )
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
331,862,625 page-faults:u # 0.019 M/sec ( +- 0.01% )
43,697,726,320,051 cycles:u # 2.514 GHz ( +- 0.03% )
39,480,408,690,401 stalled-cycles-frontend:u # 90.35% frontend cycles idle ( +- 0.05% )
28,363,394,221,388 instructions:u # 0.65 insn per cycle
# 1.39 stalled cycles per insn ( +- 0.00% )
6,316,794,985,573 branches:u # 363.410 M/sec ( +- 0.00% )
251,013,232,547 branch-misses:u # 3.97% of all branches ( +- 0.01% )
645.991174661 seconds time elapsed ( +- 1.19% )
Unfortunately, this approach doesn't help with text size:
vmlinux.before .text size: 8190319
vmlinux.after .text size: 8200623
The .text section is increased by about 4k. Not sure if we can do anything
about this.
Signed-off-by: Kirill A. Shuemov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This patch converts the of CONFIG_X86_5LEVEL check to runtime checks for
p4d folding.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
For boot-time switching between 4- and 5-level paging we need to be able
to fold p4d page table level at runtime. It requires variable
PGDIR_SHIFT and PTRS_PER_P4D.
The change doesn't affect the kernel image size much:
text data bss dec hex filename
8628091 4734304 1368064 14730459 e0c4db vmlinux.before
8628393 4734340 1368064 14730797 e0c62d vmlinux.after
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Currently KASAN doesn't panic in case it don't have enough memory
to boot. Instead, it crashes in some random place:
kernel BUG at arch/x86/mm/physaddr.c:27!
RIP: 0010:__phys_addr+0x268/0x276
Call Trace:
kasan_populate_shadow+0x3f2/0x497
kasan_init+0x12e/0x2b2
setup_arch+0x2825/0x2a2c
start_kernel+0xc8/0x15f4
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x72/0x75
secondary_startup_64+0xa5/0xb0
Use memblock_virt_alloc_try_nid() for allocations without failure
fallback. It will panic with an out of memory message.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Dmitry Vyukov <[email protected]>
Cc: [email protected]
Cc: Alexander Potapenko <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big
and 0-day already hit a case where the fixmap PTEs were cleared by
cleanup_highmap().
Aside of that the fixmap API is a pain as it's all backwards.
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The cpu_entry_area will contain stacks. Make sure that KASAN has
appropriate shadow mappings for them.
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Laight <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Eduardo Valentin <[email protected]>
Cc: Greg KH <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
[ Note, this is a Git cherry-pick of the following commit:
d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")
... for easier x86 PTI code testing and back-porting. ]
The KASAN shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt. However,
since that no longer zeroes the mapped pages, it is not suitable for
KASAN, which requires zeroed shadow memory.
Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate(). Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Pavel Tatashin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Steven Sistare <[email protected]>
Cc: Daniel Jordan <[email protected]>
Cc: Bob Picco <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Conflicts:
arch/x86/kernel/cpu/Makefile
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <[email protected]>
Reviewed-by: Philippe Ombredanne <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
We are going to support boot-time switching between 4- and 5-level
paging. For KASAN it means we cannot have different KASAN_SHADOW_OFFSET
for different paging modes: the constant is passed to gcc to generate
code and cannot be changed at runtime.
This patch changes KASAN code to use 0xdffffc0000000000 as shadow offset
for both 4- and 5-level paging.
For 5-level paging it means that shadow memory region is not aligned to
PGD boundary anymore and we have to handle unaligned parts of the region
properly.
In addition, we have to exclude paravirt code from KASAN instrumentation
as we now use set_pgd() before KASAN is fully ready.
[[email protected]: clenaup, changelog message]
Signed-off-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The boot data and command line data are present in memory in a decrypted
state and are copied early in the boot process. The early page fault
support will map these areas as encrypted, so before attempting to copy
them, add decrypted mappings so the data is accessed properly when copied.
For the initrd, encrypt this data in place. Since the future mapping of
the initrd area will be mapped as encrypted the data will be accessed
properly.
Signed-off-by: Tom Lendacky <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brijesh Singh <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Larry Woodman <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Toshimitsu Kani <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/bb0d430b41efefd45ee515aaf0979dcfda8b6a44.1500319216.git.thomas.lendacky@amd.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Changes to the existing page table macros will allow the SME support to
be enabled in a simple fashion with minimal changes to files that use these
macros. Since the memory encryption mask will now be part of the regular
pagetable macros, we introduce two new macros (_PAGE_TABLE_NOENC and
_KERNPG_TABLE_NOENC) to allow for early pagetable creation/initialization
without the encryption mask before SME becomes active. Two new pgprot()
macros are defined to allow setting or clearing the page encryption mask.
The FIXMAP_PAGE_NOCACHE define is introduced for use with MMIO. SME does
not support encryption for MMIO areas so this define removes the encryption
mask from the page attribute.
Two new macros are introduced (__sme_pa() / __sme_pa_nodebug()) to allow
creating a physical address with the encryption mask. These are used when
working with the cr3 register so that the PGD can be encrypted. The current
__va() macro is updated so that the virtual address is generated based off
of the physical address without the encryption mask thus allowing the same
virtual address to be generated regardless of whether encryption is enabled
for that physical location or not.
Also, an early initialization function is added for SME. If SME is active,
this function:
- Updates the early_pmd_flags so that early page faults create mappings
with the encryption mask.
- Updates the __supported_pte_mask to include the encryption mask.
- Updates the protection_map entries to include the encryption mask so
that user-space allocations will automatically have the encryption mask
applied.
Signed-off-by: Tom Lendacky <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brijesh Singh <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Larry Woodman <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Toshimitsu Kani <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/b36e952c4c39767ae7f0a41cf5345adf27438480.1500319216.git.thomas.lendacky@amd.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We used to read several bytes of the shadow memory in advance.
Therefore additional shadow memory mapped to prevent crash if
speculative load would happen near the end of the mapped shadow memory.
Now we don't have such speculative loads, so we no longer need to map
additional shadow memory.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrey Ryabinin <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With CONFIG_X86_5LEVEL=y, level 4 is no longer top level of page tables.
Let's give these variable more generic names: init_top_pgt and
early_top_pgt.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
There's a conflict between ongoing level-5 paging support and
the E820 rewrite. Since the E820 rewrite is essentially ready,
merge it into x86/mm to reduce tree conflicts.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The E820 rework in WIP.x86/boot has gone through a couple of weeks
of exposure in -tip, merge it in a wider fashion.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This patch bring support for a non-folded additional page table level.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
With folded p4d, pgd_clear() is a nop. Change clear_pgds() to use
p4d_clear() instead.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This patch removes fixmap header usage on non-x86 code that was
introduced by the adaptable MODULE_END change.
Signed-off-by: Thomas Garnier <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This patch aligns MODULES_END to the beginning of the fixmap section.
It optimizes the space available for both sections. The address is
pre-computed based on the number of pages required by the fixmap
section.
It will allow GDT remapping in the fixmap section. The current
MODULES_END static address does not provide enough space for the kernel
to support a large number of processors.
Signed-off-by: Thomas Garnier <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Luis R . Rodriguez <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paul Gortmaker <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Rafael J . Wysocki <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Stanislaw Gruszka <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: zijun_hu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Small build fix. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before kasan has been
initialized leads to crash.
Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().
Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
Cc: [email protected]
Cc: Alexander Potapenko <[email protected]>
Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Dmitry Vyukov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
<linux/sched.h> to <linux/sched/task.h>
Update all usage sites first.
Acked-by: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We've got a number of defines related to the E820 table and its size:
E820MAP
E820NR
E820_X_MAX
E820MAX
The first two denote byte offsets into the zeropage (struct boot_params),
and can are not used in the kernel and can be removed.
The E820_*_MAX values have an inconsistent structure and it's unclear in any
case what they mean. 'X' presuably goes for extended - but it's not very
expressive altogether.
Change these over to:
E820_MAX_ENTRIES_ZEROPAGE
E820_MAX_ENTRIES
... which are self-explanatory names.
No change in functionality.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
A commonly used lowlevel x86 header, asm/pgtable.h, includes asm/e820/api.h
spuriously, without making direct use of it.
Removing it is not simple: over the years various .c code learned to rely
on this indirect inclusion.
Remove the unnecessary include - this should speed up the kernel build a bit,
as a large header is not included anymore in totally unrelated code.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Currently GPF messages with KASAN look as follows:
kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Add newlines.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dmitry Vyukov <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
After kasan_init() executed, no one is allowed to write to kasan_zero_page,
so write protect it.
Signed-off-by: Andrey Ryabinin <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Toshi Kani <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Currently we clear kasan_zero_page before __flush_tlb_all(). This
works with current implementation of native_flush_tlb[_global]()
because it doesn't cause do any writes to kasan shadow memory.
But any subtle change made in native_flush_tlb*() could break this.
Also current code seems doesn't work for paravirt guests (lguest).
Only after the TLB flush we can be sure that kasan_zero_page is not
used as early shadow anymore (instrumented code will not write to it).
So it should cleared it only after the TLB flush.
Signed-off-by: Andrey Ryabinin <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Toshi Kani <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We decided to use KASAN as the short name of the tool and
KernelAddressSanitizer as the full one. Update log messages according to
that.
Signed-off-by: Andrey Konovalov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Konstantin Serebryany <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Introduce generic kasan_populate_zero_shadow(shadow_start,
shadow_end). This function maps kasan_zero_page to the
[shadow_start, shadow_end] addresses.
This replaces x86_64 specific populate_zero_shadow() and will
be used for ARM64 in follow on patches.
The main changes from original version are:
* Use p?d_populate*() instead of set_p?d()
* Use memblock allocator directly instead of vmemmap_alloc_block()
* __pa() instead of __pa_nodebug(). __pa() causes troubles
iff we use it before kasan_early_init(). kasan_populate_zero_shadow()
will be used later, so we ok with __pa() here.
Signed-off-by: Andrey Ryabinin <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Alexey Klimov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: David Keitel <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yury <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|